All empty arrays are not the same instance in .NET

While working on a side project where performance is of the essence, I came across a method that looked like this:

It got me thinking. Will .NET create a new empty array of the same type on each call to new type[0]?

Let’s try it out in a simple console application. Something like this:
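A minimal sketch of such a check could look like the following. The class and method names are made up for illustration, and the Array.Empty<T>() comparison assumes .NET Framework 4.6 or later (or .NET Core), where that method is available.

```csharp
using System;

internal static class EmptyArrayIdentityDemo
{
    // Returns true only if both references point at the very same array object.
    public static bool AreSameInstance(int[] first, int[] second)
    {
        return object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        // Each "new int[0]" expression allocates its own object on the heap.
        var first = new int[0];
        var second = new int[0];
        Console.WriteLine(AreSameInstance(first, second)); // False

        // Array.Empty<T>() hands out one cached instance per element type.
        Console.WriteLine(AreSameInstance(Array.Empty<int>(), Array.Empty<int>())); // True
    }
}
```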

They are not the same! Maybe this is on purpose, or maybe nobody thought about it. Let’s check whether an empty array uses any memory.
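One way to measure this, sketched here with hypothetical names, is to compare GC.GetTotalMemory(true) before and after filling a pre-allocated holder with one million empty arrays. GC.GetTotalMemory only gives an approximation, so the figure should be read as a rough estimate.

```csharp
using System;

internal static class EmptyArrayMemoryDemo
{
    // Allocates "count" empty arrays and returns the approximate change
    // in managed heap size, in bytes.
    public static long MeasureEmptyArrays(int count)
    {
        // Allocate the holder first so that only the empty arrays are measured.
        var holder = new int[count][];

        long before = GC.GetTotalMemory(true);

        for (int i = 0; i < holder.Length; i++)
        {
            holder[i] = new int[0];
        }

        long after = GC.GetTotalMemory(true);

        // Keep the holder, and so every empty array, reachable until
        // the second measurement has been taken.
        GC.KeepAlive(holder);

        return after - before;
    }

    public static void Main()
    {
        long used = MeasureEmptyArrays(1000000);
        Console.WriteLine("Memory used by empty arrays: " + used);
    }
}
```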

The results are astonishing!

Memory used by empty arrays on x86: 12000352 bytes, which is about 11.4 MB. Each empty array is about 12 bytes.
Memory used by empty arrays on x64: 23999968 bytes, which is about 22.9 MB. Each empty array is about 24 bytes.

That’s quite some memory for empty arrays.

If you have a program that frequently creates arrays, and especially empty arrays, you can save memory and reduce the number of garbage collections by reusing the same empty array instance.

This is an example of what an allocator class could look like. I’ve added a couple of overloads so the compiler does not have to convert the size type for you, and so you can pass a size of any integer type up to an unsigned 64-bit integer.
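A sketch of such an allocator is shown here. The names ArrayAllocator<T>, New and EmptyArray come from this post; the exact set of overloads and the demo class are assumptions made for illustration.

```csharp
using System;

public static class ArrayAllocator<T>
{
    // One shared zero-length array per element type T.
    // A zero-length array has no elements, so sharing it is safe.
    public static readonly T[] EmptyArray = new T[0];

    // One overload per integer type that C# accepts as an array size,
    // so the caller's size value is not converted to another type first.
    public static T[] New(int size)
    {
        return size == 0 ? EmptyArray : new T[size];
    }

    public static T[] New(uint size)
    {
        return size == 0 ? EmptyArray : new T[size];
    }

    public static T[] New(long size)
    {
        return size == 0 ? EmptyArray : new T[size];
    }

    public static T[] New(ulong size)
    {
        return size == 0 ? EmptyArray : new T[size];
    }
}

internal static class ArrayAllocatorDemo
{
    public static void Main()
    {
        int[] a = ArrayAllocator<int>.New(0);
        int[] b = ArrayAllocator<int>.New(0L);

        Console.WriteLine(object.ReferenceEquals(a, b));      // True
        Console.WriteLine(ArrayAllocator<int>.New(3).Length); // 3
    }
}
```

Non-empty requests still allocate a fresh array on every call; only the zero-size case is shared.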

Changing the previous memory size check program to use our allocator would look like this:
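A sketch of that modified check follows. It repeats a cut-down ArrayAllocator<T> with only the int overload so that the example compiles on its own; the measuring class and method names are hypothetical.

```csharp
using System;

public static class ArrayAllocator<T>
{
    // Shared zero-length array for element type T.
    public static readonly T[] EmptyArray = new T[0];

    public static T[] New(int size)
    {
        return size == 0 ? EmptyArray : new T[size];
    }
}

internal static class SharedEmptyArrayMemoryDemo
{
    // Fills a holder with "count" references to the shared empty array and
    // returns the approximate change in managed heap size, in bytes.
    public static long MeasureSharedEmptyArrays(int count)
    {
        var holder = new int[count][];

        // Touch the allocator once so its static initialization
        // happens before the first measurement.
        int[] shared = ArrayAllocator<int>.New(0);

        long before = GC.GetTotalMemory(true);

        for (int i = 0; i < holder.Length; i++)
        {
            holder[i] = ArrayAllocator<int>.New(0);
        }

        long after = GC.GetTotalMemory(true);

        GC.KeepAlive(holder);
        GC.KeepAlive(shared);

        return after - before;
    }

    public static void Main()
    {
        long used = MeasureSharedEmptyArrays(1000000);
        Console.WriteLine("Memory used by empty arrays: " + used);
    }
}
```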

The result is much better:

Memory used by empty arrays for x86: -8
Memory used by empty arrays for x64: -8

Now, the memory usage routine is not 100% accurate, since we obviously do not gain memory by assigning the 1 million arrays, but it’s close enough to show the result.

Final words

Using the ArrayAllocator<T>.New method all the time, even when you never allocate empty arrays, adds unnecessary overhead. Use it only when your arrays can have size 0.

If you know, as in my case, that the array you are returning always has size 0, use ArrayAllocator<T>.EmptyArray directly instead.

You do not need to use the ArrayAllocator at all. It’s just a suggestion :).

Cheers!

 

2 thoughts on “All empty arrays are not the same instance in .NET”

  1. Jeff LeBert

    Very similar to this is Enumerable.Empty<T>(). It returns an IEnumerable<T> which is actually an empty array. Only one instance of the array is created for each type.
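The caching behaviour can be checked with a short sketch like the one below (class and method names are hypothetical). Whether the cached object is literally an array depends on the runtime version, so the sketch only checks that the same instance comes back each time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

internal static class EnumerableEmptyDemo
{
    // Returns true if two calls to Enumerable.Empty<T>() yield the same object.
    public static bool ReturnsCachedInstance<T>()
    {
        IEnumerable<T> first = Enumerable.Empty<T>();
        IEnumerable<T> second = Enumerable.Empty<T>();
        return object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        Console.WriteLine(ReturnsCachedInstance<int>());      // True
        Console.WriteLine(Enumerable.Empty<string>().Any());  // False
    }
}
```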

    I favor IEnumerable<T> over arrays for a number of reasons. Specifically, the values in arrays can be modified, and in most of my cases I never want the data to be modified. This is actually a problem in the .NET Framework, where Assembly.GetTypes() returns an array. That means the caller can modify the array, so it cannot be reused and has to be rebuilt each time Assembly.GetTypes() is called.

    1. Erik Bergman Post author

      Thanks for your comment, Jeff!

      I didn’t know about Enumerable.Empty<T>(). Thanks.

      I agree with you that IEnumerable<T> has its uses, and Assembly.GetTypes() is an excellent example of where the return type should not have been an array.

      IEnumerable<T> itself is not heavy, but when you call IEnumerable<T>.GetEnumerator() a new enumerator is allocated on the heap. The standard enumerator created when you call GetEnumerator() takes 32 bytes of memory. If you can save the memory of creating a new array by reusing an old one, it’s a gain.

      Some of the .NET Framework class methods that take IEnumerable<T> as an input parameter will check whether it is castable to ICollection<T> or IReadOnlyCollection<T> and use those interfaces instead, since both expose the number of items, and the list interfaces that extend them (IList<T> and IReadOnlyList<T>) give access by index without enumerating. Quite often the caller of methods returning IEnumerable<T> will want to check the number of items and/or get items by index.

      Looking through some random code, this is a common scenario, where a new list or array is created just to get the number of items:

      var items = GetIEnumerable().ToList(); // or .ToArray() and items.Length
      var itemCount = items.Count;

      or

      IEnumerable<T> items = GetIEnumerable();
      var itemCount = items.Count(); // unless the source implements ICollection<T>, GetEnumerator() is called and an enumerator is created and iterated through.

      If we use IReadOnlyCollection<T> instead, it would look like this:

      IReadOnlyCollection<T> items = GetReadOnlyCollection();
      var itemCount = items.Count;

      IEnumerable<T> is the most common return type for read-only collections in APIs, but I would encourage you to return IReadOnlyCollection<T> or ICollection<T> instead of IEnumerable<T> whenever it’s possible without creating new objects or wasting CPU. Callers of your API who don’t care can iterate through the IEnumerable<T> that IReadOnlyCollection<T> / ICollection<T> inherit, and the ones who do care can avoid casting to see whether it’s implemented. You can also return a collection whose IsReadOnly property is true. If it exposes an indexer (IList<T> or IReadOnlyList<T>), the caller can iterate through it with a for loop without allocating an enumerator, but cannot change the contents of the collection.
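A small sketch of this idea, with hypothetical names: the method returns a cached read-only wrapper typed as IReadOnlyList<int> (which extends IReadOnlyCollection<int> with an indexer), so callers get Count and index access without calling ToArray() or Count().

```csharp
using System;
using System.Collections.Generic;

internal static class ReadOnlyReturnDemo
{
    // The wrapper is created once and reused, so GetItems() allocates nothing per call.
    // Array.AsReadOnly returns a ReadOnlyCollection<int>, which implements IReadOnlyList<int>.
    private static readonly IReadOnlyList<int> Items =
        Array.AsReadOnly(new[] { 1, 2, 3, 4, 5, 6 });

    public static IReadOnlyList<int> GetItems()
    {
        return Items;
    }

    public static int Sum(IReadOnlyList<int> items)
    {
        int sum = 0;

        // Indexed for loop: no enumerator is requested from the collection.
        for (int i = 0; i < items.Count; i++)
        {
            sum += items[i];
        }

        return sum;
    }

    public static void Main()
    {
        IReadOnlyList<int> items = GetItems();

        Console.WriteLine(items.Count);                            // 6
        Console.WriteLine(Sum(items));                             // 21
        Console.WriteLine(((ICollection<int>)items).IsReadOnly);   // True
    }
}
```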

      If most of the application’s methods return IEnumerable<T>, a lot of enumerators will be allocated. Iterating through an IEnumerator<T> is slower than accessing the contents directly or by index. Unnecessary memory allocation causes more garbage collections, and wrapped arrays make your application slower by executing more code. The impact is not extreme, but I would say that you should carefully consider method return types, and when you can, in a simple way, enable yourself and your users to write faster applications without making them more complex, you should.

      The memory measurement code I used is listed below. It returns:
      IEnumerable memory consumption: 80 (Array size + 32 for the enumerator)
      Array consumption: 48


      using System;
      using System.Collections.Generic;

      internal static class Program
      {
          // Returns the array typed as IEnumerable<int>, so foreach calls GetEnumerator().
          private static IEnumerable<int> GetEnumerable()
          {
              return new[] { 1, 2, 3, 4, 5, 6 };
          }

          // Returns the array as an array, so foreach compiles to an indexed loop.
          private static int[] GetArray()
          {
              return new[] { 1, 2, 3, 4, 5, 6 };
          }

          public static void Main()
          {
              int sum = 0;

              {
                  var totalMemory = GC.GetTotalMemory(true);

                  foreach (var i in GetEnumerable())
                  {
                      sum += i;
                  }

                  var totalMemoryForEnumerator = GC.GetTotalMemory(true) - totalMemory;

                  Console.WriteLine("IEnumerable memory consumption: " + totalMemoryForEnumerator);
              }

              {
                  var totalMemory = GC.GetTotalMemory(true);

                  foreach (var i in GetArray())
                  {
                      sum += i;
                  }

                  var totalMemoryForArray = GC.GetTotalMemory(true) - totalMemory;

                  Console.WriteLine("Array consumption: " + totalMemoryForArray);
              }

              // Wait for a key press so the console window stays open.
              Console.Read();
          }
      }

      Thanks // Erik

