Top Sources of Heap Garbage

In the last Xbox-related entry I posted, I mentioned how the CLR Profiler can be a very useful tool to know what are the biggest sources of heap garbage in your .NET project. I used it extensively in the past month to optimize my game to run on the Xbox, and here’s a rundown of my biggest programming “mistakes” (or problematic liberties?).

1. LINQ

Even the simplest LINQ queries like :

List<List<Potato>> potatoBags;
foreach (var potato in potatoBags.SelectMany(x => x))
{
  // ...
}

…will cause a noticeable amount of heap garbage. This includes Where clauses, OrderBy clauses, everything! It’s sad because I think that LINQ is a fantastic code-thinning tool, but it’s not an option on limited-memory systems like the Xbox.

That said, if you want to use LINQ at initialization/loading time, feel free to do so. The problems only arise in update/draw calls.

2. Automatic XNB Deserialization

At one point, I got sick or writing ContentTypeWriter and ContentTypeReader classes and started building an XNB automatic serializer and deserializer based on Alexander’s (John Doe?) work on the subject. On Windows, the load times remained the same and it greatly simplified or deleted many of my content pipeline classes.

But on the Xbox the load times were horrible. Even in Release and without the debugger attached, the load times were at least 5x as slow as on my PC. I then discovered that reflection calls generate a lot of heap garbage — and it’s not even clear if memory stays allocated or if it eventually gets compacted by the GC…

So I swallowed my pride and switched back to good old Reader/Writers. But hey, now the load times are super fast.

3. Using classes when you can use structs

Coming from a Java background, I’m very used to classes and I’ll use them for pretty much everything. Even data structures that get created at runtime, because it’s so handy to have references and everyone pointing to the same object…

Turns out using structs has a lot of advantages. Intuitively I thought that the by-copy parameter passing would just make everything slower, but if you keep your structures small enough it has little to no effect on performance. The fact that they reside on the stack and not on the heap makes them a much better option for the Xbox. So datatypes like collision result objects, object identifiers, anything that you need to create often when the game is running should be made into structs.

4. Object pools (are a good thing)

Sometimes you just need to dynamically allocate reference objects in your algorithms. Or even value-types can get allocated on the heap if they’re at class-local scope. But you can minimize the damage by using object pools!

They’re really easy to set up (I found this Ziggyware article to be a good starting point) and they’ll save you heap garbage by preallocating to the number of objects you’ll actually need, and extending the lifetime of objects that would otherwise be disposable trash.

5. Removing objects from a collection that you’re enumerating

I thought I had found a really good way to fix the old problem of “Collection was modified; enumeration operation may not execute” when you remove an object from a collection when you’re foreach’ing on it :

foreach (var potato in potatoBag.ToArray())
{
  if (potato.Expired)
    potatoBag.Remove(potato);
}

It’s pretty cute, no? Very little impact on the iteration code. But it also copies the whole collection to a brand new array everytime you’re enumerating it… :(

What I ended up doing instead to minimize garbage is :

// This is allocated once, at the class-level scope
readonly List<Potato> expiredPotatoes = new List<Potato>();

public void Update()
{
  // Standard update
  foreach (var potato in potatoBag)
  {
    if (potato.Expired)
      expiredPotatoes.Add(potato);
  }
  // Removing pass
  foreach (var expired in expiredPotatoes)
    potatoes.Remove(expired);
  expiredPotatoes.Clear();
}

It’s certainly heavier code-wise, but at least it’s clean. And it’s faster too.

6. Enums as Dictionary keys

If you use an enum as the TKey type parameter for a Dictionary<TKey, TValue> object, you’ll have a small amount of garbage generated everytime you access the dictionary. But there’s an easy way around it : you just need to build a Comparer class for the enum type (which is under 10 lines of code) and pass it to the constructor of your dictionary.

Cheers to Nick Gravelyn for pointing out a solution to that problem.

7. Collections should be pre-allocated

When possible, you should use the parameterized constructors of all your Lists, Dictionaries, HashSets and whatever other collection types that you use, such that their backing arrays are pre-allocated to the number of elements that you plan to add to them.

Starting them with the default parameterless constructor will force the collection to grow (using Array.Resize, which trashes the old array and creates a new, bigger one) until you filled it completely.

Conclusion

That’s it for now.
I know, 7 is a terrible number for a “Top N” list, but I can’t think of other major sources of garbage that I’ve encountered. The rest goes down to good programming practices. (don’t instantiate reference types all over the place, etc.)

Hope it helped!

10 thoughts on “Top Sources of Heap Garbage”

  1. About that Automatic XNB Deserialization vs Readers / Writers, XNA 3.1, coming soon, will have XNB Serialization/Deserialization built in, based on the IntermediateSerializer, and remove the need to write Readers and Writers (but you can still write them if you wish).
    But for now we don’t know if it will have the same garbage problems as the one made by John Doe. I think this also uses reflection so we’ll have to wait and see after 3.1 comes out.

  2. Nice, great tips! Another solution for modifying Lists while iterating over them is to just use for() rather than foreach.. of course you then need make sure to properly update the iterator :)

  3. 5. Removing objects from a collection that you’re enumerating

    Potato potato;
    for (i = potatoBag.Count – 1; i >= 0; i–)
    {
    if (potatoBag[i].Expired)
    potatoBag.RemoveAt(i);
    }

    Less code, faster, cleaner

  4. @brac : Yeah, but this only works if you’re working with an IList or an array. Also it changes the iteration order if you were doing other stuff in that loop.
    …and it gets rid of the delicious foreach. <3

  5. Well, your examples use lists. In that context my solution is absolutely valid.
    And since the main topic is heap trash it’s better than using a second list.

    If you dont want to revert the update order, you can still avoid heap trash
    by trading it for some cpu time:

    potatoBag.RemoveAll(p => p.Expired);

  6. I say “collection” and I never mentioned that potatoBag is a List… ;)

    The second list is never trashed, it’s allocated once and is never “trimmed” to 0-length, so it’s unlikely that it’ll generate trash.

    Your RemoveAll() solution uses an anonymous delegate which DOES create trash. It was one other big source of garbage in my project, I forgot to add it to the list actually… but it’s kinda related to LINQ (lots of anonymous delegates in LINQ queries).

  7. There’s also the difference between setting a public field vs property. About 3x faster to use fields.

Leave a Reply

Your email address will not be published. Required fields are marked *