Top Sources of Heap Garbage

In the last Xbox-related entry I posted, I mentioned how the CLR Profiler can be a very useful tool to know what are the biggest sources of heap garbage in your .NET project. I used it extensively in the past month to optimize my game to run on the Xbox, and here’s a rundown of my biggest programming “mistakes” (or problematic liberties?).

1. LINQ

Even the simplest LINQ queries like :

List<List<Potato>> potatoBags;
foreach (var potato in potatoBags.SelectMany(x => x))
{
  // ...
}

…will cause a noticeable amount of heap garbage. This includes Where clauses, OrderBy clauses, everything! It’s sad because I think that LINQ is a fantastic code-thinning tool, but it’s not an option on limited-memory systems like the Xbox.

That said, if you want to use LINQ at initialization/loading time, feel free to do so. The problems only arise in update/draw calls.

2. Automatic XNB Deserialization

At one point, I got sick or writing ContentTypeWriter and ContentTypeReader classes and started building an XNB automatic serializer and deserializer based on Alexander’s (John Doe?) work on the subject. On Windows, the load times remained the same and it greatly simplified or deleted many of my content pipeline classes.

But on the Xbox the load times were horrible. Even in Release and without the debugger attached, the load times were at least 5x as slow as on my PC. I then discovered that reflection calls generate a lot of heap garbage — and it’s not even clear if memory stays allocated or if it eventually gets compacted by the GC…

So I swallowed my pride and switched back to good old Reader/Writers. But hey, now the load times are super fast.

3. Using classes when you can use structs

Coming from a Java background, I’m very used to classes and I’ll use them for pretty much everything. Even data structures that get created at runtime, because it’s so handy to have references and everyone pointing to the same object…

Turns out using structs has a lot of advantages. Intuitively I thought that the by-copy parameter passing would just make everything slower, but if you keep your structures small enough it has little to no effect on performance. The fact that they reside on the stack and not on the heap makes them a much better option for the Xbox. So datatypes like collision result objects, object identifiers, anything that you need to create often when the game is running should be made into structs.

4. Object pools (are a good thing)

Sometimes you just need to dynamically allocate reference objects in your algorithms. Or even value-types can get allocated on the heap if they’re at class-local scope. But you can minimize the damage by using object pools!

They’re really easy to set up (I found this Ziggyware article to be a good starting point) and they’ll save you heap garbage by preallocating to the number of objects you’ll actually need, and extending the lifetime of objects that would otherwise be disposable trash.

5. Removing objects from a collection that you’re enumerating

I thought I had found a really good way to fix the old problem of “Collection was modified; enumeration operation may not execute” when you remove an object from a collection when you’re foreach’ing on it :

foreach (var potato in potatoBag.ToArray())
{
  if (potato.Expired)
    potatoBag.Remove(potato);
}

It’s pretty cute, no? Very little impact on the iteration code. But it also copies the whole collection to a brand new array everytime you’re enumerating it… :(

What I ended up doing instead to minimize garbage is :

// This is allocated once, at the class-level scope
readonly List<Potato> expiredPotatoes = new List<Potato>();

public void Update()
{
  // Standard update
  foreach (var potato in potatoBag)
  {
    if (potato.Expired)
      expiredPotatoes.Add(potato);
  }
  // Removing pass
  foreach (var expired in expiredPotatoes)
    potatoes.Remove(expired);
  expiredPotatoes.Clear();
}

It’s certainly heavier code-wise, but at least it’s clean. And it’s faster too.

6. Enums as Dictionary keys

If you use an enum as the TKey type parameter for a Dictionary<TKey, TValue> object, you’ll have a small amount of garbage generated everytime you access the dictionary. But there’s an easy way around it : you just need to build a Comparer class for the enum type (which is under 10 lines of code) and pass it to the constructor of your dictionary.

Cheers to Nick Gravelyn for pointing out a solution to that problem.

7. Collections should be pre-allocated

When possible, you should use the parameterized constructors of all your Lists, Dictionaries, HashSets and whatever other collection types that you use, such that their backing arrays are pre-allocated to the number of elements that you plan to add to them.

Starting them with the default parameterless constructor will force the collection to grow (using Array.Resize, which trashes the old array and creates a new, bigger one) until you filled it completely.

Conclusion

That’s it for now.
I know, 7 is a terrible number for a “Top N” list, but I can’t think of other major sources of garbage that I’ve encountered. The rest goes down to good programming practices. (don’t instantiate reference types all over the place, etc.)

Hope it helped!