Archive for the 'XNA' Category

Mar 16 2010

Smart Screen-Space Blurring for Shadows

I’ve been working on a screen-space solution for blurring shadows that has the following features :

  • Variable-sized kernel based on surface distance and view angle
  • Rejection of samples if they lie on a non-contiguous surface, by identifying depth and normal discontinuities
  • Bilateral, two-pass blur filter to maximize the possible kernel size
  • Vertex and pixel shader 3.0, so no worry about 64 instructions in the PS

…but also works with the following constraints, to simplify matters :

  • Flat surfaces only, no curves! (to simplify the sample rejection process)
  • Full control over the rendering pipeline, so the “main render” pass can output weird values in order to be properly blurred

I quickly realized that it’s very hard or impossible to do a variable-sized Gaussian filter in real-time. If you change the kernel size, you change the standard deviation, and you need to recalculate the weights… this is too heavy for a pixel shader. So I chose to use a box filter with uniformly-spaced samples.
I totally would’ve liked to use Poisson-Disk distribution, but it’s not doable in a two-pass bilateral scenario. And you can’t achieve big kernels in real-time without separating the process in two passes.

My XNA3 implementation currently uses no additional render targets (!) but just a “resolve texture”, and resolves from the main buffer quite a lot. A R32F (Single) render target is used for the shadowmapping process itself, but otherwise everything’s done with a A8R8G8B8 (Color) main buffer.
The shadowmapping solution is standard orthographic/directional depth testing, but I’m using Exponential Shadow Maps (ESM) to simplify the depth biasing problems. I could never get my hands on a really good way to do Slope-Scale Biasing for standard depth testing, so ESM saves the day here. Otherwise, there’s no fancy cascading, splitting or projection tricks.

Here’s how it looks at different distances. The blur kernel stays approximately the same size in world-space even if the whole process is screen-space!

I’ll keep working on a clean sample to show, and I’ll definitely release the HLSL code if I can’t release the source to the whole thing. Stay tuned!

2 responses so far

Feb 17 2010

Common Kanji Character Ranges for XNA SpriteFont Rendering

Published by Renaud Bédard under C#,XNA

Note : This sample is practically useless, because the XNA Localization sample has a much better alternative using the Content Pipeline and character detection from resource files, which works for any language (Chinese, Korean, Japanese…). But I guess if you wanted to get the ranges of common Kanji, here’s how.

While working on Japanese language support in XNA, I realized a couple of things about Japanese writing (some of which may seem obvious, but wasn’t for me) :

  • There’s two broad character sets : Kana (syllabic) and Kanji (logographic)
  • Kana has two modern subsystems or components, Hiragana and Katakana, each with two distinct Unicode regions of respectively 92 and 95 different glyphs (187 total)
  • Kanji originate from Chinese “Han” characters, and are stored within the CJK (Chinese, Japanese, Korean) portion of Unicode. But CJK characters don’t uniquely reference Japanese logographs, and it contains over 20000 glyphs!
  • There’s over 10000 actual Japanese kanji, but only about 2000 of which every high-school grade Japanese person should know

In XNA, the SpriteFont class and its associated content pipeline use bitmap fonts internally to cache and render text strings. It becomes obvious that generating a bitmap font with 20000 Han characters would take a very long time, and is also very irresponsible memory usage. Even 10000 characters seems ridiculous.
I wanted to keep using SpriteFonts, so switching to a realtime font rendering option like FreeType was out of question. So how does one make bitmap fonts usable in Japanese?

While researching the subject, I stumbled upon a whitepaper called “Unicode and Japanese Kanji” by Tony Pottier, in which he discusses how to isolate Japanese Kanji from the CJK characters, and even dresses a list of all 1946 unique characters that are learned in Japanese education up to grade 7 (sorted by Unicode point, or by learning grade). Even if it’s a large amount of glyphs, it’s a lot more reasonable than 10k.

So the only remaining step is to make this table into a a list of XML CharacterRegion elements so that we can use them in an XNA SpriteFont declaration.
I made a little C#3 program that takes a list of Kanji, one per line, and blurts out the expected XML; it also joins the succeeding characters into regions to save space.

using System;
using System.IO;
 
namespace KanjiFinder
{
    static class Program
    {
        static void Main()
        {
            var output = new StreamWriter("regions.xml");
            var input = File.ReadAllLines("kanjis.txt");
            var writeCount = 0;
            var intervalsCount = 0;
 
            var start = (int)char.Parse(input[0]);
            var end = start;
            foreach (var line in input)
            {
                var cur = (int)char.Parse(line);
                if (cur - start > 1)
                {
                    output.WriteLine(string.Format("<CharacterRegion><Start>&#x{0:X4};</Start><End>&#x{1:X4};</End></CharacterRegion>", start, end));
                    writeCount += end - start + 1;
                    start = cur;
                    intervalsCount++;
                }
                end = cur;
            }
            output.WriteLine(string.Format("<CharacterRegion><Start>&#x{0:X4};</Start><End>&#x{1:X4};</End></CharacterRegion>", start, end));
            writeCount += end - start + 1;
 
            output.Close();
 
            if (writeCount != input.Length)
                throw new InvalidOperationException();
            Console.WriteLine(intervalsCount);
        }
    }
}

Here’s its input kanjis.txt (in Unicode format), and its result is regions.xml.

I chose to go up to Grade 7, but one may choose to ignore Grade 7 characters and just do 1-6. I don’t know whether Grade 7 characters are useful in game menus and usual dialogue.

All that’s left is to put those regions in a <CharacterRegions> tag inside a .spritefont file, and supply a valid Japanese font! Thankfully, Windows 7 comes with a bundle of these (MS Gothic, Kozuka, Mieryo and Mincho) and the M+ Fonts offer a public domain alternative.

Apart from the Kanji regions list, a couple more regions you’ll probably need :

<!-- Ideographic Symbols and Punctuation -->
<CharacterRegion><Start>&#x3000;</Start><End>&#x303F;</End></CharacterRegion>
 
<!-- Hiragana -->
<CharacterRegion><Start>&#x3040;</Start><End>&#x309F;</End></CharacterRegion>   
 
<!-- Katakana -->
<CharacterRegion><Start>&#x30A0;</Start><End>&#x30FF;</End></CharacterRegion>   
 
<!-- Fullwidth Latin -->
<CharacterRegion><Start>&#xFF01;</Start><End>&#xFFE6;</End></CharacterRegion>   
<!-- and/or the standard Latin set... -->
<CharacterRegion><Start>&#32;</Start><End>&#126;</End></CharacterRegion>

That’s it! Hope it helped.

One response so far

Oct 23 2009

Squaring The Thumbsticks

Published by Renaud Bédard under C#,Tools,XNA

Update (Oct. 26th) : Better interpolation = better shaping, and the simpler “inscribed square” method.
Update (Nov. 4th) : Exact and customizable solution!

The input values of thumbsticks on an Xbox 360 controller are always contained inside the unit circle, a disk centered on the origin and whose radius is 1. This is usually fine, because you want to interpret the values as a vector inside that circle… but if you want to provide an analog version of a D-Pad, for instance to control a character in a 2D platformer, then you’ll find that your character will never reach maximum horizontal speed unless you hit the (1, 0) or (-1, 0) vectors, exactly right or left. Anything tilted up or down will give a fractional speed, and if you hit a diagonal corner, you’ll end up with ±√½.

So what if the analog stick was a square area instead, such that the entire circumference has a least one component maxed out, like the contour of a square?
It may seem like a simple thing to do, but it proved to be a lot of trouble, and in the end I chose to use an approximation (nope, it worked out in the end!)… but here’s how I researched the problem.

Attempt #1 : Wrong way around

I first googled “Mapping circle to square“, and ended up on a blog entry that describes the inverse transform. Fine, I’ll just invert it, isolate the x and y variables as a function of x’ and y’… But that proved to be a problem.

Wolfram|Alpha doesn’t know what to do with it, and after making it more manageable (for x, for y), it gives 4 different answers that seem to work only for specific intervals, but doesn’t say which…
I tried doing it by hand and was stuck early on, because I suck at this.

And since I don’t have access to (nor know how to use) proper nonlinear equation solvers, I decided to give up.

Attempt #2 : Projections

By googling some more and varying search terms, I stumbled upon a cool flicker image titled “Conformal Transformation: from Circle to Square“. That’s exactly what I need!
But alas, it involves something called elliptic integrals and this also goes beyond my math understanding, or the amount of effort I was willing to put in this. The description also links to the Peirce quincuncial projection, which maps a sphere to a square, but the math also goes above my head (complex numbers arithmetic and again, elliptic integrals).

At this point I realized that what I was trying to do was not trivial.

Attempt #3 : Intuition serves best

So I just went back what I usually do : analyze the problem, determine what I expect as a basic solution, and work my way there.

  • Problem : The smallest horizontal component I end up with is ±√½, or ±~0.707, when the θ angle is π/4, 3π/4, 5π/4 or 7π/4.
  • Solution : Scale this to ±1, so by a factor of √2.

The scale for angles close to diagonals should fall down to identity (1) when it reaches a cardinal direction, though I’m not sure about the interpolation curve… Linear should do fine?

(code removed because it’s kinda shitty)

And for additional fun and testing, here’s how evenly-spaced random points inside the unit disk stretch to the unit square using this function (points generated by my Uniform Poisson-Disk code) :

poisson
Linear interpolation

Not bad? Good enough for me.

Attempt #4 : Interpolation and variations

I then tried different interpolation/easing methods for the scaling factor, and tried raigan’s idea of just scaling and clamping linearly the circle to a square, here’s what it looks like!

poisson
Quadratic interpolation

poisson
Inscribed square

I tried sine, quadratic and circular ease in/out/in-out from my easing functions library, and quadratic ease-in was the most balanced. It almost shapes everything to a square, very little clamping is necessary, so I’ll keep that in.

The inscribed square method is nice and simple, but cuts off a lot of values and keeps circular gradients in its area. But in the real world, for 2D platformer input, it’s probably just fine…

Attempt #5 : Epic win

So, I decided to give another shot at this after speaking on the bus with a friend about polar coordinates and trigonometry. Turns out there is a more exact solution, and by adding a “inner roundness” parameter you can also tweak how much of the center input will remain circular, while still touching the sides for circumference values.

Here is my final C# function, then I’ll explain how it works :

static Vector2 CircleToSquare(Vector2 point)
{
	return CircleToSquare(point, 0);
}
static Vector2 CircleToSquare(Vector2 point, double innerRoundness)
{
	const float PiOverFour = MathHelper.Pi / 4;
 
	// Determine the theta angle
	var angle = Math.Atan2(point.Y, point.X) + MathHelper.Pi;
 
	Vector2 squared;
	// Scale according to which wall we're clamping to
	// X+ wall
	if (angle <= PiOverFour || angle > 7 * PiOverFour)
		squared = point * (float)(1 / Math.Cos(angle));
	// Y+ wall
	else if (angle > PiOverFour && angle <= 3 * PiOverFour)
		squared = point * (float)(1 / Math.Sin(angle));
	// X- wall
	else if (angle > 3 * PiOverFour && angle <= 5 * PiOverFour)
		squared = point * (float)(-1 / Math.Cos(angle));
	// Y- wall
	else if (angle > 5 * PiOverFour && angle <= 7 * PiOverFour)
		squared = point * (float)(-1 / Math.Sin(angle));
	else throw new InvalidOperationException("Invalid angle...?");
 
	// Early-out for a perfect square output
	if (innerRoundness == 0)
		return squared;
 
	// Find the inner-roundness scaling factor and LERP
	var length = point.Length();
	var factor = (float) Math.Pow(length, innerRoundness);
	return Vector2.Lerp(point, squared, factor);
}
square
Inner roundness = 0

round
Inner roundness = 1

rounder
Inner roundness = 5

There are two concepts here :

  1. The scaling factor can be calculated for every theta angle without any need for interpolation. It happens to be the inverse of the sine or cosine of the angle, depending on which wall you’re trying to clamp to. This gives a perfect square all around.
  2. By interpolating relatively to the length of the input vector, it’s possible to preserve the circular shape in the center and still clamp to the sides for extreme inputs.

I think I’m done with this now! Yay.

12 responses so far

Oct 20 2009

A Render/SamplerState Stack and Manager for XNA

Published by Renaud Bédard under C#,Tools,XNA

I’ve been meaning to make something like this for a long time, and I know that some engines like Xen offer a viable solution already, but I wanted to make my KISS solution that interferes with the standard XNA interface as little as possible.

It’s a pretty lengthy snippet of code, so I won’t paste it all here, but basically there’s three classes : the StateStack, ManagedRenderStates and ManagedSamplerStates.

Description

StateStack is a static extension class to the GraphicsDevice. It could be just another static class, or even a global XNA service, but since it’s going to be used in combination with the GraphicsDevice most of the time, I thought it was logical to use extensions.
You can push, pop and peek the current state, which takes and returns ManagedRenderState objects. It’s very similar to how OpenGL works with the attribute stack (glPushAttrib), except you don’t choose what you push, the entire state gets pushed/popped everytime. I didn’t think isolating states in categories was useful or natural in XNA.
The ResetState and CommitState methods are shorthands to the same methods in ManagedRenderState.

ManagedRenderState is like a wrapper over a RenderState object, such that it has the same interface : a read/write property for each render state and access to the SamplerState collection.
The difference is that it tracks changes to render states as you make them, marks properties as dirty and applies only what’s needed when you decide to Commit. The Reset method refreshes the object with the current state of the GraphicsDevice; this needs to be done at the beginning of every draw call, or everytime you think the state has been tampered with.

ManagedSamplerState does the same thing but for sampler states : addressing, filtering, etc. You don’t call Reset or Commit on it directly, it gets done automatically by its host ManagedRenderState.

I added blending mode and texture filtering helper methods to both State classes, because I end up using that a lot… and with proper state management, you don’t have to worry about redundant assignments! So you can be permissive and set everything you need, every time.

Usage

So what does it look like to use it?
The Game class needs to do some initialization and per-draw-call stuff first…

protected override void Initialize()
{
    GraphicsDevice.PushState();
 
    // Other stuff
    base.Initialize();
}
 
protected override void Draw(GameTime gameTime)
{
    GraphicsDevice.ResetState();
    SetDefaultRenderStates();
 
    GraphicsDevice.Clear(Color.CornflowerBlue);
    base.Draw(gameTime);
}
 
void SetDefaultRenderStates()
{
    var rs = GraphicsDevice.PeekState();
 
    rs.DepthBufferEnable = true;
    rs.SetBlendingMode(BlendingMode.Alphablending);
 
    rs.SamplerStates[0].SetFiltering(FilteringMode.Anisotropic);
    rs.SamplerStates[0].AddressU = rs.SamplerStates[0].AddressV = TextureAddressMode.Wrap;
 
    GraphicsDevice.CommitState();
}

Then to use the state stack, you don’t ever touch the GraphicsDevice.RenderState object : always use the Managed equivalent.

// Push a copy of the state on the stack
var rs = GraphicsDevice.PushState();
 
// Always-on-top, and don't disturb the depth buffer, and no culling
rs.DepthBufferFunction = CompareFunction.Always;
rs.DepthBufferWriteEnable = false;
rs.CullMode = CullMode.None;
 
// Commit changes
GraphicsDevice.CommitState();
 
// Draw some stuff
Draw();
 
// Pop back to the last state
GraphicsDevice.PopState();
// We are now back to the last state

Download & Conclusion

I think it makes a lot of sense to stack states in a component system, because each component can push its own local state and work with it, and just dispose it afterwards. Add this to dirty-checking to remove almost all redundant state changes, and you’ve got a very usable system!

The next step is batching draw calls to keep calls that use the same RenderStates together… in my system, this is left to the user’s discretion. Kevin Gadd presented a way of doing this (and many other things including threaded rendering) on his blog.

Another rather important note : ManagedRenderState contains a MaxSamplers constant that you can tweak depending on the number of sampler states you know you’ll use. Leaving it at 16 will make update/reset/refresh operations kind of slow… not sure yet if it’s noticeable.

And because of the immense amount of copy-paste that’s been needed to produce these classes, I can’t promise that it doesn’t have a typo or two… I haven’t tested every single state. But up to now, it works great, and I’ll update it if needed.

StateStack.cs (2 kb – C# class)
ManagedRenderState.cs (30 kb – C# class)
ManagedSamplerState.cs (7 kb – C# class)

Bonuses in the code files : a method to unpack a packed color from uint to Color, and a simplistic object Pool implementation (to eliminate garbage when stacking up states).

Hope you find it useful!

5 responses so far

Sep 24 2009

WaitUntil Component

Published by Renaud Bédard under C#,XNA

Here’s a little something that I hope to use increasingly in the future : elements of functional programming to facilitate modification of state over time or game loops, without using threads all over the place. It’s nothing new, and there are other solutions (like Nick Gravelyn’s Interpolators and Timers), but I tried to make it as concise and generic as I could.

Here it is, more comments after the listing.

using System;
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.GamerServices;
 
namespace Foo
{
    public class WaitUntil : IGameComponent, IUpdateable
    {
        public static Game Game { private get; set; }
 
        public static void GuideDisappears(Action onValid)
        {
            Game.Components.Add(new WaitUntil(_ => !Guide.IsVisible, onValid));
        }
 
        public static void TimePassed(float secondsToWait, Action onValid)
        {
            Game.Components.Add(new WaitUntil(waited => (waited as TimeKeeper).Elapsed.TotalSeconds > secondsToWait,
                                              (elapsed, waited) => (waited as TimeKeeper).Elapsed += elapsed,
                                              onValid, new TimeKeeper()));
        }
        class TimeKeeper { public TimeSpan Elapsed; }
 
        readonly Func<object, bool> condition;
        readonly Action<TimeSpan, object> whileWaiting;
        readonly Action onValid;
        readonly object state;
 
        WaitUntil(Func<object, bool> condition, Action onValid) : this(condition, ActionHelper.NullAction, onValid) { }
        WaitUntil(Func<object, bool> condition, Action<TimeSpan, object> whileWaiting, Action onValid) : this(condition, whileWaiting, onValid, null) { }
        WaitUntil(Func<object, bool> condition, Action<TimeSpan, object> whileWaiting, Action onValid, object state) 
        {
            this.condition = condition;
            this.whileWaiting = whileWaiting;
            this.onValid = onValid;
            this.state = state;
        }
 
        public void Update(GameTime gameTime)
        {
            if (condition(state))
            {
                onValid();
                Game.Components.Remove(this);
            }
            else
                whileWaiting(gameTime.ElapsedGameTime, state);
        }
 
        #region Stuff we don't care about
        public void Initialize() { }
        public bool Enabled { get { return true; } }
        public event EventHandler EnabledChanged;
        public event EventHandler UpdateOrderChanged;
        public int UpdateOrder { get { return 0; } }
        #endregion
    }
 
    public static class ActionHelper
    {
        public static void NullAction<T, U>(T t, U u) { }
    }
}

I ended up using basically a GameComponent, which means I need access to a Game instance to add it and remove it from the component collection. I decided to use a static field (that you assign in the Game’s constructor) to avoid passing it every time. It’s very unlikely that the Game instance will change or be destroyed… and I already static-ified it in my ServiceHelper earlier anyway.

I also wrote a couple (okay, two) static factory methods that are slightly fluent-interface-ey.

// (from the context of your Game class implementation)
{
    // Say "OK" when the guide stops being visible, like this?
    Components.Add(new WaitUntil(_ => !Guide.IsVisible, () => Console.WriteLine("OK!")));
    // ...or like this!
    WaitUntil.GuideDisappears(() => Console.WriteLine("OK!"));
 
    // And while we're at it...
    WaitForTwoSeconds();
}
void WaitForTwoSeconds()
{
    Console.WriteLine("Will wait for two seconds...");
    // ...recursive timers!
    WaitUntil.TimePassed(2, WaitForTwoSeconds);
}

I’ll probably add new factory methods as the needs arise, and make the class overall more useful, but I feel like it’s a good start.
I used the “GuideDisappears” method when a gamer signs out and I want to show a warning message using the Xbox Guide before going back to a sign-up screen… but since signing out is usually performed from the Guide itself, you have to wait for it to close before doing anything. This seemed like the simplest solution, and it works great.

2 responses so far

Jul 09 2009

Quick Tip : Stencil Buffer Comparisons

Published by Renaud Bédard under XNA

I just realized something today about the stencil buffer, which I love and use extensively in Fez to do tricky screen-space effects and keep everything single-pass.
Even though the way it works is really standard, documented in the APIs and even has the exact same behavior in both DirectX and OpenGL… it kinda goes against what you’d expect.

Let’s say you want to only draw pixels where the current stencil buffer value is greater than 8. In your mind, it probably goes like this (in a fictitious “foreach pixel in screen” loop) :

if (stencilBuffer > 8) draw();

…and if so, I’m totally with you. You already wrote to the stencil buffer, and now you want to compare against it. Intuitively, the reference value would be at the right-hand-side of the comparison operator.

But nope, wrong way around! This is how the GPU expects you to think :

if (8 < stencilBuffer) draw();

This OpenGL page on the stencil buffer states it clearly :

GL_GREATER : [stencil test] passes if reference value is greater than stencil buffer

This applies for Less, LessEqual, Greater and GreaterEqual operators. When you test a reference value against the stencil buffer, you test it in that precise order : the reference value goes at the left-hand-side of the comparison operator.

Hope I saved you some confusion! I sure wasted an hour on this one…

No responses yet

May 23 2009

Top Sources of Heap Garbage

Published by Renaud Bédard under C#,XNA

In the last Xbox-related entry I posted, I mentioned how the CLR Profiler can be a very useful tool to know what are the biggest sources of heap garbage in your .NET project. I used it extensively in the past month to optimize my game to run on the Xbox, and here’s a rundown of my biggest programming “mistakes” (or problematic liberties?).

1. LINQ

Even the simplest LINQ queries like :

List<List<Potato>> potatoBags;
foreach (var potato in potatoBags.SelectMany(x => x))
{
  // ...
}

…will cause a noticeable amount of heap garbage. This includes Where clauses, OrderBy clauses, everything! It’s sad because I think that LINQ is a fantastic code-thinning tool, but it’s not an option on limited-memory systems like the Xbox.

That said, if you want to use LINQ at initialization/loading time, feel free to do so. The problems only arise in update/draw calls.

2. Automatic XNB Deserialization

At one point, I got sick or writing ContentTypeWriter and ContentTypeReader classes and started building an XNB automatic serializer and deserializer based on Alexander’s (John Doe?) work on the subject. On Windows, the load times remained the same and it greatly simplified or deleted many of my content pipeline classes.

But on the Xbox the load times were horrible. Even in Release and without the debugger attached, the load times were at least 5x as slow as on my PC. I then discovered that reflection calls generate a lot of heap garbage — and it’s not even clear if memory stays allocated or if it eventually gets compacted by the GC…

So I swallowed my pride and switched back to good old Reader/Writers. But hey, now the load times are super fast.

3. Using classes when you can use structs

Coming from a Java background, I’m very used to classes and I’ll use them for pretty much everything. Even data structures that get created at runtime, because it’s so handy to have references and everyone pointing to the same object…

Turns out using structs has a lot of advantages. Intuitively I thought that the by-copy parameter passing would just make everything slower, but if you keep your structures small enough it has little to no effect on performance. The fact that they reside on the stack and not on the heap makes them a much better option for the Xbox. So datatypes like collision result objects, object identifiers, anything that you need to create often when the game is running should be made into structs.

4. Object pools (are a good thing)

Sometimes you just need to dynamically allocate reference objects in your algorithms. Or even value-types can get allocated on the heap if they’re at class-local scope. But you can minimize the damage by using object pools!

They’re really easy to set up (I found this Ziggyware article to be a good starting point) and they’ll save you heap garbage by preallocating to the number of objects you’ll actually need, and extending the lifetime of objects that would otherwise be disposable trash.

5. Removing objects from a collection that you’re enumerating

I thought I had found a really good way to fix the old problem of “Collection was modified; enumeration operation may not execute” when you remove an object from a collection when you’re foreach’ing on it :

foreach (var potato in potatoBag.ToArray())
{
  if (potato.Expired)
    potatoBag.Remove(potato);
}

It’s pretty cute, no? Very little impact on the iteration code. But it also copies the whole collection to a brand new array everytime you’re enumerating it… :(

What I ended up doing instead to minimize garbage is :

// This is allocated once, at the class-level scope
readonly List<Potato> expiredPotatoes = new List<Potato>();
 
public void Update()
{
  // Standard update
  foreach (var potato in potatoBag)
  {
    if (potato.Expired)
      expiredPotatoes.Add(potato);
  }
  // Removing pass
  foreach (var expired in expiredPotatoes)
    potatoes.Remove(expired);
  expiredPotatoes.Clear();
}

It’s certainly heavier code-wise, but at least it’s clean. And it’s faster too.

6. Enums as Dictionary keys

If you use an enum as the TKey type parameter for a Dictionary<TKey, TValue> object, you’ll have a small amount of garbage generated everytime you access the dictionary. But there’s an easy way around it : you just need to build a Comparer class for the enum type (which is under 10 lines of code) and pass it to the constructor of your dictionary.

Cheers to Nick Gravelyn for pointing out a solution to that problem.

7. Collections should be pre-allocated

When possible, you should use the parameterized constructors of all your Lists, Dictionaries, HashSets and whatever other collection types that you use, such that their backing arrays are pre-allocated to the number of elements that you plan to add to them.

Starting them with the default parameterless constructor will force the collection to grow (using Array.Resize, which trashes the old array and creates a new, bigger one) until you filled it completely.

Conclusion

That’s it for now.
I know, 7 is a terrible number for a “Top N” list, but I can’t think of other major sources of garbage that I’ve encountered. The rest goes down to good programming practices. (don’t instantiate reference types all over the place, etc.)

Hope it helped!

10 responses so far

May 11 2009

A Shared Content Manager for XNA

Published by Renaud Bédard under C#,XNA

In an average-sized XNA game, you’ll end up having many levels using many art assets, with most of them sharing textures and models between each other. Using the standard ContentManager class, the basic approach is to load all of a level’s assets into a single ContentManager, and unload it when switching levels : this way there is no possible memory leak and memory usage is kept to a minimum.

But what about load times? Users usually want level transitions to be as seamless as possible, yet we can’t just pre-load everything, you gotta watch the memory budget…

Sharing is caring

One solution is to preserve shared assets : an asset that is loaded for Level #1 and re-used in Level #2 can be kept in memory instead of being destroyed and reloaded. Memory-wise it’s costless because you were about to reload it anyway; keeping it for a longer time has no negative effect.

A simple way to keep track of shared assets is to use reference counting : increment a counter whenever you ask to load an asset, and flush assets that have 0 references when you unload. But even the almighty Shawn Hargreaves thinks it’s a bad idea…

[...] reference counting sucks for all sorts of reasons I can’t be bothered to go into here. It is better than nothing, but falls short of the automatic, rapid development approach .NET developers have rightly come to expect.

Fair enough, but how about making asset disposal transparent by using the same ContentManager containers with the same public interface, yet use reference counting in the background?

I tried doing exactly that, and had great success with it, so I suggest you take a look at the code below and give it a shot!

public class SharedContentManager : ContentManager
{
    static CommonContentManager Common;
    List<string> loadedAssets;
 
    public SharedContentManager(IServiceProvider serviceProvider, string rootDirectory) 
        : base(serviceProvider, rootDirectory)
    {
        EnsureSharedInitialized();
        loadedAssets = new List<string>();
    }
 
    static void EnsureSharedInitialized() 
    {
        if (Common == null)
            Common = new CommonContentManager(ServiceProvider, RootDirectory);
    }
 
    // This is ripped straight off the ContentManager disassembled source...
    // Wouldn't have to do that if it were protected! :)
    internal static string GetCleanPath(string path)
    {
        // Ugly, boring code that you'll get if you download the codefile
    }
 
    public override T Load<T>(string assetName)
    {
        assetName = GetCleanPath(assetName);
        loadedAssets.Add(assetName);
        return Common.Load<T>(assetName);
    }
 
    public override void Unload()
    {
        if (loadedAssets == null)
            throw new ObjectDisposedException(typeof(SharedContentManager).Name);
 
        Common.Unload(this);
        loadedAssets = null;
 
        base.Unload();
    }
 
    class CommonContentManager : ContentManager
    {
        readonly Dictionary<string, ReferencedAsset> references;
 
        public CommonContentManager(IServiceProvider serviceProvider, string rootDirectory) 
            : base(serviceProvider, rootDirectory)
        {
            references = new Dictionary<string, ReferencedAsset>();
        }
 
 
        public override T Load<T>(string assetName)
        {
            assetName = GetCleanPath(assetName);
 
            ReferencedAsset refAsset;
            if (!references.TryGetValue(assetName, out refAsset))
            {
                refAsset = new ReferencedAsset { Asset = ReadAsset<T>(assetName, null) };
                references.Add(assetName, refAsset);
            }
            refAsset.References++;
 
            return (T) refAsset.Asset;
        }
 
        public void Unload(SharedContentManager container)
        {
            foreach (var assetName in container.loadedAssets)
            {
                var refAsset = references[assetName];
                refAsset.References--;
                if (refAsset.References == 0)
                {
                    if (refAsset.Asset is IDisposable)
                        (refAsset.Asset as IDisposable).Dispose();
                    references.Remove(assetName);
                }
            }
        }
 
        class ReferencedAsset
        {
            public object Asset;
            public int References;
        }
    }
}

Notes

By design, the class assumes that all your content managers will have the same root path and use the same service provider. This version uses the constructor parameters of the first instance for all subsequent instances. It’s kind of redundant to pass those parameters everytime since they aren’t used after the first instance has been created, you can probably simplify and optimize that part (I did otherwise in my project but it’s tied to my engine code).

Content loading is not thread-safe with this method. The version I use in my project again uses a different way to initialize the common content manager and monitors, but I thought it made the implementation too heavy for demonstration… this too would need work if you use threaded loading.

It works if you use forward slashes for paths because of the GetCleanPath method. But fun fact, it treats paths and filenames as case-sensitive so it will reload assets if you change the case between loadings! So be careful with that, or fix it. :P

Usage

Here’s the procedure for level transitions :

// Create a content manager for the next level
var nextLevelCM = new SharedContentManager(Game.Services, Game.Content.RootDirectory);
 
// Load the content for this next level
var fooTexture = nextLevelCM.Load<Texture>("foo");
var barSound = nextLevelCM.Load<SoundEffect>("bar");
 
// Unload the current (old) level's content manager
currentLevelCM.Unload();
 
// Cycle
currentLevelCM = nextLevelCM;

If you unload the last level’s content manager before you load the next level’s content, all the assets will be reloaded, which renders my code useless. Make sure you follow that order!

The code can be downloaded here : SharedContentManager.cs (4 kB, XNA 3.0 / C#3.5)

And that’s it! Hope it works for you!

3 responses so far

Apr 18 2009

A Fast 2D Texture ContentImporter using DevIL.NET

Published by Renaud Bédard under XNA

As my XNA project grows larger and larger in terms of content, I try to isolate the bottlenecks of content compilation. Because even if theoretically all content builds are incremental, everytime you play with your content pipeline assemblies, everything needs to be rebuilt… and that gets seriously annoying if your build takes more than 5 minutes.

I noticed that the built-in TextureImporter isn’t exactly fast, even for very small textures (under 256×256). Looks like it uses a lot of “legacy” Direct3D code that takes a while to initialize and spends more time doing that than processing the texture. So I tried making my own simplistic TextureImporter to handle 2D textures with standard RGBA colors, but fast.

I started with the built-in System.Drawing.Bitmap class, but it doesn’t support a very wide array of image formats. I then tried using the managed wrapper for FreeImage and Tao.DevIl, both seemed a little backwards or clunky for the small amount of work I needed it to do. Then I gave DevIL.NET a shot and it does exactly what I want : load any image, get a GDI+ Bitmap object. Perfect!

Here’s the code of my FastTextureImporter, which only supports 2D textures, only parses the first mip level, and assumes a RGBA format, which should actually be fine for most textures. It supports the DDS format, which allows much more format options, but should not be used on 3D or Cube textures, and would also lose DXT compression.

using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Xna.Framework.Content.Pipeline;
using Microsoft.Xna.Framework.Content.Pipeline.Graphics;
using Color=Microsoft.Xna.Framework.Graphics.Color;
 
namespace MyContentPipeline.Importers
{
[ContentImporter(".bmp", ".cut", ".dcx", ".dds", ".ico", ".gif", ".jpg", ".lbm", ".lif", ".mdl", ".pcd", ".pcx", ".pic", ".png", ".pnm", ".psd", ".psp", ".raw", ".sgi", ".tga", ".tif", ".wal", ".act", ".pal", 
    DisplayName = "DevIL.NET Texture Importer", DefaultProcessor = "TextureProcessor")]
public class FastTextureImporter : ContentImporter<Texture2DContent>
{
    public override Texture2DContent Import(string filename, ContentImporterContext context)
    {
        var content = new Texture2DContent
        {
            Identity = new ContentIdentity(new FileInfo(filename).FullName, "DevIL.NET Texture Importer")
        };
 
        using (var bitmap = DevIL.DevIL.LoadBitmap(filename))
        {
            var bitmapData = bitmap.LockBits(new Rectangle(0, 0, bitmap.Width, bitmap.Height),
                                             ImageLockMode.ReadOnly, bitmap.PixelFormat);
 
            int byteCount = bitmapData.Stride * bitmap.Height;
            var bitmapBytes = new byte[byteCount];
            Marshal.Copy(bitmapData.Scan0, bitmapBytes, 0, byteCount);
 
            bitmap.UnlockBits(bitmapData);
 
            var bitmapContent = new PixelBitmapContent<Color>(bitmap.Width, bitmap.Height);
            bitmapContent.SetPixelData(bitmapBytes);
            content.Mipmaps.Add(bitmapContent);
        }
 
        content.Validate();
        return content;
    }
}
}

I tested it with all the 2D textures in my project, and the build speed difference is really noticeable. It also accepted all my textures, although I don’t use DXT compression nor exotic formats like A8/L8, so I can’t guarantee that it works for everything.

This class should go in a ContentPipelineExtension project which has a reference to the DevIL.NET2 assembly, and the DevIL.dll file in its build path. You can get those on DevIL.NET’s website; even though the last release dates from 2007, it works great.

2 responses so far

Apr 07 2009

Things you should know before/while making an Xbox XNA game

Published by Renaud Bédard under C#,XNA

So I started toying around (read: developing full-time) with Xbox programming using XNA GS 3.0. In fact I took a big Windows Game project with many satellite Game Libraries, a Content Pipeline Extension, a content editor, an automatic serialization library… and “ported” most of it to Xbox. But since the editor will remain Windows-based, the engine and most of the code needs to stay cross-platform, compatible with both Windows and Xbox.

And I hit a few walls.

I feel like these are things many people working with XNA will encounter. XNA’s been around for a while now, since version 1.0. Many of these things are already widely discussed in the blogosphere and forums. Also, I’m aware that GS 3.1 is around the corner and it’ll address at least the first point of my rant…

Still, here’s a handful of things that surprised or annoyed me in the transition :

1. ContentTypeReaders need to stay out of Content Pipeline Extension project(s)

Suppose you have a pretty big project that has custom datatypes, and those datatypes are compiled to XNB files using custom ContentTypeWriters and then read back using ContentTypeReaders. You usually need a Content Pipeline Extension project for that, and this project would reference your Engine or whatever project owns the datatypes that you want to compile.

Before very recently, I never quite understood why all official samples had the Reader classes in the Game project, while the Processors, Writers and Importers were all in the Content Pipeline Extension project. Why decouple it like that, and why join them with a fully-qualified assembly string in the GetRuntimeReader method of Writer classes? Moreover, putting Readers in my Extension project always worked in my Windows-only solutions, and it all felt nice and clean.

But when doing everything for Xbox and Windows, the reason becomes clear…

The Content Pipeline Extension project is a standard C# project in XNA GS 3.0. Not a “Game Library” project or any other special container. This means that it won’t be duplicated if you do an Xbox version, and it makes sense; you only need to compile content on your Windows machine.

So your Content Pipeline Extension needs to have a reference to your content datatypes, in some Game Library project. Since the Extension is for Windows only, it’d reference the Windows version of that Game Library. And then if your Readers are in the Extension, your Xbox game needs to reference it to load assets… which means the Xbox and Windows versions of your data structure Library project would coexist on the Xbox. This can’t work!

Besides, the Xbox project doesn’t need to access Processors, Importers and Writers. All it needs to be able to read content and then use it. These other content pipeline classes may even use Windows-specific assemblies like GDI+, why not? They’re certainly useful for image processing.

So bottom line, keep Readers in your Game project if it’s a small project, or in a Game Library for both Xbox and Windows. And if your Content Pipeline can reference this Readers-container, then no need for a hardcoded String for GetRuntimeReader, you can just get the assembly-qualified-name from Reflection classes!

2. The Xbox doesn’t like garbage

The first thing I noticed after I got the game running were hiccups in the framerate, every two seconds or so. But the framerate apart from that was a constant 60. I half-expected this,… it’s the garbage collection.

This paper by three people at the FZI Research Center for Information Technology explains it much better than I can, so I’ll just quote them… :

The .NET Framework on PC uses a generational approach, making garbage collections less painful and more resilient to large numbers of objects. With 512 MiB of GDDR3 memory shared between GPU and CPU the Xbox 360 garbage collector can’t afford such luxury.
The garbage collection of the .NET Compact Framework for the Xbox 360 always has to check all objects. Therefore the time a collection takes increases linearly with the number of objects. Further a collection will be triggered whenever 1 MiB of memory has been allocated.

This means you really need to stop carelessly allocating to the heap when doing an Xbox game. There are several “known causes” of heap garbage with the XNA Framework, but it’s easy to start going on a witch-hunt and replacing all foreach(in) statements by plain for(;;) or stuff like that… It’s a much better idea to find out what are the bottlenecks in your application and fix them starting by the bigger ones. You’ll probably end up solving most of the jittering without making your code look like C.

The above paper presents some options for memory profiling like the CLR Profiler and XNA Framework Remote Performance Monitor, both of which I have yet to try, but sounds like excellent free tools to address this issue.

Update : See this post for more information on typical causes of heap garbage.

3. The Compact .NET Framework needs your help

The Xbox .NET implementation is not the full-blown framework, it’s based on a subset called the Compact Framework, which is also used on mobile devices and embedded systems. This comes at a small cost : you have to complete it to fit your needs.

It’s actually pretty cool that LINQ is supported and all 3.0 features work flawlessly. But here’s a short list (off the top of my head) of things I found missing, some important, some easily worked around, all of them at least mildly annoying… :

  • Enum.GetValues(), Enum.GetNames(), Enum.GetName() : These are all missing from the CF. There is an old thread on the XNA forums that proposes alternatives that use Reflection. I found them to be working great.
  • ParameterizedThreadStart : You can’t start a Thread with a context object in the CF. You then need a shared context object in the parent class.
  • Type.GetInterface(string, [bool]) : You can’t query the interface of a type via reflection in the CF… at least not a single one by name. The GetInterfaces() method is supported, so might as well just use that.
  • Math.Log(double, double) : Actually, there is a Log(double) function, but it’s with the natural base. The custom-base one is not supported. Seriously? (and I know, the workaround is a one-liner)
  • HashSet : I love the .NET 3.5 HashSet generic class. It’s really complete, super fast… but the CF doesn’t have it. I ended up faking one with a Dictionary as a backing collection, and rewriting the set operators (UnionWith, IntersectWith, etc.) that I really used.

4. You’ll pretty much need a Content project

I don’t like the fact that in a standard XNA project, the content is compiled at build-time, in the Visual Studio IDE. For many reasons… one of them being that I’m not the one producing the content, the artist does, and he certainly doesn’t want to have Visual Studio installed. Another one being that my Content Processors are super heavy and VS sometimes crashes with a OutOfMemoryError before the build completes.

So what I did is write a content compiling tool that uses the MSBuild API to generate something like the .contentproj (yet simpler) based on the filesystem automatically, and compile it externally without needing Visual Studio. This works really great for Windows, we’ve been using it for months now.

But for Xbox… the deployment process is also tied to Visual Studio. And I’m not expert enough at MSBuild technologies to be able to replicate deployment outside of it. So I ended up making a content project only for the Xbox version of my Game project, and compile it separately when I need to test on XNA Game Studio Connect. This works OK, but I’m still a little unhappy about this whole Visual Studio dependency. I hope they look into it properly in the future.

So I was under the impression that you needed to have a content project for Xbox deployment, but Leaf (first comment) pointed out two ways of handling content compilation outside of Visual Studio. I’m definitely going to use the first one!

And…

These are the big points for now. Stuff I thought about adding : RenderTargets act different on Xbox and PC (but Shawn Hargreaves already blogged extensively on the subject and it’s way better than it used to be), Edit-And-Continue is not supported when debugging on the Xbox (but that would’ve been asking for the moon!), you get different warnings for shader compilation when targeting the Xbox360 platform so you should pay attention to that, etc. etc.

I’m still very much halfway through the conversion process, and I’m still learning, and still discovering oddities. If you have advice or corrections, please let me know through comments! On my part I’ll keep this post updated if I hit another big wall.

3 responses so far

Next »