Updated September 7th 2012 : New OggStream class with better support for concurrent stream playback.
I was looking for a suitable replacement for the audio streaming and compression capabilities of XACT when porting an XNA project to MonoGame, and it doesn’t look like there’s a clear winner yet. MonoGame contributors suggested NAudio, but it looks like work needs to be done to make it portable, and the sample code is a mess. FMOD Ex or competing commercial solutions are an easy but costly choice. So I turned to OpenAL to see if it could be a free and usable solution for streaming compressed audio with some DSP capabilities.
’Twas a bit challenging, but not impossible! :)
Out of the box, OpenAL doesn’t support being fed MP3 or OGG sources. There are extensions for those, but according to one implementation, they’re deprecated. So you need to handle decoding yourself and feed the PCM bitstream to OpenAL.
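To make "feed the PCM bitstream" a little more concrete, here is a minimal sketch in Python (used only as a neutral illustration language, not the project's C#). It assumes the decoder hands back floating-point samples in the range [-1, 1] and that the audio API wants little-endian signed 16-bit PCM; both are assumptions for the example, and `floats_to_pcm16` is a made-up name.

```python
import struct


def floats_to_pcm16(samples):
    """Convert float samples in [-1.0, 1.0] to little-endian signed 16-bit PCM bytes."""
    out = bytearray()
    for s in samples:
        # Clamp first so out-of-range samples saturate instead of overflowing.
        s = max(-1.0, min(1.0, s))
        out += struct.pack("<h", int(round(s * 32767)))
    return bytes(out)
```

Each sample becomes two bytes, so a stereo buffer of N frames ends up 4 × N bytes long.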
It sure would be nice to have a purely managed implementation of libVorbis, but it doesn’t exist, so there are a dozen homemade decoders floating around open source code hubs in various states of workability. I was pointed to NVorbis by TheGrandHero on the TIGSource forums, and I haven’t found a better alternative yet. CsVorbis is another, but it doesn’t support streaming: all the decoding is done up-front, which defeats the purpose. OggSharp is just a fork of CsVorbis with XNA helpers, so nope. TheGrandHero also mentioned trying out DragonOgg but having problems with it.
NVorbis worked like a charm for me, but it’s pretty early and doesn’t support some features like seeking around the stream, so looping or restarting playback requires creating a whole new reader/decoder. I also took some time to optimize the memory usage in my fork of the project.
07/09/2012 Update : Andrew Ward, the author of NVorbis, resolved the memory allocation problems that the version I forked off had, so I pulled the new changes out instead.
Once you have some decoded data, you have to make OpenAL stream it. This is sort of tricky but well-documented.
(this image shamelessly stolen from Ben Britten’s blog entry linked above)
The basic idea is the following :
- Generate one OpenAL source for your sound file, like XACT cues
- Generate 2 or more OpenAL buffers
- Fill at least one of those with the first samples of the sound and enqueue it/them to the source
- Start playback of the source; it’ll play all the buffers associated with it, in order
- In a background thread :
  - Query the source to know whether buffers have already been processed
  - If so, dequeue those buffers, refill them with fresh data and re-enqueue them
In practice, since it involves threads, it’s a bit more involved than the pseudo-code, but OpenAL makes it relatively painless. The trick is to read enough data, and often enough, to avoid buffer underruns.
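The queue-and-refill cycle can be sketched without any audio hardware. This is a single-threaded Python simulation, not the OggStream code: `FakeSource`, `fill` and `stream` are hypothetical names, the "device" plays exactly one buffer per loop iteration, and the buffers are absurdly small so the behaviour is easy to follow.

```python
from collections import deque

BUFFER_SAMPLES = 4  # tiny on purpose; real buffers hold thousands of samples


class FakeSource:
    """Stand-in for an audio source: a FIFO of queued buffers (hypothetical)."""

    def __init__(self):
        self.queued = deque()     # buffers waiting to be played
        self.processed = deque()  # buffers already played, ready for reuse
        self.played = []          # every sample "heard" so far

    def queue(self, buf):
        self.queued.append(buf)

    def play_one(self):
        """Simulate the device consuming one whole buffer."""
        if not self.queued:
            return False  # nothing left to play (end of stream or underrun)
        buf = self.queued.popleft()
        self.played.extend(buf)
        self.processed.append(buf)
        return True

    def unqueue_processed(self):
        done = list(self.processed)
        self.processed.clear()
        return done


def fill(buf, decoder):
    """Refill buf in place from the decoder iterator; return samples read."""
    buf.clear()
    for sample in decoder:
        buf.append(sample)
        if len(buf) == BUFFER_SAMPLES:
            break
    return len(buf)


def stream(decoder, buffer_count=2):
    source = FakeSource()
    # Pre-fill the buffers and enqueue them before playback starts.
    for _ in range(buffer_count):
        buf = []
        if fill(buf, decoder):
            source.queue(buf)
    # Each iteration plays one buffer, then does the background thread's job:
    # dequeue what was processed, refill it, and re-enqueue it.
    while source.play_one():
        for buf in source.unqueue_processed():
            if fill(buf, decoder):
                source.queue(buf)
    return source.played
```

Calling `stream(iter(range(10)))` plays the ten samples back in order using only two recycled buffers. In a real player the refill step runs on its own thread and has to keep up with the device, which is where underruns come from.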
Then, if you want to loop the sound, it’s not as easy as setting the source’s “Looping” parameter to true, because the buffers never contain the full sound file. Instead of no longer feeding the buffers when you hit the end of the Ogg stream, you just start back at the beginning and feed continuously, which has the nice side-effect of being 100% gapless.
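Here is a rough Python sketch of that wrap-around refill, under the assumption that the decoder cannot seek and so has to be reopened at the end of the stream. `fill_looping` and `open_decoder` are hypothetical names, and the decoder is modelled as a plain iterator of samples.

```python
_END = object()  # sentinel marking "no more samples"


def fill_looping(buf, size, decoder, open_decoder):
    """Fill buf with `size` samples, reopening the decoder at end of stream.

    Returns the decoder that should be used for the next refill.
    """
    buf.clear()
    just_reopened = False
    while len(buf) < size:
        sample = next(decoder, _END)
        if sample is _END:
            if just_reopened:
                break  # a fresh decoder produced nothing: empty file, give up
            decoder = open_decoder()  # end of stream: start over from the top
            just_reopened = True
            continue
        just_reopened = False
        buf.append(sample)
    return decoder
```

Because the tail of the file and the head of the next pass land in the same buffer, the source never sees a seam, which is what makes the loop gapless.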
Filters and effects
Finally, I wanted to have one fancy effect that XACT provided : low-pass filtering. This is used extensively in FEZ as a gameplay mechanic, so I could hardly live without it in MonoGame ports.
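For readers unfamiliar with the effect, a low-pass filter attenuates fast changes in the signal and lets slow ones through, which is what gives the muffled sound. The Python sketch below is a textbook one-pole filter, shown only to illustrate the idea; it is not how XACT or any OpenAL implementation does it, and `low_pass` and `alpha` are names made up for the example.

```python
def low_pass(samples, alpha):
    """One-pole low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1]).

    alpha is in (0, 1]; smaller values filter more heavily,
    and alpha == 1 passes the signal through unchanged.
    """
    out = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out
```

Fed a rapidly alternating signal such as `[1.0, -1.0, 1.0, -1.0]` with `alpha = 0.5`, the output swings far less than the input, while a constant input is gradually let through.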
Thankfully, the OpenAL Effects Extension (EFX) provides cross-platform effects including filters, at least in theory. In reality, this depends on whether the driver implementation supports them, and even the Creative reference Windows implementation doesn’t on my system.
I was able to find a software implementation that does though, OpenAL Soft, and it’s cross-platform, so that bodes well.
To override the installed implementation, just supply the software DLL in the application’s directory and voilà. Had no problems with it up to now, performance or otherwise.
Plus, it comes with a console application that outputs which EFX and other extensions are supported in this implementation. This is handy to detect whether the right DLL’s been used, and helped me figure out that the Creative implementation didn’t support any filter. Here’s what it should say :
The result of all of this is an OggStream class that is in my fork of NVorbis on GitHub, which you can find here :
Update : Version 2.0 comes with a sample console application which allows you to test and visualize how different streams get buffered and when buffer underruns occur in a nice concise format. I’m really quite happy about it, give it a shot! Here’s how it looks :
Legend of the symbols that this app blurts out :
- (* means synchronous buffering (Prepare()) has started, and ) means it ended.
- . means that one buffer has been refilled with fresh samples
- | means that there are no more samples to consume from the sound file
- ! means that playback stopped because of a buffer underrun and had to be restarted
- } represent calls to
- ] represent calls to
- v in prefix means respectively that the stream is looping, fading the low-pass filter in/out or fading volume in/out
My code has only been tested on .NET on Windows, but I don’t see why it wouldn’t work on Mono as well.
Like all the unlicensed content on this blog, it’s public domain, but attribution is appreciated.