Fez – The Instruction Limit

FEZ 1.12

A bit more than a year ago, I sent this email to Ethan “flibitijibibo” Lee :

[Image: the email in question, about FNA]

And work on FEZ 1.12 officially started.

The goal of this large update to the Windows PC/Mac/Linux version of FEZ was the following :

Cut dependencies to OpenTK, the platform framework used by FEZ on Windows. I have had problems with it from the start, from sound card detection issues to windowing problems, to VSync and fullscreen issues… I wanted to give SDL 2.0 a shot, to see if it fares better.
Have more efficient music streaming. PC + Mac versions of FEZ used a C# Ogg Vorbis decoder called NVorbis, which seemed like a good idea because it would run on all platforms. I also wrote the streaming code that uses NVorbis and OpenAL, and it made its way into the main MonoGame repository! But it’s also very slow, resource-intensive and heavy on disk access. So I wanted to look into a better solution that wouldn’t break music playback in areas like puzzle rooms and the industrial world.
Have a single codebase for all PC + Mac versions of FEZ. As it stood with 1.11, there was a slightly modified codebase for Mac and Linux that ran on a weird hybrid of MonoGame and what would become FNA, called MG-SDL2. The PC version ran on my fork of MonoGame ~3.0 which I did not do a great job of keeping up to date with upstream changes, because when I did it usually broke in mysterious ways. This is not great for maintenance, and centralizing everything on a clean FNA back-end, with as little platform-specific code as I could, seemed like a good idea.
Make it the Last Update. Since I shipped FEZ 1.11 I had little intention of adding more fixes or features to the game because I simply don’t have the time with a kid and a full-time job… and working on FEZ is getting old after 9 years. So I did want to address problems that people have with the game, but I don’t want to do it for the rest of my life. I had spent enough time away from the game that I was somewhat enthusiastic about coming back to it, especially if it’s at my own pace, and if it’s my last time doing so.

So I didn’t announce anything, I didn’t announce a date, and I slowly chipped away at making this humongous update to FEZ. It’s been in beta-testing with an army of fans, speedrunners and friends since late January 2016 and over 120 bugs have been reported and fixed.

You can read the full 1.12 change log here (it’s also bundled with builds of the game), but I wanted to cover in more detail a few big changes that caused more headaches than I had anticipated.

Singlethreaded OpenGL

FEZ uses a loading thread so it doesn’t block the draw/update loop while levels are loaded. This loading process includes file loading, but the bulk of load times are spent interpreting loaded data : building models, uploading textures to GPU memory, creating shaders, calculating helper structures for collision, etc.

The original XNA version of the game did everything on the loading thread and DirectX 9 somehow made sure that everything would be just fine, even if calls to the graphics driver were made on a thread that wasn’t the main thread.

From version 1.0 to 1.07 of FEZ (PC/Mac/Linux), I used the background OpenGL context that MonoGame provides, so that I didn’t have to retool all my loading code, or slow everything down by synchronizing to the main thread on every GL call. This worked fine… mostly, depending on the driver, on some platforms. My development setup worked pretty well, but I had reports of awful load times on Mac, and on AMD graphics cards; clearly this wasn’t good enough.

In FEZ 1.08, I got rid of the background context in favor of a call queue for GL operations that could wait until the next draw to be done (in order), and blocking GL calls for the ones that needed to be done instantly. This was minimally invasive and worked pretty well, but slowed down load times for setups that ran fine before. Also, what is considered a GL call or not is not known or defined by FEZ at this point; MonoGame internally checks whether we’re on the main thread for every XNA call that involves GL, and if not, uses a closure to defer the call to the main thread (usually by blocking). This was good enough, but not great.

When switching to FNA, I wanted to take advantage of its “Disable Threading” mode which boosts performance and lowers GC strain because it doesn’t need to check whether you’re on the main thread on every graphics function; if you guarantee that you’ll never, ever use them on a secondary thread, it takes your word for it! This means that FEZ would need to do deferral/blocking itself. The loading code had to be retooled and verified throughout the game.

I ended up using a simple method very similar to what MonoGame did : the Draw Action Scheduler. I got rid of the “I need this now!” blocking calls (e.g. loading the sky texture and then instantly after, sampling its pixels for the fog color), and made sure that FEZ could load and process everything on its loading thread before the first draw could be executed, which dequeues and executes these draw actions. To keep the smoothness benefits of having a loading thread, I had to tweak granularity; sometimes it’s better to have a bunch of smaller actions that can be run while the loading screen renders, instead of having one big task that causes lost frames.

Here’s a fun one : I didn’t want to change FNA’s code, and I wanted a Texture2DReader that’s safe to call on a loading thread… so I wrote a FutureTexture2DReader that does file reading inline, but then lets you upload the texture to GPU in a second step :

var futureAtlas = input.ReadObject(FutureTexture2DReader.Instance);
DrawActionScheduler.Schedule(() => existingInstance.TextureAtlas = futureAtlas.Create());
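To make the two-step idea concrete, here is a minimal sketch in Python rather than the game's C#. The class and function names are hypothetical stand-ins, not FEZ's or FNA's actual API: the loading thread only reads data and queues the upload, and the main thread runs the queued actions before drawing.

```python
import threading
from collections import deque


class DrawActionScheduler:
    """Hypothetical stand-in for the scheduler described above."""

    def __init__(self):
        self._lock = threading.Lock()
        self._actions = deque()

    def schedule(self, action):
        # Called from the loading thread: only record the work.
        with self._lock:
            self._actions.append(action)

    def process(self):
        # Called from the main thread at the start of a draw:
        # run everything queued so far, in order.
        while True:
            with self._lock:
                if not self._actions:
                    return
                action = self._actions.popleft()
            action()


class FutureTexture:
    """Holds decoded pixels until the main thread can upload them."""

    def __init__(self, pixels):
        self.pixels = pixels

    def create(self):
        # A real implementation would do the GPU upload here.
        return ("texture", len(self.pixels))


scheduler = DrawActionScheduler()
atlas = {}


def load_level():
    future = FutureTexture(pixels=[0] * 16)  # file reading happens inline
    scheduler.schedule(lambda: atlas.update(texture=future.create()))


loader = threading.Thread(target=load_level)
loader.start()
loader.join()
scheduler.process()  # main thread, before the first draw
```

Splitting the work into many small actions, as described above, would simply mean more calls to schedule() with less work in each.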

I also realized that there are a lot of hidden GL operations here and there, that only happen in some circumstances, and that can blow the game up big-time if you’re not careful. There’s no safety net in FNA’s no-threading mode, so you have to be really confident that it’s 100% covered. I’m pretty confident, after a year. :)

Screen scaling modes

The saga of FEZ resolutions and black-bars is hard to justify. The excuses have varied from “it was meant to be run in 720p, so we only offer multiples of that” to “okay I guess we can do 1080p but it won’t look great” to “okay I guess we don’t actually have to pillarbox, but sometimes we’ll still letterbox”.

I think most people will be happy with the implementation we chose to go with in 1.12 :

No more black bars. The handful of situations where black bars were still required (mostly because I had been lazy and assumed a 16:9 aspect ratio) have been reworked. The only downside is that you might see the vertical ends of a level if you try hard enough, but it’s worth the presentation overhaul. One sensible exception : if you use a resolution that does not match your display adapter’s aspect ratio in fullscreen, the game auto-detects it and adds black bars so that the game does not appear distorted.
You choose the scaling mode. Not literally 1x/2x/3x because specific levels have control over that, but you choose whether you want the intended zoom level, prefer pixel-perfect scaling (which may cause a wider-than-intended zoom), or want to compromise with a supersampled view at the intended zoom level. The latter option is my favorite because it has no impact on pixel-perfect resolutions (like 720p and 1440p) but, for instance, will render with a 1440p backbuffer in 1080p in order to provide minimal pixel crawl/jitter… and provides anti-aliasing in first-person mode and whenever rotating objects are used. It’s a bit softer so people might prefer not to use it; but it’s an option!
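As a rough illustration of the supersampled option, the sketch below (Python, not the game's code) assumes a 720p reference height and a "round up to the next whole multiple" rule. Both are assumptions chosen to match the 1080p to 1440p example above; the real game also lets individual levels control their zoom.

```python
import math

BASE_HEIGHT = 720  # assumed reference height ("meant to be run in 720p")


def pixel_perfect_scale(display_height):
    # Largest whole multiple of the base that fits; the view ends up wider
    # than intended when the display is not an exact multiple.
    return max(1, display_height // BASE_HEIGHT)


def supersampled_backbuffer_height(display_height):
    # Smallest whole multiple of the base that covers the display;
    # the result is then scaled down to the display resolution.
    return math.ceil(display_height / BASE_HEIGHT) * BASE_HEIGHT
```

With these assumptions, 720p and 1440p displays get a backbuffer of their own size, while a 1080p display gets a 1440p backbuffer.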

Visual interpolation between fixed timesteps

Let me tell you the story of a .NET API misunderstanding that has deep consequences…

Let’s say you use the TimeSpan.FromSeconds() factory method to make a TimeSpan object that represents the duration of a 60hz frame. You’d use it like this :

var frameDuration = TimeSpan.FromSeconds(1 / 60.0);

And I wouldn’t blame you for it. The method takes a double, there’s no indication that it would do any kind of rounding… but if you read the documentation :

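[Image: excerpt from the TimeSpan.FromSeconds() documentation, which describes the value as “a number of seconds, accurate to the nearest millisecond”]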

It’s not because TimeSpan’s precision stops at milliseconds; it stores everything in ticks. It makes no sense. But here we are.

For the 5 years of its development, FEZ was designed with a 17ms timestep because of this issue. The physics were tweaked with a 17ms fixed timestep in mind, and yep, it skipped frames like it’s nobody’s business. Because it ran at approximately 58.8235294 frames per second, instead of the intended 60.
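The arithmetic is easy to reproduce. This Python sketch only mimics the rounding behaviour described above (it is not the .NET implementation); the one outside fact it relies on is that a .NET tick is 100 nanoseconds.

```python
TICKS_PER_SECOND = 10_000_000  # a .NET tick is 100 nanoseconds


def from_seconds_rounded(seconds):
    # Mimics the behaviour described above: round to the nearest
    # millisecond, then store the result in ticks.
    milliseconds = round(seconds * 1000)
    return milliseconds * (TICKS_PER_SECOND // 1000)


ticks = from_seconds_rounded(1 / 60.0)
step_seconds = ticks / TICKS_PER_SECOND
updates_per_second = 1 / step_seconds
```

Here 16.67ms rounds up to 17ms, which gives 170,000 ticks per step and roughly 58.82 updates per second. Building the duration directly from ticks (for example with TimeSpan.FromTicks) is one way to sidestep the rounding.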

To help with this in FEZ 1.08 (PC/Mac/Linux), I decoupled the update and draw calls such that more than one draw can be done for one engine update. This eliminates tearing with a 60hz V-Sync, but once or twice every second, the same frame is presented twice in a row to the screen, which makes the game feel jittery. It was relatively minor, so I let it slide.

Fast-forward to 1.12, in which I decide to try and support my fancy new 120hz monitor properly. Drawing frames twice isn’t exactly great : it does make the game match the monitor’s synchronization, but it doesn’t look any better than 60hz. Then Gyoo is play-testing the game and notices the original issue, that the game jitters even at 60hz… and it sinks in that the same root cause is locking the game at an update framerate that doesn’t make any sense. I have to do something about it.

There are two ways to go here, and both are painful : interpolate, or switch to a variable time-step. I had already tried the latter option when still developing the game for Xbox 360 and it’s very hard to pull off. The potential for hard-to-reproduce physics bugs is real, and it would mean retesting the whole game many times until we get it right. However, some parts of the game are easy to transition to a variable timestep in isolation since they don’t depend on the game’s physics, or have no impact on gameplay. So I went with a hybrid solution :

Gomez uses interpolation. This means that for every update, the next frame’s position is also computed, and when Gomez gets drawn, he gets interpolated to the right position between those two frames depending on timing.
The camera, sky, moving platforms, grabbed/held cubes and cube bits use a variable time-step. These could relatively easily be transitioned to compute their movement/position per-draw instead of per-update, which was an instantaneous boost to fluidity, especially at framerates higher than 60hz.
Everything else is still at 17ms fixed timesteps. It turns out that it doesn’t matter for most entities to have fully smoothed movement, especially at that chunky world resolution, and with a fully smooth camera. So I stopped there.
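For readers who have not seen the interpolation approach before, here is a minimal Python sketch of the common accumulator pattern. It is an assumption about the general technique, not FEZ's code: updates run at a fixed step, two consecutive states are kept, and each draw blends between them based on how far into the step the frame falls.

```python
def lerp(a, b, t):
    return a + (b - a) * t


class InterpolatedBody:
    """Keeps two consecutive fixed-step states so draws can blend them."""

    def __init__(self, position, velocity):
        self.previous = position
        self.current = position
        self.velocity = velocity

    def fixed_update(self, step):
        self.previous = self.current
        self.current = self.current + self.velocity * step

    def draw_position(self, alpha):
        # alpha in [0, 1): how far the draw falls between the two states
        return lerp(self.previous, self.current, alpha)


def run_frames(body, frame_durations, step):
    accumulator = 0.0
    drawn = []
    for frame_duration in frame_durations:
        accumulator += frame_duration
        while accumulator >= step:
            body.fixed_update(step)
            accumulator -= step
        drawn.append(body.draw_position(accumulator / step))
    return drawn


# Two draws per update, with values chosen so the floats stay exact.
positions = run_frames(InterpolatedBody(0.0, 1.0), [0.25] * 4, step=0.5)
```

In this example the drawn positions advance on every draw even though only two updates ran, which is the effect that makes a 120hz display worthwhile with a roughly 60hz simulation.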

This sounds like a fun time, but I’ve been working on regressions that these changes caused since I started the task back in April 2016. There were a lot of corner cases, places where the camera’s position was assumed to be the same in matching update and draw calls, jittering stars and parallaxed elements and… the list goes on. But the result is there : the game looks really, really good in 120hz right now.

Additional reading

Ethan wrote a whole series of posts during FEZ 1.12 development that covers other things that the patch addresses :

Special Thanks

First off, huge, HUGE thanks to Ethan for making FNA in the first place, motivating me to finish this patch, and working tirelessly on it since day one. Your support, help and dedication blow me away. If you appreciate the work he’s put into FEZ 1.12, consider supporting him on Patreon.

And then, the amazingly helpful testers that have been doing 1.12 testing in their free time to help the project, in no particular order (and I hope I didn’t forget anyone!) :

Thank you all so much, you made this possible. <3

As you can see, a lot of them are speedrunners. I’m very grateful for the passion of the FEZ speedrunning community; watching things like Vulajin run FEZ at AGDQ last year was amazing, and I wanted to make the game solid for you guys!

Thanks for reading, and please enjoy FEZ 1.12!

Wrap texture addressing within a sprite sheet or atlas

FEZ shipped with volume textures (aka 3D textures) for all the sprite animations in the game. Gomez, NPCs and other animated pixel art were all done using those. This was a tech call that I made way back in 2008 and kept with it because it makes more sense than you might think :

No need to do texture packing and keep track of where frames are in the sheet; a volume texture is an ordered list of 2D textures, every frame is a slice!
The pixel shader just does a tex3D() call with the Z component of the texture coordinates being the step of the animation between 0 and 1.
Cool side-effect : hardware linear interpolation between animation frames! This wasn’t very useful for me (except for one thing, water caustic overlays), but it’s a nice bonus.
Mip-mapping with 3D textures is problematic because it downsizes in X, Y and Z, meaning that each mip level halves the number of frames. However, I didn’t need mip mapping at all (for sprites), I never undersample pixel art.
Same limitation when making a volume texture power-of-two, it also goes power-of-two in the Z axis which means a lot of blank frames, which is wasteful but not a huge problem to deal with.

But while I haven’t done real testing, one can assume that they’re slower than a regular 2D sprite sheet, and they imply that you have one texture per animation, which restricts how much you can pack things together. Creating a volume texture at load-time with XNA Texture2D.SetData() calls means one call per animation frame, which is noticeably slow. Also, volume textures are not currently supported by MonoGame, and I assume some integrated graphics hardware would have trouble dealing with them.

So the more traditional alternative is using a sprite sheet, which is easy to make using tools like the Sprite Sheet Packer.

But then what if you need to use wrap texture addressing on it, to have horizontally and/or vertically repeating textures?

If you only repeat on one axis, have relatively small textures and a small number of frames, you can force the texture packer to lay out the sprites on a single row or column, which allows wrapping on the other axis.

This worked for some animations, but some were just too big or had too many frames to fit in under 4096 pixels. In that case, there’s one final option : pixel shaders to the rescue!

When addressing the texture in your shader, you’re likely to use a 3×3 texture matrix, or a 4D vector if you’re short on input parameters. Either way, you have four components : UV offset and UV scale. You can use those to manually wrap the texture coordinates on a per-pixel basis. In the sample below, I extract the data from a texture matrix.

Vertex Shader

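// Transform the mesh UVs into the sprite's region of the atlas.
Out.TextureCoordinates = mul(float3(In.TextureCoordinates, 1), Matrices_Texture).xy;
// Pass the sprite's UV offset and UV scale along for per-pixel wrapping.
Out.UVMinimum = Matrices_Texture[2].xy;
Out.UVScale = float2(Matrices_Texture[0][0], Matrices_Texture[1][1]);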

Pixel Shader

float2 tc = In.TextureCoordinates;
// Normalize to the sprite's own 0..1 range, wrap with frac(), then map back into the atlas.
tc = frac((tc - In.UVMinimum) / In.UVScale) * In.UVScale + In.UVMinimum;
float4 sample = tex2D(AnimatedSampler, tc);

The frac() HLSL intrinsic retains the decimal part of its input, which gives the normalized portion of the texture that the coordinates are supposed to show. Then I remap that to the sprite’s area in the atlas, and sample using those.
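The same math can be checked on the CPU. This Python sketch mirrors the pixel shader line above; the function names are made up for the example, and frac() follows the HLSL convention of value minus floor(value), so coordinates to the left of the sprite wrap correctly too.

```python
import math


def frac(value):
    # Same convention as the HLSL intrinsic: value - floor(value).
    return value - math.floor(value)


def wrap_in_atlas(tc, uv_minimum, uv_scale):
    # Per component: normalize to the sprite's 0..1 range, wrap,
    # then remap into the sprite's area of the atlas.
    return tuple(
        frac((c - lo) / scale) * scale + lo
        for c, lo, scale in zip(tc, uv_minimum, uv_scale)
    )


# A sprite occupying [0.25, 0.75) of the atlas on both axes.
wrapped = wrap_in_atlas((0.875, 0.125), (0.25, 0.25), (0.5, 0.5))
```

Both input coordinates fall outside the sprite's area, one past its right edge and one before its bottom edge, and both land back inside it after wrapping.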

I ended up only needing wrapping on one axis for that big texture/animation, but this code does both just in case. This is WAY simpler than customizing the vertex texture coordinates to allow wrapping.
One caveat though, this won’t play well with linear filtering. Since FEZ is pixel art, I could get away with point sampling and had no artifacts there.

P.S. A simple fix to enable usage of linear filtering : pad each sprite with a 1-pixel border of columns and rows copied from its opposite side! (and don’t include those in the sampled area; they only get sampled by the interpolator)

Cubes All The Way Down @ IGS (GDC)

This again?!

I re-did my slides and my talk at the Independent Games Summit of the GDC 2012. It grew from a measly 42 slides to a healthy 62, so there is more content and many more videos, and it incorporates some of the feedback I had about the MIGS version.
Update : it’s on the GDC Vault, (no membership required!) if you want to see me give the presentation.

Without further ado, here are the slides in different formats :

It’s Cubes All The Way Down (PDF format) – (PDF with Notes) – (PPTX format)

And you can download the associated Videos and songs (179Mb!)

Cubes All The Way Down @ MIGS

Back in November 2011, I gave a talk at the Montréal International Game Summit in the Technology track called “Cubes All The Way Down”, where I talked about how FEZ was built, what its big modules are, and the challenges and intricacies of making a tech-heavy indie game from scratch.

It went okay.
I was really stressed, a bit unprepared due to FEZ crunch time, and just generally uncomfortable speaking in front of an audience.
I spoke so fast that I finished 15 minutes early and had 30 minutes for questions, which worked great for me : the relaxed setting of a Q&A session meant better flow and better information delivery, and I really liked that part. Also I had friends in the front row that kept asking good questions and were generally supportive, so all in all a good experience. :)

I was asked about giving the slides out, so here they are! Unedited.

It’s Cubes All The Way Down (Powerpoint 2007 PPTX format) (PDF format)

Enjoy!

Behind Fez : Collision and Physics

Here’s the third part of the “Behind Fez” series (part 1, part 2), in which I’ll try to explain how collision detection and physics work in Fez.

And yes, Fez is still actively developed in all areas. Making a game on your own : IT’S HARD.

Collision Engine as of Early 2008

Back when we made the IGF 2008 build, we had at least two massive limitations that made culling and collision detection very simple :

The world was completely static. No moving platforms, no physics except for the player sprite, you can assume that if something is at one place at level-loading time, it’ll stay there until you quit the game. (as a matter of fact there WAS only a single level, but that’s another topic)
Everything was aligned to the world grid. Everything took up a full cell worth of collision boundaries, nothing bigger or smaller than 16x16x16 trixels – exactly one trile.

The super-blocky world of Fez circa 2008

This allowed me to only calculate collision detection of the player’s collision rectangle (for each of its 4 vertices, point-to-line) whenever it traversed a world grid “line”, and since this was done very rarely, optimization was not an issue, and I went with the most intuitive and naive way possible.

Consider the world as a 3D array (or whatever indexed data structure you can think of) with filled or empty spaces, and each filled space containing visual and physical information. Visual information consists of the polygonal mesh, the textures, etc. Physical information defines if that trile should collide with the player, and from which of its 2D boundaries.
We decided early on that the three possible “collision types” are : no collision, top-only collision (for fall-through/climbable platforms) and all-sides collision (for blocking level boundaries or obstacles).

The three different collision types as seen in Fezzer

This way of mapping world entities with their collision information is elegant because the level designer doesn’t need to paint a separate collision map, or add invisible objects that act as colliders. It also means that any change you make to the level visually is propagated physically to how it plays.

Fez is obviously played from a 2D perspective. The collision results must match what the player sees, and visibility works front-to-back, with only the top-most layer being visible and active.
Knowing the collision type of each and every space (if filled), it’s easy to find the 1D “row” of possible colliders if you have the 2D screen coordinates in hand. Then you just traverse front-to-back, and the first hit is kept, at which point you can early-out from the loop.
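A minimal Python sketch of that traversal follows. The dictionary-based world, the axis convention (z = 0 nearest to the camera) and the choice to stop at the first filled cell whatever its collision type are all assumptions made for the example, not the engine's actual data structures.

```python
from enum import Enum


class Collision(Enum):
    NONE = 0
    TOP_ONLY = 1
    ALL_SIDES = 2


def first_hit(world, x, y, depth):
    """Return (z, collision type) of the front-most filled cell, or None."""
    for z in range(depth):          # z = 0 is nearest to the camera
        collision = world.get((x, y, z))
        if collision is not None:   # filled cell: keep it and early-out
            return z, collision
    return None


# Sparse world: only filled cells are stored.
world = {
    (4, 2, 5): Collision.ALL_SIDES,
    (4, 2, 9): Collision.TOP_ONLY,
    (7, 2, 3): Collision.TOP_ONLY,
}
hit = first_hit(world, 4, 2, depth=16)
```

The trile at depth 9 is never looked at for that row, because the one at depth 5 is in front of it.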

Depth “Collision”

So now I know what’s blocking the player in 2D. But we had to make additional rules for the Z position or depth of the player, so that the game would behave like a 2D platformer AND still make sense in the 3D world :

Gomez should stay visible. He should stay on-top of the world geometry as long as he doesn’t rotate the viewpoint. This is done by correcting the depth such that Gomez stands right in front of the geometry.
Gomez should never walk in mid-air. In 2D this is solved by the collision detection, but in the remaining axis it needs to be enforced, such that Gomez stands on the platform nearest to the camera (this is an arbitrary rule-of-thumb that we chose).
Otherwise, don’t change Gomez’s depth for no reason. The player expects it not to change. It’s really easy to get lost in Fez, and if the engine messes up the little spatial perception you’ve got left, it’s not fair anymore.

A hilarious mock-up that shows rules 1 & 2 of depth adjustment

The player will never see that Gomez moves around in the Z axis because the view is flattened and it has absolutely no depth perception, so we can do all we want to ensure that rules 1 & 2 are enforced.

Breaking the Grid (Late 2008 to Late 2009)

So that was good until we decided to implement crate physics, moving platforms, offset triles and variable trile size. Then, this happened :

Every rule defined above has to be tested every time the player moves. If the triles aren’t aligned to the world grid, the “only test when a grid line is traversed” trick won’t work anymore.
For both culling and collision, the world grid stops being an exact reference of how the world appears/behaves, and becomes more of a helper structure where more than one trile can be in a cell, and some triles overlap many cells.
Collision stops being specific to the player, it needs to be generalized in order to support particles and other objects that should have all the same 2D/3D tricks.

The many joys of offset and oddly shaped triles

None of these problems is trivial, but the hardest by far to implement was #2. Thing is, I didn’t want to throw away everything I did and start over. So I made small, incremental changes until the new features were supported. And by then I had ~1.5 years worth of C# code to maintain…

To explain my final approach, I need to specify that “variable trile size” does not mean that a trile can be bigger than 16x16x16, only that its collision volume can be smaller.
With that in mind, here’s what I did :

A trile always has a majority of its volume within a single world cell, even if it’s oddly shaped or positioned arbitrarily. In other words, a single cell holds the center of a trile. This world cell is where it’s stored.
When colliding a vertex of a collision rectangle to the world, look up the 4 nearest 1D “rows” (in 2D screen space) of possible collider cells from the world grid. Traverse front-to-back each of those rows, and test if one of the triles contained at each level ACTUALLY collides with the point, taking in consideration the trile’s positional offset and size. The 4 neighbour rows need to be tested because triles within these rows may exceed the cell boundaries by up to 50%!
When triles move, update their location within the world grid only if the center changed to a new cell.
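The bookkeeping in these three steps can be sketched as follows, in Python and with hypothetical names. Positions are expressed in cell units (one cell is one trile wide), and cell centers are assumed to sit at half-integer coordinates; the actual collision test against each trile's offset and size is left out.

```python
import math
from collections import defaultdict


class WorldGrid:
    def __init__(self):
        self.cells = defaultdict(set)   # cell -> ids of triles stored there
        self.cell_of = {}               # trile id -> cell holding its center

    @staticmethod
    def cell_for(center):
        return tuple(math.floor(c) for c in center)

    def place(self, trile_id, center):
        """Store or move a trile; returns True only if its cell changed."""
        new_cell = self.cell_for(center)
        old_cell = self.cell_of.get(trile_id)
        if new_cell == old_cell:
            return False
        if old_cell is not None:
            self.cells[old_cell].discard(trile_id)
        self.cells[new_cell].add(trile_id)
        self.cell_of[trile_id] = new_cell
        return True


def nearest_rows(x, y):
    # The four screen-space cells whose centers are closest to the point.
    left, bottom = math.floor(x - 0.5), math.floor(y - 0.5)
    return [(left, bottom), (left + 1, bottom),
            (left, bottom + 1), (left + 1, bottom + 1)]


grid = WorldGrid()
grid.place("crate", (2.5, 0.5, 0.5))
moved_within_cell = grid.place("crate", (2.9, 0.5, 0.5))
moved_to_new_cell = grid.place("crate", (3.1, 0.5, 0.5))
```

Small movements inside a cell cost nothing, and only a center crossing a cell boundary touches the grid.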

So everything’s covered, we’re good! Right?

Optimization

But it was slow as molasses. I do many, many more checks than I did before, and especially on the Xbox where the JIT compiler is less efficient, all those random accesses killed the game’s performance. Truly a case of CPU/Memory bottlenecking.

This section is a work-in-progress… As long as I’m maintaining/developing the game, I’ll worry about it going too slow. But here are the steps I’ve taken up to now :

After every camera rotation, cache the nearest and farthest trile for each screen-space world grid location. This way, I don’t have to loop through the entire level boundaries and test for trile presence, I know that within these cached bounds, I have data. Parts of the cache need to be invalidated every time an object moves. The caching process of the whole level has to be done in another thread while the rotation happens, else it pauses the game for ~250ms… And threading is a headache.
Simplify the algorithm for particles and other small objects. The player won’t notice if particles physics aren’t 100% accurate; I can reduce the collision points to a single centered one, and ignore some rules.
All the standard optimization techniques… Avoid dynamic memory allocations. Ensure cache spatial locality (still struggling with this one). Start up ANTS Profiler, find a bottleneck, eliminate it, rinse, repeat.
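The first step, caching the nearest and farthest trile per screen-space location, might look like this in a simplified Python form. It assumes the view looks along the z axis with smaller z nearer to the camera, and it ignores invalidation and threading entirely.

```python
def build_depth_bounds(filled_cells):
    """Map each screen-space (x, y) to its (nearest, farthest) filled depth."""
    bounds = {}
    for x, y, z in filled_cells:
        near, far = bounds.get((x, y), (z, z))
        bounds[(x, y)] = (min(near, z), max(far, z))
    return bounds


bounds = build_depth_bounds([(0, 0, 3), (0, 0, 7), (0, 0, 5), (1, 0, 2)])
```

A traversal for location (0, 0) can then start at depth 3 and stop after depth 7 instead of walking the level's full depth range.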

The problem with the world being so dynamic is that I can’t precache everything. I certainly can’t precache the collision result of every pixel of the screen every time a viewpoint rotation occurs.
Separating dynamic objects from static objects and treating their collisions separately is something I’d like to try if necessary. But it means so many changes to the current system that it scares me a little bit.

Exceptions

In mid-2008, we decided to implement something called Big Art Objects, which are like triles but bigger than 16³ (they can go up to 128³). They are sculpted like triles, but they don’t have any collision information attached to them, because they stick out of the “world grid” system.

A tree art object, with its collision triles faded in. (T = top-only, π = no-collide)

To make them look like they’re standard world objects, we fill them with invisible collision triles. (yes, I said we wouldn’t need those, but that’s a special case :P)
It’s worth it in the end, because they look fantastic and break the mold of lego-like blocky structures.

Another common exception is what we call immaterial triles. They’re no-collide triles that ALSO don’t make Gomez go in front of them. Strands of grass can pass in front of Gomez, it just looks better that way.

I could go on about the other exceptions, but then I’d reveal features that we haven’t announced or shown. So I’ll just stop now and let your imagination do the rest. :)

Behind Fez : Trixels (part two)

I decided that I would write a more detailed article about the rendering module of the Trixels engine. Many things changed implementation-wise since last year, and I feel that now is a good time to go public as it’s probably not going to change much anymore.

I really don’t mind “coming clean” about how things are done, since Fez certainly didn’t invent non-interpolated orthographic voxels or whatever you might call trixels in a more formal language. Trixels are just a technology support for Fez’s art style, not necessarily the best, but one that works and that I’d like to share.

You should probably check the first post I made about trixels to get the basic idea first.

Here’s the trile I’ll dissect for this post, in a perspective view :

Memory representation

Each trile at its creation is a full 16³ cube, without any holes. The editing process consists of carving trixels out of the full shape to get a more detailed object.

The very first version of Fez recorded all the trixels that are present inside a trile, which means an untouched trile would contain a list of all the possible positions inside the trile to say that these positions are filled.

It became obvious early on that most triles have a lot more matter than holes, so the second version recorded the missing trixels instead. But even that became hardly manageable, especially in the text-format/human readable intermediate trileset description files; they were still too immense to edit by hand if needed, and took a while to parse and load because of their size. The memory usage was questionably high too for the little data that was represented.

I tried using octrees, but that kinda failed too. Thankfully Saint on the tigsource forums gave me a better idea : to use missing trixel boxes, that is, the smallest number of the largest possible cuboids of missing trixels. The good thing about these is that a box is represented by a 3D vector for its size and a 3D point for its position, that’s it!

Here’s a visualization of what these boxes look like. Adjacent boxes have the same color.

The editor tries to keep these boxes as big as possible and their number as small as possible in realtime while you edit, but my algorithms aren’t perfect yet. A “best effort” scenario is fine though, a couple of superfluous boxes have little effect on memory usage/file size.
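As an illustration of how compact this representation is, here is a small Python sketch. The class and field names are invented for the example and say nothing about the editor's merging algorithm: a trile is just a list of carved-out boxes, and a trixel is filled if it lies inside the 16³ cube and inside none of them.

```python
from dataclasses import dataclass
from itertools import product

TRILE_SIZE = 16


@dataclass(frozen=True)
class MissingBox:
    position: tuple  # (x, y, z) of the box's minimum corner, in trixels
    size: tuple      # (width, height, depth), in trixels

    def contains(self, trixel):
        return all(p <= t < p + s
                   for t, p, s in zip(trixel, self.position, self.size))


def is_filled(missing_boxes, trixel):
    inside_trile = all(0 <= t < TRILE_SIZE for t in trixel)
    return inside_trile and not any(box.contains(trixel) for box in missing_boxes)


# Two carved regions described by four 3D vectors in total.
carved = [MissingBox((0, 12, 0), (16, 4, 8)), MissingBox((4, 4, 0), (2, 2, 16))]
filled_count = sum(is_filled(carved, t)
                   for t in product(range(TRILE_SIZE), repeat=3))
```

Those two boxes remove 576 of the 4096 trixels, which would otherwise need 576 individual entries in a "missing trixels" list.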

Polygonization

The rendering strategy of Fez is to draw the bounding surfaces of each trile as a triangle list and ignore everything that’s inside a trile. To do that, we must first isolate the contiguous surfaces of the trile, and then split them into triangles.

To extract surfaces, I assume that all of the actions from the initial filled trile are incremental. By that I mean that all you can do in Fezzer when sculpting a trile is remove a trixel or add a trixel, and each of these operations creates, destroys or invalidates surfaces; but each is treated separately and acts on the current state.
…I was going to explain the algorithm in detail, but I feel like it’d be just confusing and unnecessary. So, exercise to the reader. :)

Whenever a surface is modified/created, another pass tries to find the smallest number of the biggest possible rectangles inside that surface. It’s the same thing as with the missing trixel boxes, but in 2D this time (also a lot easier to do right!). My algorithm traverses the surface from its center in an outward spiral and marks all the cells that form a rectangle; the remaining cells are traversed later, recursively until the whole surface is covered with rectangles.
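The spiral traversal itself is not shown here. As a simpler stand-in, the Python sketch below covers a surface with rectangles using a greedy scan: start at the first uncovered cell, grow right, then grow down. It is a different and less optimal strategy than the one described above, but it produces the same kind of output.

```python
def decompose(cells):
    """Cover a set of (x, y) cells with non-overlapping rectangles.

    Greedy row-major scan, not the center-out spiral described above.
    Returns a list of (x, y, width, height) tuples.
    """
    remaining = set(cells)
    rectangles = []
    while remaining:
        # First uncovered cell in row-major order.
        x0, y0 = min(remaining, key=lambda c: (c[1], c[0]))
        width = 1
        while (x0 + width, y0) in remaining:
            width += 1
        height = 1
        while all((x0 + dx, y0 + height) in remaining for dx in range(width)):
            height += 1
        for dy in range(height):
            for dx in range(width):
                remaining.remove((x0 + dx, y0 + dy))
        rectangles.append((x0, y0, width, height))
    return rectangles


l_shape = {(0, 0), (1, 0), (2, 0), (0, 1)}
rectangles = decompose(l_shape)
```

A 2x2 square comes out as a single rectangle, while the L shape above needs two.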

Each rectangle is then a quad that is formed of two triangles, which we can render directly. A vertex pool makes sure that there are no duplicate vertices, and maps the indices appropriately.
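A vertex pool of this kind can be as simple as a dictionary from vertex to index. The Python sketch below works in 2D for brevity and uses made-up names; winding order and vertex attributes are assumptions left to the real renderer.

```python
def build_mesh(rectangles):
    """Turn (x, y, width, height) rectangles into shared vertices + indices."""
    pool = {}       # vertex -> index, so duplicates are emitted only once
    vertices = []
    indices = []

    def index_of(vertex):
        if vertex not in pool:
            pool[vertex] = len(vertices)
            vertices.append(vertex)
        return pool[vertex]

    for x, y, w, h in rectangles:
        a = index_of((x, y))
        b = index_of((x + w, y))
        c = index_of((x + w, y + h))
        d = index_of((x, y + h))
        indices += [a, b, c, a, c, d]   # two triangles per quad
    return vertices, indices


# Two adjacent unit quads share an edge, so only 6 vertices are needed.
vertices, indices = build_mesh([(0, 0, 1, 1), (1, 0, 1, 1)])
```

Without the pool the same two quads would take 8 vertices; the saving grows with every shared corner.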

Below is a visualization of the rectangular surface parts of the same trile, in filled geometry mode and in wireframe.

Mass rendering

So that’s a single trile. A level is formed of a ton of triles, we don’t want to draw everything at all times. How do we manage that?
And rendering a lot of small objects in sequence is not very GPU-friendly, so how do we group batches efficiently?

The answer to the first question is efficient culling. Since the world is in essence formed of grid-aligned tiles, it’s easy to find which are visible in the screen or not, and render selectively. But Fez also works in the third dimension, and the depth range can be really big, so we need to find which triles we can skip rendering if they’re behind another trile. Simple enough, traverse to the first visible trile for each screen-space tile, and render this one only. But some triles can be flagged as see-through and let the traversing continue until we hit a non-seethrough trile or the level’s boundaries.
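For a single screen-space tile, that traversal can be sketched like this in Python. The Trile class and its see_through flag are hypothetical names for the example; the column is assumed to be ordered front to back.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trile:
    name: str
    see_through: bool = False


def triles_to_render(column):
    """column: cells of one screen-space tile, front to back (None = empty)."""
    visible = []
    for trile in column:
        if trile is None:
            continue
        visible.append(trile)
        if not trile.see_through:
            break  # everything behind an opaque trile is hidden
    return visible


column = [None, Trile("vines", see_through=True), Trile("stone"), Trile("dirt")]
rendered = [trile.name for trile in triles_to_render(column)]
```

The vines and the stone behind them get drawn; the dirt further back is skipped.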

For tilted isometric views, a similar culling algorithm is used, where we try to find the triles with no neighbours on the faces that the camera can see. In perspective view, it’s a lot harder to cull without occlusion queries, which I didn’t want to get into… But 99% of the game is played from an isometric perspective so it’s no big deal.

Here’s a scene from the GDC ’09 trailer in a 2D view that you’d play in, and how it’s culled. The world extends in the third dimension, but we only need to see its shell!

The second thing is batching.

In the very first XNA version of Fez, I tried to call DrawUserIndexedPrimitives repeatedly for each trile in the world and hope for the best. It was unplayable, because as it turns out there is considerable overhead to draw calls and doing fewer, bigger draw calls is the key to 3D performance.

Each level has a trile set that contains a restricted number of different trile templates, and the level indexes integer 3D grid positions with elements of that trile set. Trile instances have a very limited set of properties to themselves (like rotation and offset) but every instance of a template trile shares geometry, texture and collision information. So I felt that geometry instancing was the way to go for batch rendering.

Different hardware supports different flavours of instancing, but shader instancing is the common baseline for all Shader Model 2 and above GPUs, so I went with that. My current implementation of SM2 instancing supports 237 instances per batch, and the instance information is stored as vertex shader constants. The number will probably go down if I need to add more information to individual instances, since it’s really minimalistic right now. But I found instancing to provide excellent performance on older hardware, current-gen GPUs and consoles, and it’s not too hard to get working well.
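On the CPU side, the grouping step might be sketched as follows. This is Python and only illustrates the bookkeeping: instances are bucketed by trile template, then each bucket is cut into chunks of at most 237, one chunk per draw call. The shape of the per-instance data and the function name are assumptions, and uploading the data as shader constants is left out entirely.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple

MAX_INSTANCES_PER_BATCH = 237  # the per-batch limit quoted above


def build_batches(
    instances: Iterable[Tuple[Hashable, tuple]],
    max_per_batch: int = MAX_INSTANCES_PER_BATCH,
) -> List[Tuple[Hashable, List[tuple]]]:
    """Group (template_id, instance_data) pairs into per-template batches."""
    by_template: Dict[Hashable, List[tuple]] = defaultdict(list)
    for template_id, instance_data in instances:
        by_template[template_id].append(instance_data)

    batches: List[Tuple[Hashable, List[tuple]]] = []
    for template_id, data in by_template.items():
        # Each chunk shares one template's geometry and texture: one draw call.
        for start in range(0, len(data), max_per_batch):
            batches.append((template_id, data[start:start + max_per_batch]))
    return batches


# 500 grass triles and 10 crates, each instance as a hypothetical (x, y, z, rotation).
instances = [("grass", (i, 0, 0, 0)) for i in range(500)]
instances += [("crate", (i, 1, 0, 0)) for i in range(10)]
batches = build_batches(instances)
batch_sizes = [(template, len(data)) for template, data in batches]
```

Here 510 individual draw calls collapse into 4 batches (237 + 237 + 26 grass triles, plus 10 crates).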

That’s it! Hope you enjoyed the tour. :)
Any questions?

Fez Trailer Zwei

Here’s the new trailer we (Polytron) released last week at GDC, in case you missed it!

Fun fact about the trailer : it’s all realtime. The mini-levels all link to themselves, but to get maximum smoothness and music sync, we recorded the parts separately and edited them back together.

Behind Fez : Trixels (and why we don’t just say voxels)

A much more detailed follow-up to this post can be found here.

Alright, here are a couple of explanations about the rendering technology behind Fez, what we call trixels.

Some people on deviantART and the TIGS blog post have pointed out how these are pretty much just voxels, but with a trendy name. As the lead programmer, I beg to differ… a bit.

First, everything is rendered in 3D, at all times. The 2D views are just orthographic (a.k.a. isometric) views of the world from one direction or another. Since the Z component disappears, the character considers the whole world as 2D and can go from very far objects to very close objects without distinction.

Each visible pixel-art tile that you see while playing the game in 2D view is part of a 3D cube, which we call a trile. Each trile is a 16x16x16 volume which is formed of 4096 potential trixels. Obviously, not all trixels are rendered, else it would be incredibly slow… so only the border trixels are considered. But in the data storage, it’s basically a 3D presence array which tells the renderer if a trixel is present/on, or absent/off.
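As a rough illustration of “only the border trixels are considered”, here is a Python sketch that scans a presence array and keeps the trixels with at least one empty (or out-of-bounds) neighbour. The nested-list layout and the function names are assumptions made for this example, not how the engine stores its data.

```python
from typing import List, Tuple

PresenceArray = List[List[List[bool]]]  # indexed as presence[x][y][z]

NEIGHBOUR_OFFSETS = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def full_trile(size: int = 16) -> PresenceArray:
    """A trile with every trixel switched on."""
    return [[[True] * size for _ in range(size)] for _ in range(size)]


def border_trixels(presence: PresenceArray, size: int) -> List[Tuple[int, int, int]]:
    """Return the present trixels that touch an absent cell or the trile's edge."""

    def is_on(x: int, y: int, z: int) -> bool:
        inside = 0 <= x < size and 0 <= y < size and 0 <= z < size
        return inside and presence[x][y][z]

    border = []
    for x in range(size):
        for y in range(size):
            for z in range(size):
                if not presence[x][y][z]:
                    continue
                if any(not is_on(x + dx, y + dy, z + dz) for dx, dy, dz in NEIGHBOUR_OFFSETS):
                    border.append((x, y, z))
    return border


solid = full_trile(16)
shell = border_trixels(solid, 16)
```

For a completely full 16x16x16 trile, the 14x14x14 interior is skipped, which leaves 4096 - 2744 = 1352 border trixels to consider.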

Up to now, I could’ve called them voxels and it wouldn’t have made any difference… but when it comes to rendering, we want every 2D side of the trile to look like believable pixel art, so it needs to be made of smaller cubes. Standard voxel triangulation is complicated because it wants to look as close to the initial (curved, organic) shape as possible… but we don’t! We want that pixelated, 8-bit look.

So we make assumptions. And that allows very intuitive polygon reduction and culling algorithms, allowing pretty good detail on these triles.

As for the texturing, cubemaps are used, which links trixels to pixels even more. Each pixel of the cubemap (so each visible 2D pixel) ends up as a trixel of the trile.

So there you have it. Trixels are voxels, but with some special properties, a special (simpler) triangulation algorithm, and in a pretty special game. :)

The pretty pictures :

1st : Pretty ugly trile with sculpted trixels in “fezzer”, the game content editor.
2nd : Wireframe version of that trile, computed in realtime; notice how few polygons are needed and used.
3rd : A scene, rendered with the game engine.

Fez teaser trailer!

Let’s get awesome!

Fez is finally ready.

I’ve been working (my ass off and compromising my job and social life) on this game called Fez for a couple of months, and 5 hours from the Independent Games Festival entry submission deadline, it’s finally OVER!

Well, the demo. But it’s a full level, with dialogues, collectibles, sound, music and a pretty full demonstration of the game’s concept… which I can’t show too much of right now, especially the concept and what’s original about it, but here’s some in-game material that I can release. It’s a screenshot, taken directly from the game, no mocking-up here.
