Gaussian Blur Experiments

A follow-up to this article with clarifications and corrections to the “real-world considerations” can be found here.

I researched gaussian blur while trying to smooth my Variance Shadow Maps (for the Shadow Mapping sample) and made a pretty handy reference that some might like… I figure I’d post it for my first “Tips” blog post. :)

The full article contains a TV3D 6.5 sample with optimized Gaussian Blur and Downsampling shaders, and shows how to use them properly in TV3D. The article also contains an Excel reference sheet on how to calculate gaussian weights.

Update : I added a section about tap weight maximization (which gives an equal luminance to all blur modes) and optimal standard deviation calculation.

Continue reading Gaussian Blur Experiments

Shadow Mapping Experiments

I cleaned up this entry because it’s still getting traffic and didn’t really make sense as it was. I am not working on this anymore, haven’t been for months.

Description

My goal with this sample is to make a fast, simple, easy-to-use and good-looking (in other words, utopic :P) shadow mapping implementation in TV3D 6.5 with a directional light. It should work on landscape and meshes, perhaps on actors too but I don’t know how to rewrite an actor shader (animation blending = ouch).

I have already released a VB.Net implementation of Landscape Shadow Mapping on the TV3D forums. I might rewrite it in C# and my current programming standards and release it on my blog, but for now I’m not satisfied with the code enough to give it publicity…

Implementation Details

There are currently two modes :

  1. The PCF (Percentage-Closer Filtering) implementation uses a R32F texture for the depth map, which is the fastest format for what’s needed. I hope this is renderable on ATi and ps_2_0 hardware, I’ll have to check on this.
    I read on a couple of sites that on nVidia hardware, using D24X8/D24S8 textures gave free 4×4 PCF with bilinear filtering… but damnit, I can’t create a rendersurface with this mode! Sylvain said he’d look into this.
  2. The VSM (Variance Shadow Maps) implementation uses a A16B16G16R16F (aka HDR_FLOAT_16) texture for the depth map, because it allows for filtering on my GeForce 7000 series hardware :)
    I considered using G16R16F, which is faster and has a smaller memory footprint, but looks like TV3D won’t allow me to use it as a render target. G32R32F and A32B32G32R32F work and are more precise, but they don’t allow filtering; so it looks like unfiltered PCF, unless it’s post-bilinear-filtered (which I do support).

For the depth map rendering, I use two cameras : one for the actual viewport, and an orthogonal (aka isometric) one for the light. Its look-at vector is the same as light direction, and its zoom and position are calculated using bounding boxes vector projection. Then when rendering the depth map in the shader, I use the WORLDVIEWPROJECTION semantic, which gives me exactly what I want. I found that to be the cleanest way.

Comparison Screenshots

Now here is the promised ton of screenshots.

  1. No filtering
  2. No filtering (downsampled 4x)
  3. 3×3 PCF (Percentage-Closer Filtering)

  4. 3×3 PCF with Jitter Sampling

  5. 3×3 PCF with Bilinear Filtering
  6. 4×4 PCF

  7. 4×4 PCF with Jitter Sampling
  8. 4×4 PCF with Bilinear Filtering
  9. 4×4 PCF with Bilinear Filtering (Downsampled 4x)

  10. 7×7 Gaussian Blurred 16-bit VSM (Variance Shadow Maps)
  11. 5×5 Gaussian Blurred 16-bit VSM (Downsampled 4x)

  12. 5×5 Gaussian Blurred 32-bit VSM (Downsampled 4x)

Notice the framerate, it’s a fairly good indicator of the performance of each technique for a very small frame buffer resolution. See the table below for a “real-world” performance evaluation.

About Downsampling, it looks like it has a pretty hard effect on the framerate, but the quality is comparable to a 4x resolution shadowmap. And, something that can’t be seen in the above screenshots, the framerate is much more constant when the viewport is larger. I’ll make a proper benchmark to show how those techniques compare in the real world.

A word on VSM, the last shot. It’s very fast, it’s smooth, but… it looks odd.
Notice the gradient, as if light was bleeding from underneath the wall. And the separation between the armchair shadow and the wall shadow! Those are some of the artifacts I was talking about. The main problem is precision; I’m using a 16-bit floating-point format because it gives free hardware filtering (up to anisotropic), but it also causes precision issues.
32-bit versus 16-bit might look identical, but there are other artifacts in the scene like on straight walls or on the floor that are present in the 16-bit version but not in the 32-bit version. The shadow also looks somewhat sharper.

Now about jitter sampling. After seeing that, one would say “Jitter sampling is useless! Bilinear filtering looks tons better.”
But jitter sampling has an advantage that bilinear does not have : it can be scaled. Here are some more shots.

  1. 4×4 PCF with Scaled Jitter Sampling, 4x resolution shadowmap

  2. 4×4 PCF with Bilinear Filtering, 4x resolution shadowmap

Here, bilinear filtering just looks like antialiasing. It looks kinda good, but the shadows are very hard. Scaled jitter sampling on the other hand, has very soft shadows! And they’re relatively faster, for the same shadowmap size.

Mesh Self-Shadowmapping Performance

I’ve decided to make a 1024×768 fullscreen benchmark of all good-looking techniques. Here are the results :

Depth map sizeTechniqueExtensionsFramerate
512×512PCF 4×4Bilinear Filtering111 fps
1024×1024PCF 4×4Bilinear Filtering106 fps
1024×1024PCF 4×4Downsampled 4x, Bilinear Filtering97 fps
2048×2048PCF 4×4Jittered122 fps
512×512VSM 5×516-bit, Hardware Filtered210 fps
512×512VSM 5×532-bit, Bilinear Filtered117 fps
1024×1024VSM 7×716-bit, Hardware Filtered75 fps
1024×1024VSM 5×5Downsampled 4x, 16-bit, Hardware Filtered169 fps
1024×1024VSM 5×5Downsampled 4x, 32-bit, Bilinear Filtered95 fps

All in all, those are comparable techniques in terms of visual quality. What I take from these results :

  • PCF is viewport size-dependant, not depth map size-dependant; so getting better visual quality by pumping up the depth map size is possible. On the other hand, downsampling to gain performance is futile, and doesn’t help the visual quality much either.
  • Jittered sampling is very fast and works well with very high-resolution depth maps.
  • VSM is very much depth map size-dependant, which means that it benifits alot from downsampling.
  • 16-bit VSM on nVidia hardware is VERY fast, but when you go earlier than the 6000 series or on ATi hardware, you have to bilinear-filter it yourself… and it’s slow. Also, 16-bit has a lot of precision issues, I have to work on those…
  • 32-bit VSM is slow, but usable when the depth map is kept pretty small (512×512). It also looks very good.

Landscape Shadowmapping Notes

Some remarks though about the choices I made for filtering in a multi-pass context :

  • VSM is surprisingly slow in a multi-pass algorithm, so barely applicable to landscape shadowmapping. The precision artifacts just aren’t worth it.
  • Unfiltered modes look surprisingly good because the shadowmap itself is hardware-filtered so it doesn’t need to be constructed smooth. That renders useless all the work I’ve put in making bilinear filtering possible at every level! I find 512×512 unfiltered to look great for “semi-hard shadows”.
  • Jitter sampling looks great, has the advantage of being costless at high-resolution shadowmaps, and gets filtered — something that was impossible to do without multi-pass. It’s the clear winner here in the “soft shadows” category.

Bonus (jittered) teapot. :)

Soft Stencil Shadows

Downloads

SoftStencils.rar [776Kb] – C# 2.0 (VS.NET 2005)

Description

This demo demonstrates a technique that fellow forum user kTech proposed in the Beta Discussion forum to smooth otherwise hard stencil shadows. It uses screen-space blur of the shadow pass to smoothen things out.

Continue reading Soft Stencil Shadows

Double-Sided Bumped Refraction

Downloads

Both downloads are Visual Basic.NET 2005 projects.

Refraction.rar [408Kb] – Simpler first version
Refraction_v2.rar [665Kb] – Second version, now with backface rendering (double-sided refraction) and more test models

Description

A specular-capable, normal-mapped, double-sided refraction shader that can be applied to pretty much any TVMesh, originally asked by forum user WEst.

Continue reading Double-Sided Bumped Refraction

Non-Reflective Water

Downloads

NonReflectiveWater.rar [7.9Mb] – C# 2.0 (VS.NET 2005)

Description

I had a request from a MMORPG developer to make a lightweight water shader that doesn’t need a reflection rendersurface, yet looks acceptable.
It looks nowhere as shiny as the reflective water shader or even TV’s built-in water, but it runs a zillion times faster because it doesn’t do any parallaxing, nor specular bumpmapping, nor perspective projection… Its implementation is alot simpler as well.

Continue reading Non-Reflective Water

Landscape Height-Based Coloring

Downloads

HeightBasedLandscape.rar [152Kb] – C# 2.0 (VS.NET 2005)

Description

A über-simple landscape shader that maps a color ramp to a landscape’s height. Basically, it demonstrates that shaders on landscapes is possible, and can be a nice addition to a visual landscape editor.

Continue reading Landscape Height-Based Coloring

The Blue Planet

Downloads

HLSLBluePlanet.rar [8.9Mb] – C# 2.0 (VS.NET 2005)

Description

This demo is the remake of an old 6.2 demo I had made to test materials and lighting. I was always decieved by that lack of proper bump-mapping in 6.2… so I remade it in 6.5 with custom shaders, very high-resolution textures and normal-maps, and even the moon!

Continue reading The Blue Planet

AGT’s Third Person Demo – C# 2.0 Port

Downloads

AGT3rdPersonDemo.rar [10Mb]

Description

This demo is a port of AGT’s Third Person Demo that was posted on the TV3D Beta Discussion forum. The original version was VB6, I just took the same media and code structure and adapted it to C# 2.0 (Visual Studio.NET 2005).

Continue reading AGT’s Third Person Demo – C# 2.0 Port