Thursday, November 7, 2013

Visual Studio 2013 / Demo Skeleton Programming

I updated my demo skeleton in the Google Code repository. It now uses Visual Studio 2013, which partially supports C99 and can therefore compile the code. I updated the compute shader code a bit and upgraded Crinkler to version 1.4. The compute shader example now also compiles the shader into a header file, and Crinkler then compresses this file as part of the data compression. Overall it now packs to 2,955 bytes.

https://code.google.com/p/graphicsdemoskeleton/

If you have fun with this code, let me know ... :-)

Monday, September 30, 2013

Call for a new Post-Processing Pipeline - KGC 2013 talk

This is the text version of my talk at KGC 2013.
The main motivation for the talk was the idea of looking for fundamental changes that can bring a modern Post-Processing Pipeline to the next level.
Let's look first into the short history of Post-Processing Pipelines, where we are at the moment and where we might be going in the near future.

History
Probably one of the first Post-Processing Pipelines appeared in the DirectX SDK around 2004. It was a first attempt to implement HDR rendering. I believe from there on we called a collection of image space effects at the end of the rendering pipeline a Post-Processing Pipeline.
The idea was to re-use resources like render targets and data with as many image space effects as possible in a Post-Processing Pipeline. 
A typical collection of screen-space effects was:
  • Tone-mapping + HDR rendering: the tone-mapper can be considered a dynamic contrast operator 
  • Camera effects like Depth of Field with shaped Bokeh, Motion Blur, lens flare etc..
  • Full-screen color filters like contrast, saturation, color additions and multiplications etc..
One of the first comprehensive coverages of a whole collection of effects in a Post-Processing Pipeline running on XBOX 360 / PS3 was done in [Engel2007].
Since then numerous new tone mapping operators were introduced [Day2012] and more advanced Depth of Field algorithms with shaped Bokeh were covered, but there was no fundamental change to the concept of the pipeline.


Call for a new Post-Processing Pipeline
Let's start with the color space: RGB is not a good color space for a post-processing pipeline. It is well known that luminance variation is more important than color variation, so it makes sense to pick a color space that holds luminance in one of its channels. With 11:11:10 render targets it would be ideal to store luminance in one of the 11-bit channels. Having luminance available in the pipeline without having to go through color conversions opens up many new possibilities, a few of which we will cover below.

Global tone mapping operators didn't work out well in practice. We looked at numerous engines over the last four years, and a common decision by artists was to limit the luminance values by clamping them. One reason was that the textures didn't provide enough quality to survive a "light adaptation" without blowing out; often most of their resolution was in the low-end greyscale values, so there just wasn't enough resolution to mimic light adaptation.
Another reason was that the available resolution in the rendering pipeline with the RGB color space was not sufficient. A further reason is the fact that we limited ourselves to global tone mapping operators, because local tone mapping operators are considered too expensive.

A fixed global gamma adjustment at the end of the pipeline is partially doing "the same thing" as the tone mapping operator: it applies a contrast curve and might counteract what the tone-mapper already did.
So the combination of a tone-mapping operator and the commonly used hardware gamma correction, both of which are global, is odd.

On a lighter note, a new Post-Processing Pipeline can add more stages. In the last couple of years, screen-space ambient occlusion, screen-space skin and screen-space reflections for dynamic objects became popular. Adding those to the Post-Processing Pipeline while trying to re-use existing resources needs to be considered in the architecture of the pipeline.

Last, the Post-Processing Pipeline is one of the best targets for the new compute capabilities of GPUs. Saving memory bandwidth by merging "render target blits" and re-factoring blur kernels for thread group shared memory (GSM) are considerations not covered further in the following text, but they are the most obvious design decisions.

Let's start by looking at an old Post-Processing Pipeline design. This is an overview I used in 2007:

A Post-Processing Pipeline Overview from 2007

A few notes on this pipeline. The tone mapping operation happens in two places: at the "final" stage for tone-mapping the final result, and in the bright-pass filter for tone mapping the values before they can be considered "bright".
The "right" way to apply tone mapping, independent of the tone mapping operator you choose, is to convert into a color space that exposes luminance, apply the tone mapper to luminance and then convert back to RGB. In other words: you had to convert between RGB and a different color space back and forth twice.
In some pipelines it was decided that this is a bit much and the tone mapper was applied to the RGB value directly. Tone mapping an RGB value with a luminance contrast operator led to "interesting" results.
Obviously this overview doesn't cover the latest Depth of Field effects with shaped Bokeh and separated near and far field Circle of Confusion calculations; nevertheless it already shows a large number of render-target to render-target blits that can be merged with compute support.

All modern rendering pipelines calculate color values in linear space: every texture that is loaded is converted into linear space by the hardware, then all the color operations like lighting, shadowing and post-processing are applied, and at the end the color values are converted back by applying the gamma curve.
This separate gamma control sits at the end of the pipeline, after tone mapping and color filters, because the GPU hardware can apply a global gamma correction to the image after everything is rendered.
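As a reminder of where that global step sits, here is a minimal sketch of the conventional end-of-pipe gamma correction in shader form, assuming the common 2.2 exponent (the hardware normally does this for us via its gamma ramp or an sRGB write):

// A minimal sketch of the conventional global gamma step at the very end
// of the pipeline; the 2.2 exponent is an assumption, and the hardware
// would usually apply this itself via its gamma ramp / sRGB conversion.
float3 ApplyGlobalGamma(float3 linearColor)
{
    return pow(saturate(linearColor), 1.0f / 2.2f);
}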

The following paragraphs will cover some of the ideas we had to improve a Post-Processing Pipeline on a fundamental level. We implemented them into our Post-Processing Pipeline PixelPuzzle. Some of the research activities like finally replacing the "global tone mapping concept" with a better way of calculating contrast and color will have to wait for a future column.

Yxy Color Space
The first step to change a Post-Processing Pipeline in a fundamental way is to switch it to a different color space. Instead of running it in RGB we decided to use CIE Yxy through the whole pipeline. That means we convert RGB into Yxy at the beginning of the pipeline and convert back to RGB at the end. In-between all operations run on Yxy.
With CIE Yxy, the Y channel holds the luminance value. With an 11:11:10 render target, the Y channel will have 11 bits of resolution.
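Here is a minimal sketch of the conversions at both ends of the pipeline, assuming linear sRGB / Rec. 709 primaries with a D65 white point (the matrices below are the standard ones for that case, not necessarily the exact ones used in PixelPuzzle):

// RGB -> CIE XYZ -> CIE Yxy and back; Y is stored in the first channel so
// it can live in one of the 11-bit channels of an 11:11:10 render target.
float3 RGBToYxy(float3 rgb)
{
    // Linear sRGB / Rec. 709 to CIE XYZ (D65)
    const float3x3 RGBToXYZ = float3x3(
        0.4124f, 0.3576f, 0.1805f,
        0.2126f, 0.7152f, 0.0722f,
        0.0193f, 0.1192f, 0.9505f);
    float3 XYZ = mul(RGBToXYZ, rgb);

    // CIE XYZ -> Yxy: Y carries luminance, x and y carry chromaticity.
    float invSum = 1.0f / max(XYZ.x + XYZ.y + XYZ.z, 1e-6f);
    return float3(XYZ.y, XYZ.x * invSum, XYZ.y * invSum);
}

float3 YxyToRGB(float3 Yxy)
{
    // Yxy -> CIE XYZ
    float3 XYZ;
    XYZ.y = Yxy.x;
    XYZ.x = Yxy.x * Yxy.y / max(Yxy.z, 1e-6f);
    XYZ.z = Yxy.x * (1.0f - Yxy.y - Yxy.z) / max(Yxy.z, 1e-6f);

    // CIE XYZ -> linear sRGB / Rec. 709
    const float3x3 XYZToRGB = float3x3(
         3.2406f, -1.5372f, -0.4986f,
        -0.9689f,  1.8758f,  0.0415f,
         0.0557f, -0.2040f,  1.0570f);
    return mul(XYZToRGB, XYZ);
}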

Instead of converting RGB to Yxy and back each time for the final tone mapping and the bright-pass stage, running the whole pipeline in Yxy means the conversion to Yxy is only done once, and the conversion back to RGB once or twice.
Tone mapping then still happens on the Y channel in the same way it happened before. Confetti's PostFX pipeline offers eight different tone mapping operators and each of them works well in this setup.
One side effect of using Yxy is that you can run the bright-pass filter as a one-channel operation, which saves some cycles on modern scalar GPUs.
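Since only the luminance channel matters for the bright-pass, it becomes a one-channel operation. A hedged sketch, with BrightPassThreshold as an assumed tuning constant:

// One-channel bright-pass in Yxy: only Y (luminance) is thresholded,
// the chromaticity channels pass through untouched.
float3 BrightPassYxy(float3 Yxy, float BrightPassThreshold)
{
    Yxy.x = max(Yxy.x - BrightPassThreshold, 0.0f);
    return Yxy;
}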

One other thing that Yxy allows you to do is to treat the occlusion term from Screen-Space Ambient Occlusion as an input to the Y channel. You can mix in this term and use it in interesting ways. Similar ideas apply to any other occlusion term that your pipeline might be able to use.
The choice of CIE Yxy was somewhat arbitrary. In 2007 I evaluated several different color spaces and we ended up with Yxy at the time. Here is my old table:

Pick a Color Space Table from 2007

Compared to CIE Yxy, HSV doesn't easily allow running a blur filter kernel. The target was to leave the pipeline as unchanged as possible when picking a color space, and with Yxy all the common Depth of Field algorithms and any other blur kernel run unchanged. The HSV conversions also appear to be more expensive than RGB -> CIE XYZ -> CIE Yxy and back.
There might be other color spaces similarly tailored to the task.


Dynamic Local Gamma
As mentioned above, the fact that we apply a tone mapping operator and then later on a global gamma operator appears to be a bit odd. Here is what the hardware is supposed to do when it applies the gamma "correction".

Gamma Correction
The main take-away from this curve is that the same curve is applied to every pixel on screen. In other words: this curve emphasizes dark areas regardless of whether the pixel is very bright or very dark.
Whatever curve the tone-mapper applies, the gamma correction might be counteracting it.

It appears to be a better idea to move the gamma correction closer to the tone mapper, making it part of the tone mapper and at the same time applying gamma locally per pixel.
In fact, gamma correction is considered to depend on the light adaptation level of the human visual system. The "gamma correction" that is applied by the eye changes the perceived luminance based on the eye's adaptation level [Bartleson 1967] [Kwon 2011].
When the eye is adapted to dark lighting conditions, the exponent for the gamma correction is supposed to increase. If the eye is adapted to bright lighting conditions, the exponent for the gamma correction is supposed to decrease. This is shown in the following image taken from [Bartleson 1967]:
Changes in Relative Brightness Contrast [Bartleson 1967]

A local gamma value can vary with the eye's adaptation level. The equation that adjusts the gamma correction to the current adaptation level of the eye can be found in [Kwon 2011]:

γ_v = 0.444 + 0.045 ln(L_an + 0.6034)

For this presentation, this equation was taken from the paper by Kwon et al. Depending on the type of game, there is an opportunity to build your own local gamma operator.
The input luminance value is generated by the tone mapping operator and then stored in the Y channel of the Yxy color space:

Y_Yxy = L^γ_v

γ_v changes based on the luminance value of the current pixel. That means each pixel's luminance value might be gamma corrected with a different exponent. For the equation above, the exponent value is in the range of 0.421 to 0.465.
Applied Gamma Curve per-pixel based on luminance of pixel (eye's adaptation low -> blue curve, eye's adaptation high -> green curve)

L^γ_v works with any tone mapping operator. L is the luminance value coming from the tone mapping operator.
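Here is a minimal sketch of that step, assuming the tone-mapped luminance sits in the Y channel of Yxy and that the adaptation luminance L_an comes from the pipeline's adaptation / average-luminance pass (whether you feed it the per-pixel or a filtered adapted value is a design choice):

// Dynamic local gamma per [Kwon 2011]: the exponent is derived from the
// adaptation luminance L_an and applied per pixel to the tone-mapped
// luminance stored in the Y channel.
float DynamicLocalGammaY(float toneMappedY, float adaptationLuminance)
{
    // gamma_v = 0.444 + 0.045 * ln(L_an + 0.6034)
    float gammaV = 0.444f + 0.045f * log(adaptationLuminance + 0.6034f);

    // Y_Yxy = L^gamma_v
    return pow(max(toneMappedY, 0.0f), gammaV);
}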
With a dynamic local gamma value, the dynamic lighting and shadowing information introduced in the pipeline is taken into account for the gamma correction. The transition from bright areas to dark areas appears more natural. Textures hold up better to the challenges of light adaptation. Overall, lights and shadows look better.


Depth of Field
As a proof-of-concept of the Yxy color space and the local dynamic gamma correction, this section shows screenshots of a modern Depth of Field implementation with separated near and far field calculations and a shaped Bokeh, implemented in compute.

Producing an image through a lens leads to a "spot" that will vary in size depending on the position of the original point in the scene:
Circle of Confusion (image taken from Wikipedia) 


The Depth of Field is the region where the CoC is smaller than the resolution of the human eye (or, in our case, the resolution of our display medium). The equation for calculating the CoC can be found in [Potmesil1981]; a hedged sketch of the standard thin-lens form follows the list of controls below.

Following the variables in this equation, Confetti demonstrated the following controls in a demo at GDC 2011 [Alling2011]:
  • F-stop - ratio of focal length to aperture size
  • Focal length – distance from lens to image in focus
  • Focus distance – distance to plane in focus
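A hedged sketch of the CoC built from exactly these three controls, using the standard thin-lens relation (units are assumed to be consistent, e.g. meters; this is not necessarily the exact form used in the GDC demo):

// Signed Circle of Confusion from the thin-lens model: negative in the
// far field, positive in the near field, which matches the separation
// described in the next paragraph.
float CircleOfConfusion(float objectDistance, // scene depth of the pixel
                        float focusDistance,  // distance to the plane in focus
                        float focalLength,    // focal length of the lens
                        float fStop)          // F-stop (focal length / aperture size)
{
    float aperture = focalLength / fStop; // aperture diameter

    return aperture * (focalLength / (focusDistance - focalLength)) *
           (focusDistance - objectDistance) / objectDistance;
}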
Because the CoC is negative for the far field and positive for the near field, separate results are commonly generated for the near field and far field of the effect [Sousa13].
Usually the CoC is calculated for each pixel in a down-sampled buffer or texture, and then the near and far field results are generated. The far and focus field results are combined first, and this result is then combined with the near field based on a near field coverage value (a sketch of this combine follows the screenshots below). The following screenshots show the result of those steps, with the first screenshot showing the near and far field calculations:

Red = max CoC (near field CoC)
Green = min CoC (far field CoC)

Here is a screenshot of the far field result in Yxy:

Far field result in Yxy

Here is a screenshot of the near field result in Yxy:
Near field result in Yxy

Here is a screenshot of the resulting image after it was converted back to RGB:
Resulting Image in RGB
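A minimal sketch of the combine step described above, assuming the far and near field buffers were already blurred at reduced resolution and a nearCoverage value was generated alongside the near field:

// Combine order as described above: focus + far field first (driven by
// the far field CoC), then blend the near field on top using its coverage.
float3 CombineDepthOfField(float3 focusColor, float3 farColor, float farCoC,
                           float3 nearColor, float nearCoverage)
{
    float3 farAndFocus = lerp(focusColor, farColor, saturate(farCoC));
    return lerp(farAndFocus, nearColor, saturate(nearCoverage));
}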

Conclusion
A modern Post-Processing Pipeline can benefit greatly from being run in a color space that offers a separable luminance channel. This opens up new opportunities for an efficient implementation of many new effects.
With the long-term goal of removing any global tone mapping from the pipeline, a dynamic local gamma offers more intelligent, per-pixel gamma control and a stronger contrast between bright and dark areas, taking into account all the dynamic additions in the pipeline.
Any future development in the area of Post-Processing Pipelines can be focused on a more intelligent luminance and color harmonization.



References
[Alling2011] Michael Alling, "Post-Processing Pipeline", http://www.conffx.com/GDC2011.zip
[Bartleson 1967] C. J. Bartleson and E. J. Breneman, “Brightness function: Effects of adaptation,” J. Opt. Soc. Am., vol. 57, pp. 953-957, 1967.
[Day2012] Mike Day, “An efficient and user-friendly tone mapping operator”, http://www.insomniacgames.com/mike-day-an-efficient-and-user-friendly-tone-mapping-operator/
[Engel2007] Wolfgang Engel, “Post-Processing Pipeline”, GDC 2007 http://www.coretechniques.info/index_2007.html
[Kwon 2011] Hyuk-Ju Kwon, Sung-Hak Lee, Seok-Min Chae, Kyu-Ik Sohng, “Tone Mapping Algorithm for Luminance Separated HDR Rendering Based on Visual Brightness Function”, online at http://world-comp.org/p2012/IPC3874.pdf
[Potmesil1981] Potmesil M., Chakravarty I. “Synthetic Image Generation with a Lens and Aperture Camera Model”, 1981
[Reinhard] Erik Reinhard, Michael Stark, Peter Shirley, James Ferwerda, "Photographic Tone Reproduction for Digital Images", http://www.cs.utah.edu/~reinhard/cdrom/
[Sousa13] Tiago Sousa, "CryEngine 3 Graphics Gems", SIGGRAPH 2013, http://www.crytek.com/cryengine/presentations/cryengine-3-graphic-gems

Friday, September 13, 2013

KGC 2013

I will be a speaker at the Korean Game Developer Conference this year. This is my third time and I am enjoying it very much.
This year I want to talk about building a next-gen Post-Processing Pipeline. Most people haven't changed their PostFX pipeline algorithms in 6 or 7 years ( ... no, re-writing it in compute doesn't count ... and replacing your Reinhard operator with an approximated Hable operator - check out Insomniac's website - doesn't count either :-) ).

Please come by and say hi if you are around.

Monday, July 29, 2013

TressFX - Crystal Dynamics and AMD cover TressFX on SIGGRAPH

There were more talks about Confetti's work on TressFX at SIGGRAPH. One talk, by Jason Lacroix, was "Adding More Life to Your Characters With TressFX".

Activision's head rendering demo uses TressFX as well: "Digital Ira: High-Resolution Facial Performance Playback".

If you are a registered developer and you need XBOX One or PS4 implementations, send me an e-mail.

Thursday, July 25, 2013

SIGGRAPH 2013

I would like to highlight the talk "Crafting a Next-Gen Material Pipeline for The Order: 1886":

http://blog.selfshadow.com/publications/s2013-shading-course

The 3D Fabric Scanner is a fantastic idea and the results are awesome. Those are next-gen characters. Great work!

Monday, July 22, 2013

Tiled Resources / Partially Resident Textures / MegaTextures

One of the new features of DirectX 11.2, and now OpenGL 4.4, is Tiled Resources. Tiled Resources allow you to manage one large texture in "hardware" tiles and implement a megatexture approach. The advantages of using the hardware for this, compared to the software solution that was used before, are:
- no dependent texture read necessary
- hardware filtering works, including anisotropic filtering
AMD offers an OpenGL extension for this as well and it is available on all newer AMD GPUs. NVIDIA has shown it running with DirectX 11.2 at the BUILD conference. So there is a high chance that it will be available on a large part of the console and PC market soon.
Let's take a step back and look at the challenge a MegaTexture is supposed to solve. In open-world games, we achieve high texture detail with two techniques:
- on-going texture streaming: on a console you keep streaming from physical media all the time. This requires careful preparation of the layout of the physical media and a multi-core/multi-threaded texture streaming pipeline with -for example- priority queues.
- procedural generation of "large" textures: generating a large terrain texture is best done by generating it on the fly. That means stitching a "large" texture together out of smaller textures with one "control texture", which then also requires a dependent texture read (see the sketch below).
The advantage of procedural texture generation is that it doesn't require a lot of "streaming" memory bandwidth, while one large texture or also many small textures eat into the amount of available "streaming" memory bandwidth.
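To make that dependent read concrete, here is a hedged sketch of such a control-texture splat; the resource names and the four-layer limit are assumptions for illustration:

// Stitching a terrain "on the fly" from smaller detail textures via a
// control texture: the detail lookups depend on the result of the first
// lookup, i.e. a dependent texture read.
Texture2D<float4>      ControlTex;  // per-terrain-texel blend weights
Texture2DArray<float4> DetailTex;   // small tiling detail textures
SamplerState           LinearWrap;

float4 ShadeTerrain(float2 terrainUV, float2 detailUV)
{
    float4 weights = ControlTex.Sample(LinearWrap, terrainUV); // first read

    float4 color = 0.0f;
    [unroll]
    for (int i = 0; i < 4; ++i) // dependent reads into the detail layers
        color += weights[i] * DetailTex.Sample(LinearWrap, float3(detailUV, i));
    return color;
}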
Now with a MegaTexture there is the ability to store much more detail in the large texture, but it comes with the streaming cost. If you have an implementation that doesn't generate the terrain texture procedurally on the fly and you have to stream the terrain data anyway, then the streaming cost might be similar to your current solution, so the MegaTexture might be a win here.
The biggest drawback of Partially Resident Textures / MegaTextures seems to be forgotten in the articles that I have seen so far: someone has to generate them. There might need to be an artist who fills a very large texture with a high amount of detail, pixel by pixel. To relieve the workload, a technique called "Stamping" is used. As the name implies, a kind of "stamp" is applied at several places onto the texture. Stamping also means giving up the opportunity to create unique pixels everywhere. In other words, the main advantage of a MegaTexture, offering a huge amount of detail, is counteracted by stamping.
In practice this might lead to a situation where your MegaTexture doesn't hold much detail because artists would have to work a long time to add detail and this would be too expensive. Instead the level of detail that is applied to the texture is reduced to an economically feasible amount.
The overall scenario changes, when data exists that -for example- is generated from satellite images of the earth with high resolution. In that case a MegaTexture solution will offer the best possible quality with less art effort and you can build a workflow that directly gets the pre-generated data and brings it into your preferred format and layout.
For many game teams, the usage of MegaTextures will be too expensive. They can't afford the art time to generate the texture in case they can't rely on existing data.






Tuesday, July 2, 2013

Link Collection

I was looking through some of the links I saved for further reading today.

An article explaining BC compression formats with a lot of detail and clarity can be found here:

Understanding BCn Texture Compression Formats

There is an interesting blog post by Sebastien Sylvan. He writes about R-trees, a data structure that allows you, for example, to do spatial indexing of objects in your game.

A Random Walk Through Geek-Space

He also has other cool articles on hash maps and vector replacements.

We still need desktop PCs in the office to swap discrete GPUs whenever we need to. Because we also need them to be as portable as possible, we decided to build the following setup ourselves:

Maximum PC

So far we have built two and they work well.

For Blackfoot Blade, we worked with a composer in Finland. I love the music he made and I wanted to share his website here:

TAPANI SIIRTOLA

Our friends at Bitsquid released a useful open-source library:

foundation

I quote from the description of the library's design:

Library Design

foundation has been written with data-oriented programming in mind (POD data is preferred over complicated classes, flat arrays are the preferred data structure, etc). foundation is written in a "back-to-C" style of C++ programming. This means that there is a clear separation between data and code. Data definitions are found in _types.h header files. Function definitions are found in .h header files.

If you haven't found the DirectXTex texture library you need to check it out at

DirectXTex

MVP Award 2013

Yesterday Microsoft awarded me an MVP award for Visual C++. Now that DirectX is part of Visual C++, I was moved into the Visual C++ category. I am super proud of that, especially now that Visual C++ finally gets C99 support :-)

Sunday, June 30, 2013

Google Console / Visual Studio 2013 will support C99

Google making a console is an interesting news item. Like Apple they can utilize standard mobile phone parts and extend Android to support controllers.
What does it take to make this work:

1. High-end good looking apps: there is no need to have a fallback rendering path, so you can optimize until the last cycle
2. Dedicated section in the app store to highlight the controller-capable apps
3. The NDK needs to be better supported: I mentioned it here in the past, it is good that the NDK exists. This is the most important basic requirement to get existing tech to Android phones ...
4. A good controller with good support goes a long way ...

In other news, Visual Studio 2013 will finally support C99. This is something I have always wished for, not only because C99 is a perfect game development language and mighty portable, but also because open-source projects quite often favor C99 ... so now we can finally move our code base from C++ to C99 and it will still compile in a C++ environment like Visual Studio. For people who write engine code that is cross-platform or shared between teams, this is good news ...

http://arstechnica.com/information-technology/2013/06/c99-acknowledged-at-last-as-microsoft-lays-out-its-path-to-c14/


Monday, June 24, 2013

Lighting a Game / Lighting Artists / Physically / Observational Lighting models / Bounce Lighting

Here is a way modern game engines can light scenes; I was just describing this in a forum post. The idea is to follow what the CG movie industry is doing: placing real-time lights happens similar to CG movies. In CG movies there are thousands of lights; in games we now have dozens or, most of the time, 100+ lights in a scene. Compared to switching from observational to physical lighting models, this makes the biggest difference. Each of those lights can also cast bounce lighting, which is another switch for the artist. So in essence artists can place lots of lights, switch shadows on / off per light and switch real-time GI on / off per light. The light / shadow part was already possible on XBOX 360 / PS3, but on the next-gen consoles we also get bounce lighting per light. That gives lighting artists a wide range of options to light scenes.

A lighting setup like this would be overkill in a game like Blackfoot Blade, where you fly a helicopter high above the ground. There we have only a few dozen real-time lights on screen without shadows (each rocket casts a light, as do the machine gun of the helicopter and even the projectiles from the tanks and the flares). The game also runs on tablets. In any ground-based game like an open-world game, lots of lights make a huge difference for lighting corners and the environment. It is one of those "better than real" options that lighting artists like.

My point is comparing the switch from observational to physically based lighting models with the switch from a few lights to lots of lights. The latter gives you much more "bang for the buck", so you want to do it first. A scene in shadow will not look much better with a physically based lighting model if you only use one or a few light sources, but with lots of lights you can make it look "better than real".

Sunday, May 19, 2013

Re-cap of Deferred Lighting

This is part of a response in a discussion forum I wrote recently, and it summarizes a lot of the things I did a few years ago with renderer design on XBOX 360 / PS3 (more details in the articles below). So I thought I'd share it here as well:

----------------------------------------------------
Before I start talking about memory bandwidth, let me first outline my favorite renderer design that shipped now in a couple of games: from a high-level view, you want to end up with mainly three stages:

Geometry -> Light / Shadow data -> Materials

Geometry
In this stage you render your objects into a G-Buffer. It might be the most expensive or one of the more expensive parts of your rendering pipeline. You want your G-Buffer to be as small as possible. When we moved from Deferred Shading to Deferred Lighting, one of the motivations was to decrease the size of the G-Buffer. In a typical Deferred Lighting scenario you have three render targets in the G-Buffer: the depth buffer, a color buffer and a normal buffer (those might hold specular information and a material index as well). There are all kinds of ways to compress data like using two channel color formats, two channel normal data etc..
One reason why the G-Buffer needs to be small is what I mentioned above. Every mesh you render in there will be expensive. Obviously I am leaving out a lot of stuff here like pre-Z pass etc..
The other reason why the G-Buffer needed to be small was the fact that you have to read it for each light. On XBOX 360 and PS3 that was a major memory bandwidth challenge and the ultimate reason to move from Deferred Shading to Deferred Lighting. You were now able to render many more lights with the smaller G-Buffer.
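As a hedged illustration of the layout described above (the exact packing is an assumption; many variations shipped):

// A sketch of a typical Deferred Lighting G-Buffer: depth comes from the
// depth buffer itself, so the geometry pass writes two color targets.
struct GBufferOutput
{
    float4 Color  : SV_Target0; // albedo, material index in alpha (assumed)
    float4 Normal : SV_Target1; // packed normal + specular power / intensity
};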

Lighting / Shadow Stage
Rendering lights into the light / shadow buffer had the advantage of using just two of the three render targets in the G-Buffer: the depth buffer and the normal buffer with the specular information. With that setup you could increase the number of lights substantially compared to Deferred Shading.
The light / shadow buffer holds the data for all lights and shadows, in other words: brightness data separated for diffuse and specular together with the light color. Please note the third render target in the G-Buffer that holds color is not used here.
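A hedged sketch of the per-light accumulation into that buffer, assuming a common Light Pre-Pass packing (diffuse light in RGB, a specular term in alpha); the per-pixel position is reconstructed from the depth buffer and the normal plus specular power come from the normal target:

// One light's contribution to the light/shadow buffer, additively blended.
// Only depth (via reconstructed position) and the normal target are read;
// the color target of the G-Buffer is not touched here.
float4 AccumulateLight(float3 N, float3 L, float3 V,
                       float specPower, float3 lightColor, float attenuation)
{
    float  NdotL = saturate(dot(N, L));
    float3 H     = normalize(L + V);
    float  spec  = pow(saturate(dot(N, H)), specPower) * NdotL;

    // RGB: diffuse light, A: specular term (resolved against albedo and
    // specular color later, in the material stage).
    return float4(lightColor * NdotL, spec) * attenuation;
}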

Material Stage
Splitting up the high-level view into Geometry -> Light / Shadows -> Materials is done because you want to apply expensive materials like skin, hair, cloth, leather, car paint etc.. and you can't store much data in the G-Buffer to describe those materials. So you apply them in screen or image space like a PostFX.
One of the reasons to move to Deferred Lighting was the increased material variety it offers. In a Deferred Shading setup you have to apply the material terms while you do the lighting calculations, which sometimes made those really expensive, and with overlapping lights you did them too often.
---------------------------------------------------- 

A lot of the recent work is about materials in screen-space. In the last few years my focus moved away from renderer design to global illumination and re-thinking the current Post-Processing pipelines, while solving other challenges for our customers. I hope I have something to share in those areas very soon ...


Update of the Link Section

I added the blogs of Angelo Pesce, Fabian Giesen, Christian Schueler, Ignacio Castaño, Morten Mikkelsen and Sebastien Lagarde to the link list on the right.

Tuesday, May 14, 2013

GPU Programming at UCSD

My first outline for a new GPU Programming class at UCSD. Let me know what you think:


First Class
Overview
-- DirectX 11.1 Graphics
-- DirectX 11.1 Compute
-- Tools of the Trade - how to setup your development system

Introduction to DirectX 11.1 Compute
-- Advantages
-- Memory Model
-- Threading Model
-- DirectX 10.x support


Second Class
Simple Compute Case Studies
- PostFX Color Filters
- PostFX Parallel Reduction
- DirectX 11 Mandelbrot
- DirectX 10 Mandelbrot


Third Class
DirectCompute performance optimization
- Histogram optimization case study


Fourth Class
Direct3D 11.1 Graphics Pipeline Part 1
- Direct3D 9 vs. Direct3D 11
- Direct3D 11 vs. Direct3D 11.1
- Resources (typeless memory arrays)
- Resource Views
- Resource Access Intention
- State Objects
- Pipeline Stages
-- Input Assembler
-- Vertex Shader
-- Tessellation
-- Geometry Shader
-- Stream Out
-- Setup / Rasterizer
-- Pixel Shader
-- Output Merger
-- Video en- / decoder access


Fifth Class
Direct3D 11.1 Graphics Pipeline Part 2
-- HLSL
--- Keywords
--- Basic Data Types
--- Vector Data Types
--- Swizzling
--- Write Masks
--- Matrices
--- Type Casting
--- SamplerState
--- Texture Objects
--- Intrinsics
--- Flow Control
-- Case Study: implementing Blinn-Phong lighting with DirectX 11.1
--- Physically / Observational Lighting Models
--- Local / Global Lighting
--- Lighting Implementation
---- Ambient
---- Diffuse
---- Specular
---- Normal Mapping
---- Self-Shadowing
---- Point Light
---- Spot Light


Sixth Class
Physically Based Lighting
- Normalized Blinn-Phong Lighting Model
- Cook-Torrance Reflectance Model


Seventh Class
Deferred Lighting, AA
- Rendering Many Lights History
- Light Pre-Pass (LPP)
- LPP Implementation
- Efficient Light rendering on DX 9, 10, 11
- Balance Quality / Performance
- MSAA Implementation on DX 10.0, 10.1, XBOX 360, 11
Screen-Space Materials
- Skin


Eighth Class
Shadows
- The Shadow Map Basics
- “Attaching” a Shadow Map frustum around a view frustum
- Multi-Frustum Shadow Maps
- Cascaded Shadow Maps (CSM) : Splitting up the View
- CSM Challenges
- Cube Shadow Maps
- Softening the Penumbra
- Soft Shadow Maps


Ninth Class
Order-Independent Transparency
- Depth Peeling
- Reverse Depth Peeling
- Per-Pixel Linked Lists


Tenth Class
Global Illumination Algorithms in Games
- Requirement for Real-Time GI
- Ambient Cubes
- Diffuse Cube Mapping
- Screen-Space Ambient Occlusion
- Screen-Space Global Illumination
- Reflective Shadow Maps
- Splatting Indirect Illumination (SII)

GPU Pro 4 available on Amazon.com

Here is the latest GPU Pro 4 book.

I have been helping to create these books for 12 years now.

For GPU Pro 5 we will have a huge number of mobile device techniques. Many GPU vendors now provide extensions that allow more modern techniques, like Deferred Lighting, to happen on mobile devices. Overall lots of stuff is happening ... I am always surprised by the amount of innovation happening in rendering in just one year. With OpenGL ES 3.0 and the new extensions, we have lots of opportunities to make beautiful looking games. You can port an XBOX 360 game to this platform easily. If you want to contribute an article to a book, drop me an e-mail.