Saturday, August 2, 2008

Voxel ray tracing vs polygon ray tracing

Carmack's thoughts about ray tracing:

I think that ray tracing in the classical sense, of analytically intersecting rays with conventionally defined geometry, whether they be triangle meshes or higher order primitives, I’m not really bullish on that taking over for primary rendering tasks which is essentially what Intel is pushing. But, I do think that there is a very strong possibility as we move towards next generation technologies for a ray tracing architecture that uses a specific data structure, rather than just taking triangles like everybody uses and tracing rays against them and being really, really expensive. It involves ray tracing into a sparse voxel octree which is essentially a geometric evolution of the mega-texture technologies that we’re doing today for uniquely texturing entire worlds. It’s clear that what we want to do in the following generation is have unique geometry down to the equivalent of the texel across everything.

There are some interesting things to note in there:

- ray tracing in the classical sense, in which rays intersect with triangles, is far too expensive for use in games, even with next generation hardware

- the sparse voxel octree format permits unique geometry

Octrees can be used to accelerate ray tracing and store geometry in a compressed format at the same time.
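To make that a bit more concrete, here is a minimal sketch of a sparse voxel octree in Python. It is only an illustration under my own assumptions (integer voxel coordinates, a color as the only payload, hypothetical names like OctreeNode and child_index), not the format id Software uses. Empty octants are simply not allocated, which is where the compression comes from, and a voxel's position is never stored: it follows from the path taken through the tree.

```python
class OctreeNode:
    """One cell of a sparse voxel octree; missing children are empty space."""
    __slots__ = ("children", "color")

    def __init__(self):
        self.children = [None] * 8  # None marks an empty octant
        self.color = None           # payload, only set on leaf voxels


def child_index(x, y, z, level):
    """Pick the octant from bit `level` of each integer coordinate."""
    return (((x >> level) & 1)
            | (((y >> level) & 1) << 1)
            | (((z >> level) & 1) << 2))


def insert(root, x, y, z, depth, color):
    """Store a voxel at integer position (x, y, z) in a 2**depth grid."""
    node = root
    for level in range(depth - 1, -1, -1):
        i = child_index(x, y, z, level)
        if node.children[i] is None:
            node.children[i] = OctreeNode()  # allocate only occupied cells
        node = node.children[i]
    node.color = color


def lookup(root, x, y, z, depth):
    """Return the stored color, or None if the path runs into empty space."""
    node = root
    for level in range(depth - 1, -1, -1):
        node = node.children[child_index(x, y, z, level)]
        if node is None:
            return None
    return node.color


def count_nodes(node):
    """Count allocated nodes, including the root."""
    return 1 + sum(count_nodes(c) for c in node.children if c is not None)


root = OctreeNode()
insert(root, 5, 2, 7, depth=3, color=(255, 0, 0))
```

With depth 3 the grid has 8 x 8 x 8 = 512 cells, but after inserting a single voxel only the root and the three nodes on its path exist.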

Quote from a game developer (Rare) on the voxel octree:

Storing data in an octree is far more efficient than storing it using textures and polygons (it's basically free compression for both geometry and texture data). It's primarily cool because you stop traversing when the size of the pixel is larger than the projected cell, so you don't even need to have all your data in memory, but can stream it in on demand. This means that the amount of data truly is unlimited, or at least the limits are with the artists producing it. You only need a fixed amount of voxels loaded to view a scene, and that doesn't change regardless of how big the scene is. The number of voxels required is proportional to the number of pixels on the screen. This is true regardless of how much data you're rendering! This is not true for rasterization unless you have some magical per-pixel visibility and LOD scheme to cut down the number of pixels and vertices to process, which is impossible to achieve in practice. Plus ray casting automatically gives you exact information on what geometry needs to be loaded in from disk, so it's a "perfect" streaming system, whereas with rasterization it would be very difficult to incrementally load a scene depending on what's visible (because you need to load the scene before you know what's visible!)
If you want to model micrometer detail, go ahead, it won't be loaded into memory until someone zooms in close enough to see it. Voxels that are not intersected can be thrown out of memory. Of course you would keep some sort of cache and throw things out on a least recently used basis, but since it's hierarchical you can just load in new levels in the hierarchy only when you hit them.
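The "stop traversing when the pixel is larger than the projected cell" rule from the quote can be sketched in a few lines. This is a rough sketch under my own assumptions: a simple pinhole camera, a cubic root cell, and hypothetical function names. It computes how deep the octree has to be descended for a surface at a given distance.

```python
import math


def pixel_footprint(distance, fov_radians, screen_width_px):
    """World-space width covered by one pixel at the given distance."""
    return 2.0 * distance * math.tan(fov_radians / 2.0) / screen_width_px


def required_depth(root_size, distance, fov_radians, screen_width_px, max_depth):
    """Shallowest octree level whose cells are no larger than one pixel."""
    footprint = pixel_footprint(distance, fov_radians, screen_width_px)
    depth = 0
    cell = root_size
    # Each level halves the cell edge; stop once a cell fits inside a pixel
    # or the stored data runs out (max_depth).
    while cell > footprint and depth < max_depth:
        cell /= 2.0
        depth += 1
    return depth
```

For a 64 unit root cell, a 60 degree field of view and a 1000 pixel wide screen, a surface 100 units away needs 10 levels, while the same surface 1000 units away needs only 6. Everything below that level never has to be loaded, which is the streaming argument made in the quote.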

Voxels have some very interesting benefits compared to polygons:

- It's a volumetric representation, so you can model very fine details and bumps, without the need for bump mapping. Particle effects like smoke, fire and foam can be efficiently rendered without using hacks. Voxels are also being used by some big Hollywood special effects studios to render hair, fur and grass.

- id wants to use voxels to render everything static with real geometry without using normal maps.

- Voxels can store a color and a normal. For the renderer, textures and geometry are essentially the same.

- The position of the voxel is defined implicitly by the structure that is holding it (the octree). Here's the good part: this structure represents both the primitives that need to be intersected and the spatial division of these primitives. So, in contrast to triangle ray tracing which needs a separate spatial division structure (kd-tree, BVH, ...), voxels are right away structured in a grid or an octree (this does not mean that other structures can't be used as well). So for voxel ray tracing, octrees are perfect.

- Voxels are very cheap primitives to intersect, much cheaper than triangles. This is probably their biggest benefit when choosing between voxel and polygon ray tracing.

- A voxel octree permits a very natural multiresolution. There's no need to go deeper into the octree when the size of a pixel is larger than the underlying cell, so you don't have to display detail if it's not necessary and you don't stream in data that isn't visible either way.

- Voxels are extremely well suited for local effects (voxel ray casting). In contrast to triangle rasterization, there are no problems with transparency, refraction, ... There are also major benefits artwise: because voxels are volumetric, you can achieve effects like erosion, aging materials, wear and tear by simply changing the iso value.

- Ray casting voxels is much less sensitive to scene complexity than triangles
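To illustrate why a voxel is such a cheap primitive to intersect: a voxel is an axis-aligned cube, so the ray test boils down to the well-known slab method, a few subtractions, divisions and comparisons per axis. The sketch below is my own illustration (the function name and argument layout are assumptions), not code from any of the engines discussed here.

```python
def ray_hits_voxel(origin, direction, box_min, box_max):
    """Slab test: return the entry distance along the ray, or None on a miss."""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this pair of planes: it must start between them.
            if o < lo or o > hi:
                return None
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        # Shrink the interval in which the ray is inside all slabs so far.
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near
```

A ray starting at (0.5, 0.5, -2) and pointing along +z enters the unit cube at distance 2.0. A ray that starts inside the cube returns 0.0.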

(partly translated from

Disadvantages of voxel ray tracing vs polygon ray tracing:

- Memory. Voxel data sets are huge relative to polygon data. But this doesn't have to be a problem, since all data can be streamed in. This does however create new challenges when the point of view changes rapidly and a lot of new data bricks have to be streamed in at once. Voxel sets have the benefit over polygons that voxel subsets can be loaded in, which permits some sort of progressive refinement. Other possible solutions are: using faster hard disks or solid state drives to accelerate the streaming, limiting depth traversal during fast camera movement or masking the streaming with motion blur or depth of field postprocessing.

- Animation of voxels requires specialized tools

- Disadvantages of ray tracing in general: dynamic objects require the octree to be updated in realtime. However, there are solutions for dynamic objects which don't require updating of the octree (such as building a deformation lattice around dynamic objects, so that rays cast into it are bent when they hit the lattice). id Tech 6 plans to tackle the problem of having many dynamic objects with hybrid rendering.

More on dynamic raytracing:

Dynamic Acceleration Structures for Interactive Ray Tracing, Reinhard, E., Smits, B., and Hansen, C., in Proc. Eurographics Workshop on Rendering, pp. 299-306, June 2000. Summary: This system uses a grid data structure, allowing dynamic objects to be easily inserted or removed. The grid is tiled in space (i.e. it wraps around) to avoid problems with fixed boundaries. They also implement a hierarchical grid with data in both internal and leaf nodes; objects are inserted into the optimal level.

Towards Rapid Reconstruction for Animated Ray Tracing, Lext and Akenine-Moller, Eurographics 2001. Summary: Each rigid dynamic object gets its own grid acceleration structure, and rays are transformed into this local coordinate system. Surprisingly, they show that this scheme is not a big win for simple scenes, because in simple scenes it is possible to completely rebuild the grid each frame using only about a quarter of the runtime. But, this would probably not be true for a k-d or BSP tree.
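The idea of transforming rays into an object's local coordinate system can be sketched as follows. This is a minimal illustration under my own assumptions (a rigid transform given as a 3x3 rotation matrix plus a translation, plain Python lists, hypothetical function names), not the implementation from the paper. Because the object's acceleration structure is built once in local space, moving the object only changes the transform, not the structure.

```python
import math


def transpose(m):
    """Transpose a 3x3 matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def world_ray_to_local(origin, direction, rotation, translation):
    """Map a world-space ray into the object's own frame.

    Assumes object-to-world is p_world = rotation * p_local + translation,
    with an orthonormal `rotation`, so its inverse is its transpose.
    """
    inv_rot = transpose(rotation)
    shifted = [o - t for o, t in zip(origin, translation)]
    # Points are shifted and rotated; directions are only rotated.
    return mat_vec(inv_rot, shifted), mat_vec(inv_rot, direction)


def rotation_z(angle):
    """Rotation matrix for `angle` radians about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
```

The local ray can then be traced against the object's unchanged grid (or octree), and the hit distance carries over directly because a rigid transform preserves lengths.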

Distributed Interactive Ray Tracing of Dynamic Scenes, Wald, Benthin, and Slusallek, Proc. IEEE Symp. on Parallel and Large-Data Visualization and Graphics (PVG), 2003. Summary: This system uses ray transformation (into object coordinate system) for rigid movement, and BSP rebuild for unstructured movement. A top-level BSP tree is rebuilt every frame to hold bounding volumes for the moving objects. Performance is still an issue for unstructured movement.

Interactive Space Deformation with Hardware Assisted Rendering, IEEE Computer Graphics and Applications, Vol 17, no 6, 1997, pp. 66-77. Summary: Instead of deforming objects directly, this system deforms the space in which they reside (using 1-to-1 deformations). During raytracing, the rays are deformed into the object space instead of deforming the objects into the ray space. However, the resulting deformed rays are no longer straight, so they must be discretized into short line segments to perform the actual ray-object intersection tests.
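The discretization step can be sketched like this: sample the straight world-space ray, push every sample through the inverse deformation, and connect the results into short segments that are then intersected with the undeformed object. This is only a sketch under my own assumptions (uniform sampling, and a caller-supplied `inverse_deform` function that maps a world-space point into undeformed object space), not the method from the paper.

```python
def deformed_ray_polyline(origin, direction, t_max, segments, inverse_deform):
    """Approximate a bent ray by `segments` straight pieces.

    Returns a list of (start_point, end_point) pairs in undeformed space.
    """
    points = []
    for i in range(segments + 1):
        t = t_max * i / segments
        world_point = tuple(o + t * d for o, d in zip(origin, direction))
        points.append(inverse_deform(world_point))
    return list(zip(points[:-1], points[1:]))


def example_bend(p):
    """Hypothetical inverse deformation: a simple parabolic bend in y."""
    x, y, z = p
    return (x, y - 0.1 * x * x, z)


pieces = deformed_ray_polyline((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 4.0, 4, example_bend)
```

More segments follow the curved ray more closely but cost more intersection tests, so the segment count is a quality versus speed trade-off.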

Ray casting free-form deformed-volume objects, Haixin Chen, Jürgen Hesser, Reinhard Männer. A collection of techniques is developed in this paper for ray casting free-form deformed-volume objects with high quality and efficiency. The known inverse ray deformation approach is combined with free-form deformation to bend the rays to the opposite direction of the deformation, producing an image of the deformed volume without generating a really deformed intermediate volume. The local curvature is estimated and used for the adaptive selection of the length of polyline segments, which approximate the inversely deformed ray trajectories; thus longer polyline segments can be automatically selected in regions with small curvature, reducing deformation calculation without losing the spatial continuity of the simulated deformation. We developed an efficient method for the estimation of the local deformation function. The Jacobian of the local deformation function is used for adjustments of the opacity values and normal vectors computed from the original volume, guaranteeing that the deformed spatial structures are correctly rendered. The popular ray casting acceleration techniques, like early ray termination and space leaping, are incorporated into the deformation procedure, providing a speed-up factor of 2.34-6.56 compared to the non-optimized case.

More info on id Tech 6 and voxel ray casting in the ompf thread

Carmack, id Tech 6, hybrid rendering

In his QuakeCon keynote, John Carmack explained that his next generation engine id Tech 6 would still be mainly a hybrid renderer:

I can say with conviction at this point that the next generation games are still going to be predominantly polygon games. Even what we're looking at for id Tech 6 with all of this infinite geometry, voxelising everything, probably recursive automatic geometry generation - all of this is still going to be a hybrid approach. We hope that we can generate these incredible lush environments on there, but the characters are probably still going to be coming in as triangles over a skeleton there. There will probably be some interesting things tried with completely non-polygonal renderers, but the practical approach with games that look like the games we're doing now, but play better, probably will still have lots of polygons going on and these chips better be really good at that.

Full voxel based games with many dynamic objects and characters are still too demanding for next generation technology. On the other hand, a scene with few dynamic objects (such as the Ruby demo) should be entirely possible with voxels only.

Friday, August 1, 2008

id, Voxels and Ray Tracing

According to this article, the full Ruby demo will be shown to the public at Siggraph 2008.

My interest in this demo is, apart from the photorealistic quality, based on two facts: the GPU ray tracing and the voxel based rendering. Never before have I seen a raytraced (CPU or GPU) scene of this scope and quality in realtime. Urbach has stated in the videos that his raytracing algorithm is not 100% accurate, but nevertheless I think it looks absolutely amazing.
At Siggraph 2008, there will be a panel discussion on realtime ray tracing, where Jules Urbach will be the special guest. Hopefully, there will be more info on the ray tracing part then.

On to the voxels...
In March of this year, John Carmack stated in an interview that he was investigating a new rendering technique for his next generation engine (id Tech 6), which involves raycasting into a sparse voxel octree. This has spurred renewed interest in voxel rendering and parallels with the new Ruby demo are quickly drawn.

Today's GPUs are already blazingly fast when it comes to polygon rendering and don't break a sweat in the multimillion triangle scenes of Crysis. So there must be a good reason why some developers are spending time and energy on voxel rendering. John Carmack explains it like this in the interview:

It’s interesting that if you look at representing this data in this particular sparse voxel octree format it winds up even being a more efficient way to store the 2D data as well as the 3D geometry data, because you don’t have packing and bordering issues. So we have incredibly high numbers; billions of triangles of data that you store in a very efficient manner. Now what is different about this versus a conventional ray tracing architecture is that it is a specialized data structure that you can ray trace into quite efficiently and that data structure brings you some significant benefits that you wouldn’t get from a triangular structure. It would be 50 or 100 times more data if you stored it out in a triangular mesh, which you couldn’t actually do in practice.

Jon Olick, programmer at id Software, provided some interesting details about the sparse voxel octree raycasting in this ompf thread. He will also give a talk on the subject at Siggraph.

In the ompf thread, there are also a number of interesting links to research papers about voxel octree raycasting:

A single-pass GPU ray casting framework for interactive out-of-core rendering of massive volumetric datasets Enrico Gobbetti, Fabio Marton, and José Antonio Iglesias Guitián 2008

Interactive Gigavoxels, Cyril Crassin, Fabrice Neyret, Sylvain Lefebvre 2008

Ray tracing into voxel compressed into an octree

The octree texture Sylvain Lefebvre

The difference between id Tech 6 and Otoy is the way the voxels are rendered: id's sparse voxel octree tech is about voxel ray casting (primary rays only), while Otoy does voxel raytracing, which allows for raytraced reflections and possibly even raytraced shadows and photon mapping.
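The step from ray casting to ray tracing is the secondary ray. With ray casting, the work for a pixel ends at the first voxel hit. With ray tracing, a new ray is spawned from the hit point, for example in the mirror direction for a reflection. Here is a minimal sketch of that mirror direction, assuming the hit voxel supplies a unit-length normal (the function name is my own, and this says nothing about how Otoy actually does it).

```python
def reflect(direction, normal):
    """Mirror `direction` about a unit-length surface `normal`.

    Implements r = d - 2 (d . n) n.
    """
    d_dot_n = sum(d * n for d, n in zip(direction, normal))
    return tuple(d - 2.0 * d_dot_n * n for d, n in zip(direction, normal))
```

A ray travelling down and to the right, (1, -1, 0), that hits a floor with normal (0, 1, 0) continues as (1, 1, 0). That reflected ray is then traced into the scene just like a primary ray.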

Otoy, Transformers and Ray Tracing

After the Ruby/LightStage demo, 4 other videos appeared as part of an article about Otoy on TechCrunch. Urbach explains that he started experimenting with Renderman code on graphics hardware during the making of Cars in 2005. This work caught the interest of ILM, who gave Urbach the models from the Transformers movie to render in realtime. Urbach and his team made 4 commercials for the Transformers movie that were rendered and directed in realtime on graphics hardware. Afterwards, he was contacted by Sony to work on the Spiderman movie.

The 4 videos:

Video 1 OTOY Demo

This video shows short clips of realtime rendered Transformer sequences

Video 2 Jules Urbach explains OTOY's real-time graphics rendering

In this video, Urbach talks about his experiments with GPU ray tracing in 2005, the Transformers trailers and the voxel raytracing for the Ruby demo. For the tests with Cars in 2005, he was able to do "realtime raytraced reflections with up to 20 bounces of light". He also implemented some realtime global illumination technique. For the new Ruby demo, he is actually "raytracing the entire scene", and "not using the vertex pipeline anymore". Thanks to the voxel rendering "the level of detail becomes infinite".

Video 3 OTOY Graphics Rendered in the Browser

This video shows the server side rendering capabilities of Otoy. It shows Urbach interacting with scenes from the Transformers trailers, that are being rendered in realtime on his GPU servers and streamed over the net into the browser.

Urbach mentions "raytraced reflections on the windows". When he switches to nighttime, he says "in this particular demo, there's no baked lighting, nothing is precomputed", there are "hundreds of lights in the building rendered in realtime".

The demo runs on three graphics cards (3x RV770): one card renders the ILM Optimus, the second card renders the G1 Optimus Prime and the third card renders the city and the raytraced reflections on the windows.

Video 4 Jules Urbach of OTOY Explains LightStage

Video about LightStage, slightly more elaborate than this one.

There is also a video of the full AMD Cinema 2.0 event in which Urbach talks a bit about ray tracing on GPUs (from 41:00 to 47:00) and goes a bit more in-depth during the Q&A session (from 72:00 to 88:00):

- Urbach has been talking to game publishers to start integrating the relighting part of Otoy in existing game engines

- Otoy can do full raytracing, but also supports hybrid rendering. It can convert any polygonal mesh to voxels

- The Ruby demo does not use any polygons, only voxels

- For games, Urbach thinks hybrid rendering will be the way to go "for a very long time"

- With this technology, game developers will require a different way of working. Basically they're saying that you can make a photorealistic game, but the workload on the artist side will be astronomical

- In 2005, Urbach started out writing approximations to Renderman code during the making of Cars. At the time, he used cheats for ray tracing and reflections. In three years, GPUs have evolved so quickly that the latest hardware makes realtime ray tracing possible that is “99 % accurate”

- Voxel data sets are huge, but with voxel based rendering you can load only subsets of the voxel space, which is not possible with polygons. You can also choose which texture layers to load

- Compression and decompression of the voxel data is CPU bound. What takes 3 seconds to decompress on a CPU, can be done at a “thousand frames per second” on a GPU.

- What's interesting according to Urbach is that in 2005 he started out writing approximations to ray tracing, but the latest generation of hardware allows him to do ray tracing that gets really close to the 100% point

Urbach also showed another Otoy demo at the AMD event, called Bug Snuff. It shows a photorealistic scene with a scorpion, rendered in realtime and directed by David Fincher. Really impressive stuff!

Lastly, the ompf thread where it all started:

Thanks to all the ompf members and guests who participated and contributed to the thread.

The Ruby demo continued

One week after the unveiling of the Ruby teaser, Jules Urbach recorded a new video, in which he gave a bit more info on the voxel based rendering that was used in the Ruby demo and announced his other company, LightStage LLC. LightStage is a technology used by Hollywood film studios to capture lifelike 3D representations of actors, which can be relit afterwards. It has been used in Spiderman 3 and will also be used to motion capture and digitize the actress playing Ruby. Her digital facsimile will then replace the Ruby character from the teaser.


Full transcript:

Hi, I’m Jules Urbach and this is a follow-up to the presentation we did a week ago at Cinema 2.0, the launch for the 770. What we are showing is a couple of new things that we weren’t able to talk about last week, that I think are really interesting. So we’re announcing today that we are developing LightStage. LightStage LLC is a separate company for capturing and rendering really high quality 3d characters and LightStage is an interesting technology that was developed at USC by Paul Debevec and Tim Hawkins, and it solves the problem of the uncanny valley as far as characters go. So we’re very pleased we’re able to announce that we can actually take this data and start working with it and applying it in our projects.

So, if you take a look at what we were doing years ago, to do characters and animation it was limited by the fact that our artists can only create so much detail in a head or a human form. And this is the normal map for that head and this is a really complex skin shader that we wrote to try to recreate what humans look like. And in a lot of ways both these problems go away with the LightStage.

The LightStage first of all can capture a real person. So, to generate this head, an artist doesn’t even sit there and sculpt it. We can essentially put a woman in the LightStage, which is a domed capture environment, and it captures all the surfaces and all the normals for it and all the details, including the hair. And it captures it in full motion, so you have to understand that, unlike traditional motion capture where there is either make-up applied to the face or dots put on the face, we just put somebody in the LightStage and they can do their lines and speak and it does full motion capture optically.

So it gets the full data set of essentially all the points in their face. And this is in fact the rendered version of LightStage. This is all the data that is captured on the LightStage accumulated. It’s not a photograph. This is a fully relit head, based on the model you just saw. And it’s obviously a lot of data, but the work that we announced last week with the GPU, compression / decompression, we’re going to apply that to LightStage (and have) data sets that we can start loading in real-time and rendering them, not just for film work but in games as well.

So the LightStage data, I think it really closes the uncanny valley. I mean particularly this kind of data where we have all the relighting information, stored for every single pixel, it’s exciting and it gives us really high quality characters that I think look completely real. And that’s, I think one of the things we showed last week at the Cinema 2.0 event was that we could do scenes and cities and things that look really good and this essentially gives us people. And it goes even further than that, but we will certainly be announcing more as we get further along working on LightStage.

I’m gonna show one more thing that we didn’t get a chance to really show last week as well, which is some of the real-time stuff, the voxel rendering. We basically had two separate demos, one for just showing the fact that we can look around the voxelized scene and render it. And this one now, this demo, is a slight update of the original one, where I’m able to actually look around and essentially place voxels in the scene, but also relight it as well.

So this is part two of three that we’re releasing. This shows essentially us going through the scene, selecting an object and either rendering it through the full lighting pipeline or just doing the global illumination pass. Then we can also just use the normals to do full, totally new novel reflections on. But you can essentially see that the voxels, even in this fairly low resolution form, can capture lighting information and capture all these different details. And we can do that as we’re navigating through the scene.

So this is sort of part two of our Ruby real-time demo and part three of it is gonna show, as a next step, the full navigation through the voxel scenes. And that’s gonna be dependent on the compression we’re developing. Because right now, the reason why we are not loading the entire animation is that the frame data is about 700 megabytes for every frame. We can easily compress that down to 1/100th the size, we're looking to do about a 1000th the size. And then with that we’ll be able to load much larger voxel data sets and actually have you navigating pretty far throughout the scene and still keep the ray tracing and voxelization good enough that you don’t really see any sort of pixelized or voxelized data sets too closely.

So one more demo, that’s worth showing I think, is related to the LightStage. You can actually see that on the right here. And that basically is really a mesh that is generated from one person in LightStage. It doesn’t have all the LightStage data in there, but what you can see from this example is one reason why voxel rendering may be important. So this is really using a very simple polygonal mesh, even so it’s about 32 million triangles, just to render the scene. So I’m gonna show the wireframe of it and you can see that the data is so dense, everything from the eyes to the eyelashes are all there, and we’re only really using a small subset of the point cloud that’s generated from the LightStage. So if we move to voxel rendering, which I’m planning to do for LightStage as soon as we’re done with the Ruby demo, we’ll be able to have voxelized assets rendering in realtime at much higher resolutions than this. And that’s gonna be giving us characters that look better than anything we can show in any of these videos. And we should have that ready probably before the end of the year, so it’s exciting stuff! Thank you for watching, hope you enjoyed the

The Ruby demo

To begin with... a picture says more than a thousand words

This is a picture from the Ruby demo. It looks photorealistic, but is realtime and interactive.

A high quality video (720p) of the animation can be seen here

Low quality video here

The Ruby demo was made by a company named Otoy. Jules Urbach, founder and CEO of Otoy, has given some info on the technology behind the Ruby demo in several videos on the net:

Video 1:

In this video, Urbach says that the Ruby demo is rendered with voxel ray tracing. Otoy can also dynamically relight the scene.

The full transcript of the video:

Rick Bergman: So Jules, this is his creation. He's done a fantastic job with it. You're probably also thinking, well this is a video. He's gonna step you through and actually talk you some of the key features of this demo.

Jules Urbach: Thank you, Rick. What you're seeing here, is a frame from the animation you just saw, done Cinema 2.0 style. So the first thing you'll notice is that this isn't really just a video, we can look around, we can see the set that we've built, in fact it is, it's a set, you can see it's really ... When we first showed the clips of what we were doing with this, some people thought the street scene was film, and it's not, it's a completely computer generated scene, created by our art team. And you can see here, this is the relighting portion of the rendering pipeline, this is really just a very early teaser, a preview of what we're doing with this Ruby demo.

So you're seeing only the second half of the Cinema 2.0 rendering pipeline, the relighting portion of it. I can drag the cursor over any object and I can sort of see the different layers that go into making it whether it's global illumination, photon maps, diffuse lighting or in this case, complete control over the scene and the reflections.

And this is a really novel way of rendering graphics, we're not using any polygons. And the thing that makes this very different from just a simple relighting demo, is that every single pixel you're seeing in the scene has depth and it's essentially renderable as voxels.

We also have the capability of controlling every aspect of the exposure in the lighting pipeline, adding glares and glints to our satisfaction. And that makes a big impact in the rendering of any scene that we're doing.

So one of the things that is key to doing voxel based rendering is ray tracing, which I spoke about earlier, and the other element is compression, because these data sets are enormous. One of the things that's very exciting about the latest generation of hardware, coming from AMD, is that we can now write general purpose code, using CAL, that does wavelet compression. So we're able to compress these data sets, which are pretty massive, down to very reasonable components. And we think that we can stream those down and essentially give people, who have ever seen a video stream of that animation, essentially a fully navigable, relightable, completely interactive scene and that's the ... of 2.0 and we're very excited to be able to be part of that technology and that processing and bringing that to fruition.

The first time I saw this video, the words "ray tracing" and "voxels" immediately grabbed my attention. So I did some further research...

Ruby, voxels and ray tracing


My name is Ray Tracey. I'm a graphics enthusiast with a passion for lifelike interactive graphics.

Recently, I was struck by a realtime demo of a photorealistic city scene that was shown at an AMD/ATI event. It looked like something that came straight out of my imagination. For years I have been wondering when interactive graphics would reach this level of quality. I was interested to say the least.

A week later, AMD revealed that this city scene was part of their new Ruby demo for the Radeon 4800 cards: they showed the same scene, but this time there was moving traffic and people. Ruby appeared, running for her life from a giant killer machine. Once again, I was in awe: it looked unbelievable, I didn't expect to see such graphics within the next three years on consumer hardware. So after I witnessed this graphical marvel, I decided to find out more about it. After several hours of Googling, I came to the disheartening conclusion that there was almost no info on the demo to be found, except for some low-quality YouTube videos and a small PR paragraph on the AMD site. But I searched further... The small snippets of info that I did find stimulated me to write this blog as a "resource" that bundles all the publicly available information on this technology, and as a way for anyone interested to better understand the tech.