Friday, April 28, 2017

Cycles, the fastest GPU renderer thanks to new denoising algorithms

Cycles is Blender's native CPU/GPU renderer, originally created in early 2011 by Brecht van Lommel (who left the Blender Institute in 2014 to work on Solid Angle's Arnold, which was acquired last year by the innovation-crushing Autodesk Corp.). In the past six years, it has slowly but steadily become a fully featured, production-ready renderer, including motion blur, hair/fur rendering, OpenVDB volume rendering, Pixar's OpenSubdiv, Disney's Principled PBR shader, GGX microfacet distribution, AOVs (arbitrary output variables, or render passes), filmic tonemapping and support for Alembic scene importing.

A video showing the stunning realism that can be achieved with Cycles:

Even though Cycles has been open source since the beginning, the Blender Institute decided in August 2013 to change the license for the Cycles source code from a restrictive GPL license to a permissive Apache 2.0 license, which allows Cycles to be integrated into commercial projects.

Although Cycles started out as an unbiased renderer, it quickly adopted many biased tricks to drastically cut down render times, such as clamping the number of bounces for different types of rays, blurry filters for glossy surfaces and switching over to shooting ambient occlusion rays after a certain number of bounces is reached.
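
To make the bounce clamping and the ambient occlusion fallback a bit more concrete, here is a minimal Python sketch of the decision a path tracer could make before each bounce. It is not taken from the Cycles source: the `BounceLimits` fields, their default values and the `next_step` function are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class BounceLimits:
    # Hypothetical per-ray-type limits; names and values are illustrative,
    # not actual Cycles settings.
    diffuse: int = 2
    glossy: int = 4
    transmission: int = 8
    total: int = 8
    ao_after: int = 3  # fall back to ambient occlusion after this many bounces


def next_step(ray_type: str, bounces: dict, limits: BounceLimits) -> str:
    """Return 'terminate', 'ambient_occlusion' or 'continue' for the next bounce.

    `bounces` maps each ray type to the number of bounces of that type
    already taken along the current path.
    """
    total = sum(bounces.values())
    # Clamp on the overall path length first.
    if total >= limits.total:
        return "terminate"
    # Then clamp on the limit for this particular ray type.
    if bounces.get(ray_type, 0) >= getattr(limits, ray_type):
        return "terminate"
    # Deep bounces are replaced by a cheap ambient occlusion lookup.
    if total >= limits.ao_after:
        return "ambient_occlusion"
    return "continue"
```

Each of these shortcuts introduces bias (some light paths are simply never sampled), which is the price paid for the shorter render times.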

In recent months, Lukas Stockner, one of Cycles' developers (who was also responsible for adding light portals and IES light profile support), implemented a few remarkable noise reduction algorithms based on very recent research, which will certainly turn many heads in the rendering world. Two features in particular have been added that cut render times by a factor of 8 on average: scramble distance (which takes the randomness out of sampling and traces rays in a fully coherent way) and a noise filtering algorithm based on "weighted local regression". The noise filter has been in development for over a year and has been available in experimental Cycles builds for beta testing. It's currently under final review and is expected to be merged into the Cycles master branch any day now. The Blender community is going wild, and for good reason. The new denoiser delivers exceptional results, preserving details in textures at very low sample counts and render times:

Full HD render (1920x1080 resolution). Rendertime: 1m 24s
Fully denoised at 50 samples on a single GTX 1070.
Image from the Blender Artists forum
Final denoised and colour corrected render, 1m25s (from BlenderArtists forum)

Some of my own tests using one GPU:

20 samples, no denoising, render time 3m28s

20 samples, denoised, render time 4m09s

200 samples, no denoising, render time 31m58s

The new version of Cycles with built-in denoising will run on both CPU and GPUs from Nvidia and AMD. Experimental builds for CUDA and OpenCL are available here.

Experimental OpenCL/CUDA build release notes:
  • OpenCL & CUDA GPU denoise system (this is Lukas' latest denoising code)
  • CUDA & OpenCL supported
  • GPU denoise multi-GPU support (even in the viewport; definitely works for CUDA, but not tested with multiple OpenCL GPUs)
  • Scramble distance added for Sobol and multi-jitter (works on CPU & GPU); also added to the supported features render tab
  • Blue-noise dithered Sobol with scramble distance
  • Thread divergence sort reduction patch (gives a 30% speedup in the Classroom scene and 8% in the Barcelona scene)
More information on the denoising algorithm can be found in this thread on the Blender Artists forum and Lukas Stockner's Wiki page:

Experimental Cycles denoising build thread
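
For readers wondering what "weighted local regression" means in practice, here is a deliberately simplified 1D sketch in Python. It is not Lukas Stockner's implementation: the function name `denoise_1d`, the single scalar `feature` (standing in for noise-free guide data such as a normal or albedo pass) and the parameters `radius` and `sigma` are all assumptions made for illustration. For every pixel, it fits a small weighted linear model of the noisy colour against the feature inside a window and keeps the fitted value at the centre pixel.

```python
import math


def denoise_1d(noisy, feature, radius=3, sigma=0.1):
    """Weighted local linear regression of `noisy` against a noise-free `feature`."""
    out = []
    n = len(noisy)
    for p in range(n):
        s0 = s1 = s2 = t0 = t1 = 0.0
        for q in range(max(0, p - radius), min(n, p + radius + 1)):
            d = feature[q] - feature[p]
            # Neighbours whose feature differs a lot (e.g. across an edge)
            # get a weight close to zero, so edges are not blurred away.
            w = math.exp(-d * d / (2.0 * sigma * sigma))
            s0 += w
            s1 += w * d
            s2 += w * d * d
            t0 += w * noisy[q]
            t1 += w * d * noisy[q]
        det = s0 * s2 - s1 * s1
        if det > 1e-12:
            # Intercept of the weighted least-squares fit noisy ~ a + b * d,
            # i.e. the model evaluated at the centre pixel (d = 0).
            out.append((s2 * t0 - s1 * t1) / det)
        else:
            # Feature is constant in the window: fall back to a weighted mean.
            out.append(t0 / s0)
    return out
```

The intuition is that the feature carries the real detail of the scene, so the noisy colour only has to be explained as a smooth function of it; anything the local fit cannot explain is treated as noise.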

With this groundbreaking denoiser, Cycles leapfrogs all other GPU renderers, and will soon be making the dream of ultrafast photoreal rendering happen for anyone.  


asif ali said...

Hey Sam,
I've been reading your tutorials on GPU path tracing. I'm new to all this, but your tutorials have been incredibly helpful for me to understand the concept. I kinda have a question about a video I've come across.
I assumed that path tracing would generate a noisy image which would gradually converge over time. In the video the rendering seems patchy, but I see no fireflies or noise. Is there a different technique to this? I thought I'd ask here since Google searches haven't led me to a similar method so far.

Sam Lapere said...

Hi Asif,

I'm not sure how Magica Voxel does its rendering exactly (the code is on Github), but it looks like it's using some form of ray bundles for calculating global illumination, similar to (there's a video if you scroll all the way down)

Allan Amaral said...

Hey Sam,
Thanks for the blog, it helped a lot while I was beginning in path tracing!
Do you have any paper or reference on the scramble distance sampling method?
Thanks in advance.

Jacco Bikker said...

I doubt that Magica Voxel is using radiosity (like Thomas does); looking at the footage I would guess that it does pure path tracing, but the random directions over the hemisphere are the same for all pixels for each pass. This way you don't get noise, but the directions are sampled one by one, which causes the 'jumping' of the light.
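
The idea described above can be sketched in a few lines of Python. This is only a guess at the mechanism, not Magica Voxel's actual code: `bounce_direction`, its parameters and the seeding scheme are hypothetical. In coherent mode the random number generator is seeded by the pass index alone, so every pixel gets the same hemisphere direction within a pass.

```python
import math
import random


def cosine_hemisphere(u1, u2):
    """Map two uniform numbers in [0, 1) to a cosine-weighted direction around +Z."""
    r = math.sqrt(u1)
    phi = 2.0 * math.pi * u2
    return (r * math.cos(phi), r * math.sin(phi), math.sqrt(max(0.0, 1.0 - u1)))


def bounce_direction(pass_index, pixel_index, coherent):
    """Pick the local bounce direction for one pixel in one pass."""
    if coherent:
        # Same seed for all pixels: no per-pixel noise, but the lighting
        # "jumps" from pass to pass as each direction is sampled in turn.
        seed = pass_index
    else:
        # Different seed per pixel and pass: classic noisy path tracing.
        seed = pass_index * 1_000_003 + pixel_index + 1
    rng = random.Random(seed)
    return cosine_hemisphere(rng.random(), rng.random())
```

Averaged over many passes, both modes cover the hemisphere in the same way; they only differ in how the error shows up along the way (structured patches versus per-pixel noise).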

Sam Lapere said...

Allan, I would guess the scramble distance feature in Cycles is similar to the coherent path tracing technique in this paper: "Coherent Path Tracing" (Sadeghi et al., 2009).

Jacco: thanks for your input. That also sounds very similar to coherent path tracing to me.

For previews, I prefer noise from standard path tracing over artifacts created by coherent pt. Using the coherent technique also wouldn't work well for real-time path tracing.

asif ali said...

I've experimented with what Jacco suggested (using the same direction for each pass) and it looks very similar to what Magica Voxel does.
Here is a video of it:
I guess if I divided the hemisphere and sequentially chose a region each frame to send out rays it would look somewhat better. Thanks a lot Sam and Jacco for helping me out with this :)

Adam James said...

Sam, the scramble distance feature seems based on a SIGGRAPH talk presented last year. I remember a discussion on the Blender Cycles mailing list: Bf-cycles Cache-Friendly Micro-Jittered Sampling

The paper is here (sig2016_cacheFriendlyMicroJitteredSampling.pdf) and comes with a video on YouTube.

As far as I understand, it is similar to coherent pt but without the artifacts.
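
A rough Python sketch of how such a micro-jitter or scramble distance could work, based only on the description in this thread and not on the Cycles patch or the SIGGRAPH talk itself: all pixels share one base sample, and each pixel shifts it by its own random offset scaled by a distance parameter. The function `pixel_sample` and its parameters are hypothetical, and `random.Random(pixel_index)` is just a stand-in for a per-pixel hash.

```python
import random


def pixel_sample(base_sample, pixel_index, scramble_distance):
    """Shift a shared 2D sample by a per-pixel offset scaled by `scramble_distance`.

    scramble_distance = 0.0 gives fully coherent sampling (all pixels identical),
    scramble_distance = 1.0 gives a full per-pixel shift (pixels decorrelated).
    """
    rng = random.Random(pixel_index)  # stand-in for a per-pixel hash
    return tuple((u + scramble_distance * rng.random()) % 1.0 for u in base_sample)
```

With a small distance, neighbouring pixels trace nearly identical rays, which is friendly to caches and GPU thread coherence, while the small jitter breaks up the structured artifacts of fully coherent sampling.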

Sam Lapere said...

Interesting. Thanks for the links!