Fermat
The Path Tracer (Revisited)



This chapter aims to show how to use the PTLib library to build a much more streamlined, yet even more powerful path tracer.

If you take a look at the full implementation in pathtracer_impl.h, you should notice that the overall structure is pretty similar to that of our Hello World prototype path tracer, yet even more compact. We will skip some details and go directly to the important bits. Remember that the central feature of PTLib is the path_trace_loop() function, and that it is configured by a few template classes that need to be provided by the user. The first is the implementation of the TPTContext class:
// the internal path tracing context
//
template <typename TDirectLightingSampler>
struct PathTracingContext : PTContextBase<PTOptions>, PTContextQueues
{
    TDirectLightingSampler dl;
};
The main news here is that it inherits from PTContextBase and PTContextQueues, and adds a single member, the templated TPTDirectLightingSampler, PathTracingContext::dl.
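While we won't reproduce their exact definitions here, a rough idea of what the two base classes provide can be inferred from the member assignments we are about to see in the render method. The following is just an inferred sketch; the field types marked below are assumptions, and the actual definitions in PTLib may differ:

// an inferred sketch of the two base classes, based on the member
// assignments in the render method below; the actual PTLib definitions
// and field types may differ
template <typename TOptions>
struct PTContextBase
{
    TOptions          options;        // the rendering options
    uint32            in_bounce;      // the current bounce index
    TiledSequenceView sequence;       // the sampling sequence view    (assumed type)
    float             frame_weight;   // 1 / (instance + 1), used for progressive blending
    DeviceTimers      device_timers;  // on-device profiling timers    (assumed type)
    cugar::Bbox3f     bbox;           // the scene's bounding box
};

struct PTContextQueues
{
    PTRayQueue in_queue;       // the rays to be traced at the current bounce
    PTRayQueue scatter_queue;  // the rays generated by scattering events
    PTRayQueue shadow_queue;   // the NEE shadow rays
};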
The render method itself starts almost identically to the one we already saw:
// pre-multiply the previous frame for blending
renderer.rescale_frame( instance );

//fprintf(stderr, "render started (%u)\n", instance);
const uint2  res      = renderer.res();
const uint32 n_pixels = res.x * res.y;

cugar::memory_arena arena( m_memory_pool.ptr() );

PTRayQueue in_queue;
PTRayQueue scatter_queue;
PTRayQueue shadow_queue;

alloc_queues(
    m_options,
    n_pixels,
    in_queue,
    scatter_queue,
    shadow_queue,
    arena );

// fetch a view of the renderer
RenderingContextView renderer_view = renderer.view(instance);

// instantiate the vertex processor
PTVertexProcessor vertex_processor;
The main news is in the last two lines:
// instantiate the vertex processor
PTVertexProcessor vertex_processor;
i.e. the instantiation of a custom TPTVertexProcessor, the second template interface that must be implemented in order to configure PTLib. After that, the body of the render method is almost trivial:
// use the Reinforcement-Learning direct-lighting sampler
if (m_options.nee_type == NEE_ALGORITHM_RL)
{
    // initialize the path-tracing context
    PathTracingContext<DirectLightingRL> context;
    context.options       = m_options;
    context.in_bounce     = 0;
    context.in_queue      = in_queue;
    context.scatter_queue = scatter_queue;
    context.shadow_queue  = shadow_queue;
    context.sequence      = m_sequence.view();
    context.frame_weight  = 1.0f / float(renderer_view.instance + 1);
    context.device_timers = device_timers;
    context.bbox          = m_bbox;
    context.dl            = DirectLightingRL(
        view( *m_vtls_rl ),
        m_mesh_vtls->view() );

    // instantiate the actual path tracing loop
    path_trace_loop( context, vertex_processor, renderer, renderer_view, stats );
}
else // use the regular mesh emitter direct-lighting sampler
{
    // select which instantiation of the mesh light to use (VPLs or the plain mesh)
    MeshLight mesh_light = m_options.nee_type == NEE_ALGORITHM_VPL ? renderer_view.mesh_vpls : renderer_view.mesh_light;

    // initialize the path-tracing context
    PathTracingContext<DirectLightingMesh> context;
    context.options       = m_options;
    context.in_bounce     = 0;
    context.in_queue      = in_queue;
    context.scatter_queue = scatter_queue;
    context.shadow_queue  = shadow_queue;
    context.sequence      = m_sequence.view();
    context.frame_weight  = 1.0f / float(renderer_view.instance + 1);
    context.device_timers = device_timers;
    context.bbox          = m_bbox;
    context.dl            = DirectLightingMesh( mesh_light );

    // instantiate the actual path tracing loop
    path_trace_loop( context, vertex_processor, renderer, renderer_view, stats );
}
You'll notice that there are two branches, with two different instantiations of the path_trace_loop() call, corresponding to the two direct-lighting samplers provided by Fermat: the default mesh-based sampler (DirectLightingMesh, second branch), which can be configured to either generate samples on the mesh triangles on the fly or use a set of pre-sampled VPLs, and a more advanced Reinforcement Learning based sampler (DirectLightingRL, first branch). Other than that, the two branches are almost identical: they initialize the context with all of its members, and call into path_trace_loop().
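It's worth pausing on the design for a second: since the direct-lighting sampler is a compile-time template parameter of the context, each branch instantiates a separate, fully specialized copy of the loop, with no virtual dispatch on the device. The following toy example illustrates the pattern in isolation; all the types in it are hypothetical and purely illustrative:

#include <cstdio>

// a self-contained toy showing the static-dispatch pattern used above:
// the sampler is a template parameter, so each instantiation of the loop
// gets compiled and specialized separately (hypothetical types)
struct SamplerA { float sample() const { return 0.25f; } };
struct SamplerB { float sample() const { return 0.75f; } };

template <typename TSampler>
struct Context { TSampler dl; };

template <typename TContext>
void trace_loop(const TContext& context)
{
    // this call resolves at compile time for each sampler type
    printf("sample: %f\n", context.dl.sample());
}

int main()
{
    Context<SamplerA> ca; trace_loop( ca ); // mirrors the DirectLightingRL branch
    Context<SamplerB> cb; trace_loop( cb ); // mirrors the DirectLightingMesh branch
}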
So what does our vertex processor do, exactly? It implements a few methods specifying how to weight and accumulate the path samples generated by path_trace_loop().
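Before diving into the individual methods, here is a condensed map of the interface we are about to walk through, in the order the methods are discussed below (the full signatures appear in the listings that follow):

// the four methods our PTVertexProcessor implements:
//
//   compute_nee_weights(...)        - weight the diffuse/glossy NEE samples at a vertex
//   accumulate_nee(...)             - splat an NEE sample, if its shadow ray is unoccluded
//   compute_scattering_weights(...) - update the path weight after a scattering event
//   accumulate_emissive(...)        - splat an emissive vertex found along the path

The first of these methods handles the calculation of Next-Event Estimation (NEE) weights: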
template <typename PTContext>
FERMAT_DEVICE
void compute_nee_weights(
    const PTContext&            context,          // the current context
    const RenderingContextView& renderer,         // the current renderer
    const PixelInfo             pixel_info,       // packed pixel info
    const uint32                prev_vertex_info, // packed previous vertex info
    const uint32                vertex_info,      // packed vertex info
    const EyeVertex&            ev,               // the local vertex
    const cugar::Vector3f&      f_d,              // the diffuse BSDF weight
    const cugar::Vector3f&      f_g,              // the glossy BSDF weight
    const cugar::Vector3f&      w,                // the current path weight
    const cugar::Vector3f&      f_L,              // the current light EDF sample contribution
    cugar::Vector3f&            out_w_d,          // the output diffuse weight
    cugar::Vector3f&            out_w_g,          // the output glossy weight
    uint32&                     out_vertex_info)  // the output packed vertex info
{
    out_w_d = (context.in_bounce == 0 ? f_d : f_d + f_g) * w * f_L;
    out_w_g = (context.in_bounce == 0 ? f_g : f_d + f_g) * w * f_L;

    out_vertex_info = 0xFFFFFFFF; // mark this unused
}
The relevant lines are just these:
out_w_d = (context.in_bounce == 0 ? f_d : f_d + f_g) * w * f_L;
out_w_g = (context.in_bounce == 0 ? f_g : f_d + f_g) * w * f_L;
out_vertex_info = 0xFFFFFFFF; // mark this unused
For now we'll focus on the first two lines and ignore the third; we'll come back to it later. The first line says that the output diffuse weight out_w_d should be equal to the local diffuse BSDF component f_d, times the light's EDF f_L, times the path weight w, if this is the very first vertex along a path (i.e. the one directly visible from the camera); otherwise, it should be the sum of the diffuse and glossy components (f_d + f_g), times the EDF and the path weight. The rationale for this distinction is that this path tracer supports splitting the output into distinct framebuffer channels for the diffuse and specular components. The output diffuse weight is what will eventually be accumulated into the diffuse channel, and we want it to contain the diffuse direct lighting at the first bounce, plus all the indirect lighting (whether glossy or diffuse) seen through a preceding diffuse scattering event. The output glossy weight out_w_g is computed with analogous logic.
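To make this concrete, here is a tiny standalone check of the arithmetic, using scalar stand-ins for the RGB vectors (all the numbers are made up for the example):

#include <cstdio>

// a toy numerical check of the NEE weight logic above, with scalars
// standing in for the RGB weights (all values are made up)
int main()
{
    const float f_d = 0.5f, f_g = 0.2f; // hypothetical diffuse/glossy BSDF values
    const float w   = 1.0f, f_L = 2.0f; // hypothetical path weight and light EDF value

    // bounce 0: the two components are kept separate, so that they can
    // land in distinct framebuffer channels
    printf("bounce 0: w_d = %.2f, w_g = %.2f\n", f_d * w * f_L, f_g * w * f_L); // 1.00, 0.40

    // later bounces: both outputs carry the full (f_d + f_g) throughput,
    // and the destination channel is chosen by the first bounce's component
    printf("bounce k: w_d = w_g = %.2f\n", (f_d + f_g) * w * f_L);              // 1.40
}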
At this point you might wonder how we are going to find out whether a given sample went through a previous scattering event: we will find the answer by examining the next method, which specifies the recipe for the actual sample accumulation:
template <typename PTContext>
FERMAT_DEVICE
void accumulate_nee(
    const PTContext&       context,     // the current context
    RenderingContextView&  renderer,    // the current renderer
    const PixelInfo        pixel_info,  // packed pixel info
    const uint32           vertex_info, // packed vertex info
    const bool             shadow_hit,  // the hit information
    const cugar::Vector3f& w_d,         // the diffuse NEE weight
    const cugar::Vector3f& w_g)         // the glossy NEE weight
{
    if (shadow_hit == false)
    {
        FBufferView&        fb                 = renderer.fb;
        FBufferChannelView& composited_channel = fb(FBufferDesc::COMPOSITED_C);
        FBufferChannelView& diffuse_channel    = fb(FBufferDesc::DIFFUSE_C);
        FBufferChannelView& specular_channel   = fb(FBufferDesc::SPECULAR_C);

        const uint32 pixel_index  = pixel_info.pixel;
        const uint32 pixel_comp   = pixel_info.comp;
        const float  frame_weight = context.frame_weight;

        add_in<false>( composited_channel, pixel_index, w_d + w_g, frame_weight );

        if (context.in_bounce == 0)
        {
            // accumulate the per-component values to the respective output channels
            add_in<true>( diffuse_channel,  pixel_index, w_d, frame_weight );
            add_in<true>( specular_channel, pixel_index, w_g, frame_weight );
        }
        else
        {
            // accumulate the per-component value to the proper output channel
            if (pixel_comp & Bsdf::kDiffuseMask) add_in<true>( diffuse_channel,  pixel_index, w_d, frame_weight );
            if (pixel_comp & Bsdf::kGlossyMask)  add_in<true>( specular_channel, pixel_index, w_g, frame_weight );
        }
    }
}
Here you can notice that, once again, there is special casing for the first bounce, which adds both the diffuse and the glossy sample values to their respective channels, while all other bounces selectively write to either the diffuse or the glossy channel based on the pixel_info.comp field. This field is exactly what PTLib uses to mark whether a path is diffuse or glossy, or in other words, whether the first scattering event as seen from the camera was diffuse or glossy.
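This chapter doesn't detail the layout of PixelInfo, but conceptually it packs a framebuffer pixel index together with that component mask. The following is a purely hypothetical sketch of such a descriptor; Fermat's actual PixelInfo may pack these fields differently:

#include <cstdint>

// a hypothetical illustration of a packed pixel descriptor: a framebuffer
// pixel index plus a small mask recording which BSDF component started the
// path; Fermat's actual PixelInfo may pack these fields differently
struct PixelInfoSketch
{
    uint32_t pixel : 28; // the framebuffer pixel index
    uint32_t comp  :  4; // the Bsdf component mask of the first scattering event
};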
The next method we should look at is the one specifying the assignment of the new path weight after a scattering event:
template <typename PTContext>
FERMAT_DEVICE
void compute_scattering_weights(
    const PTContext&            context,          // the current context
    const RenderingContextView& renderer,         // the current renderer
    const PixelInfo             pixel_info,       // packed pixel info
    const uint32                prev_vertex_info, // packed previous vertex info
    const uint32                vertex_info,      // packed vertex info
    const EyeVertex&            ev,               // the local vertex
    const uint32                out_comp,         // the bsdf scattering component
    const cugar::Vector3f&      g,                // the bsdf scattering weight (= f/p)
    const cugar::Vector3f&      w,                // the current path weight
    cugar::Vector3f&            out_w,            // the output path weight
    uint32&                     out_vertex_info)  // the output vertex info
{
    out_w = g * w;

    out_vertex_info = 0xFFFFFFFF; // mark this unused
}
and again, besides the function signature boilerplate, this method is very simple: it says that the output path weight out_w should be the product of the local BSDF scattering weight g (calculated as the BSDF value divided by the sampling pdf, i.e. g = f/p) and the current path weight w.
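In other words, the path weight maintained by the loop is the familiar Monte Carlo throughput: the running product of the per-vertex ratios f/p. Here is a tiny standalone illustration of the recurrence, with made-up values:

#include <cstdio>

// a standalone toy illustration of the recurrence above: the path weight
// after k bounces is the running product of the per-vertex ratios
// g_i = f_i / p_i (the BSDF values and pdfs below are made up)
int main()
{
    const float f[3] = { 0.6f, 0.4f, 0.5f }; // hypothetical BSDF values
    const float p[3] = { 0.8f, 0.5f, 0.5f }; // hypothetical sampling pdfs

    float w = 1.0f; // the path starts at the camera with unit weight
    for (int bounce = 0; bounce < 3; ++bounce)
    {
        const float g = f[bounce] / p[bounce]; // the scattering weight g = f/p
        w = g * w;                             // the same update performed by compute_scattering_weights()
    }
    printf("path weight after 3 bounces: %f\n", w); // 0.75 * 0.8 * 1.0 = 0.6
}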
Finally, we'll look at the recipe for accumulating emissive vertices found along a path:
template <typename PTContext>
FERMAT_DEVICE
void accumulate_emissive(
    const PTContext&       context,          // the current context
    RenderingContextView&  renderer,         // the current renderer
    const PixelInfo        pixel_info,       // packed pixel info
    const uint32           prev_vertex_info, // packed previous vertex info
    const uint32           vertex_info,      // packed vertex info
    const EyeVertex&       ev,               // the local vertex
    const cugar::Vector3f& w)                // the current path weight
{
    FBufferView&        fb                 = renderer.fb;
    FBufferChannelView& composited_channel = fb(FBufferDesc::COMPOSITED_C);
    FBufferChannelView& direct_channel     = fb(FBufferDesc::DIRECT_C);
    FBufferChannelView& diffuse_channel    = fb(FBufferDesc::DIFFUSE_C);
    FBufferChannelView& specular_channel   = fb(FBufferDesc::SPECULAR_C);

    const uint32 pixel_index  = pixel_info.pixel;
    const uint32 pixel_comp   = pixel_info.comp;
    const float  frame_weight = context.frame_weight;

    // accumulate to the image
    add_in<false>(composited_channel, pixel_index, w, frame_weight);

    // accumulate the per-component value to the proper output channel
    if (context.in_bounce == 0)
        add_in<false>(direct_channel, pixel_index, w, frame_weight);
    else
    {
        if (pixel_comp & Bsdf::kDiffuseMask) add_in<true>(diffuse_channel,  pixel_index, w, frame_weight);
        if (pixel_comp & Bsdf::kGlossyMask)  add_in<true>(specular_channel, pixel_index, w, frame_weight);
    }
}
and hopefully at this point the pattern is clear: the emissive sample w is simply accumulated into the various framebuffer channels it contributes to, again depending on whether this is the first bounce (in which case the sample represents direct lighting) or a secondary one.
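One last detail worth spelling out is the role of frame_weight in all of these accumulation methods. Together with the rescale_frame() call at the top of the render method, which pre-multiplies the previous frame, it implements a running average over frame instances. The following is a self-contained sketch of the arithmetic only, not Fermat's actual add_in helper, whose details may differ:

#include <cstdio>

// a self-contained sketch of the progressive blending implied by
// rescale_frame() and frame_weight = 1 / (instance + 1): the previous
// average is pre-multiplied by instance / (instance + 1), then each new
// sample is added in scaled by frame_weight, yielding a running average
// (illustrative only; Fermat's add_in helper may differ in its details)
float blend(float prev_avg, float new_sample, unsigned instance)
{
    const float frame_weight = 1.0f / float(instance + 1);
    const float rescaled     = prev_avg * (float(instance) * frame_weight); // the rescale_frame step
    return rescaled + new_sample * frame_weight;                            // the add_in step
}

int main()
{
    float avg = 0.0f;
    for (unsigned instance = 0; instance < 4; ++instance)
        avg = blend( avg, 1.0f, instance ); // averaging a constant sample...
    printf("average: %f\n", avg);           // ...yields 1.0, as expected
}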
This should more or less clarify how to use PTLib. More importantly, it should clarify what it is designed for: implementing massively parallel, customizable path tracers without actually writing any of the relatively complex kernel and queueing logic necessary to implement them. In the next chapter, we'll see a more advanced use case, where the same exact library is customized to perform path space filtering. Incidentally, we still haven't explained what all those packed vertex_info's that we blithely initialized to 0xFFFFFFFF (essentially marking them unused) are for: the next section will show what they can be used for.

Next: The Path-Space Filtering Path Tracer