
PTLib is a flexible path tracing library, designed to be as performant as possible while remaining highly configurable at compile time. The module is organized into a host library of parallel kernels, PTLib, and a core module of device-side functions, PTLibCore. The latter provides functions to generate primary rays, process path vertices, sample Next-Event Estimation (NEE) and emissive surface hits at each vertex, and process all the generated samples. To make the whole process configurable, all of these functions are parameterized by three template interfaces:

a context interface, holding members describing the current path tracer state, and providing two trace methods, for scattering and shadow rays respectively. The basic path tracer state can be inherited from the PTContextBase class. On top of that, this class has to provide the following interface:

struct TPTContext
{
    FERMAT_DEVICE
    void trace_ray(
        TPTVertexProcessor&     vertex_processor,
        RenderingContextView&   renderer,
        const PixelInfo         pixel,
        const MaskedRay         ray,
        const cugar::Vector4f   weight,
        const cugar::Vector2f   cone,
        const uint32            nee_vertex_id);

    FERMAT_DEVICE
    void trace_shadow_ray(
        TPTVertexProcessor&     vertex_processor,
        RenderingContextView&   renderer,
        const PixelInfo         pixel,
        const MaskedRay         ray,
        const cugar::Vector3f   weight,
        const cugar::Vector3f   weight_d,
        const cugar::Vector3f   weight_g,
        const uint32            nee_vertex_id,
        const uint32            nee_sample_id);
};
Note that for the purpose of the PTLibCore module, a given implementation is free to define the trace methods in any arbitrary manner, since the result of tracing a ray is not used directly. This topic will be covered in more detail later on.
a user-defined vertex processor, determining what to do with each generated path vertex; this is the class responsible for weighting each sample and accumulating it into the image. It has to provide the following interface:

struct TPTVertexProcessor
{
// preprocess a vertex and return some packed vertex info - this is useful since this
// information bit is automatically propagated through the entire path tracing pipeline,
// and might be used, for example, to implement user defined caching strategies like
// those employed in path space filtering, where the packed info would e.g. encode a
// spatial hash.
//
// \param context the current context
// \param renderer the current renderer
// \param pixel_info packed pixel info
// \param ev the current vertex
// \param cone_radius the current cone radius
// \param scene_bbox the scene bounding box
// \param prev_vertex_info the vertex info at the previous path vertex
FERMAT_DEVICE
uint32 preprocess_vertex(
const TPTContext& context,
const RenderingContextView& renderer,
const PixelInfo pixel_info,
const EyeVertex& ev,
const float cone_radius,
const cugar::Bbox3f scene_bbox,
const uint32 prev_vertex_info);
// compute NEE weights given a vertex and a light sample
//
// \param context the current context
// \param renderer the current renderer
// \param pixel_info packed pixel info
// \param prev_vertex_info packed vertex info at the previous path vertex
// \param vertex_info packed vertex info
// \param ev the current vertex
// \param f_d the diffuse brdf
// \param f_g the glossy brdf
// \param w the current path weight
// \param f_L the current sample contribution, including the MIS weight
// \param out_w_d the output diffuse weight
// \param out_w_g the output glossy weight
// \param out_vertex_info the output packed vertex info
FERMAT_DEVICE
void compute_nee_weights(
const TPTContext& context,
const RenderingContextView& renderer,
const PixelInfo pixel_info,
const uint32 prev_vertex_info,
const uint32 vertex_info,
const EyeVertex& ev,
const cugar::Vector3f& f_d,
const cugar::Vector3f& f_g,
const cugar::Vector3f& w,
const cugar::Vector3f& f_L,
cugar::Vector3f& out_w_d,
cugar::Vector3f& out_w_g,
uint32& out_vertex_info);
// compute scattering weights given a vertex
//
// \param context the current context
// \param renderer the current renderer
// \param pixel_info packed pixel info
// \param prev_vertex_info packed vertex info at the previous path vertex
// \param vertex_info packed vertex info
// \param ev the current vertex
// \param out_comp the brdf scattering component
// \param g the brdf scattering weight (= f/p)
// \param w the current path weight
// \param out_w the output weight
// \param out_vertex_info the output vertex info
//
FERMAT_DEVICE
void compute_scattering_weights(
const TPTContext& context,
const RenderingContextView& renderer,
const PixelInfo pixel_info,
const uint32 prev_vertex_info,
const uint32 vertex_info,
const EyeVertex& ev,
const uint32 out_comp,
const cugar::Vector3f& g,
const cugar::Vector3f& w,
cugar::Vector3f& out_w,
uint32& out_vertex_info);
// accumulate an emissive surface hit
//
// \param context the current context
// \param renderer the current renderer
// \param pixel_info packed pixel info
// \param prev_vertex_info packed vertex info at the previous path vertex
// \param vertex_info packed vertex info
// \param ev the current vertex
// \param w the emissive sample weight
//
FERMAT_DEVICE
void accumulate_emissive(
const TPTContext& context,
RenderingContextView& renderer,
const PixelInfo pixel_info,
const uint32 prev_vertex_info,
const uint32 vertex_info,
const EyeVertex& ev,
const cugar::Vector3f& w);
// accumulate a NEE sample
//
// \param context the current context
// \param renderer the current renderer
// \param pixel_info packed pixel info
// \param vertex_info packed vertex info
// \param shadow_hit a bit indicating whether the sample is occluded or not
// \param w_d the diffuse nee weight
// \param w_g the glossy nee weight
//
FERMAT_DEVICE
void accumulate_nee(
const TPTContext& context,
RenderingContextView& renderer,
const PixelInfo pixel_info,
const uint32 vertex_info,
const bool shadow_hit,
const cugar::Vector3f& w_d,
const cugar::Vector3f& w_g);
};
a user-defined direct lighting engine, responsible for generating NEE samples. It has to provide the following interface:

struct TDirectLightingSampler
{
// preprocess a path vertex and return a user defined hash integer key,
// called <i>nee_vertex_id</i>,
// used for all subsequent NEE computations at this vertex;
// this packed integer is useful to implement things like spatial hashing, where the current
// vertex is hashed to a slot in a hash table, e.g. storing reinforcement-learning data.
//
FERMAT_DEVICE
uint32 preprocess_vertex(
const RenderingContextView& renderer,
const EyeVertex& ev,
const uint32 pixel,
const uint32 bounce,
const bool is_secondary_diffuse,
const float cone_radius,
const cugar::Bbox3f scene_bbox);
// sample a light vertex at a given slot, and return a user defined sample index,
// called <i>nee_sample_id</i>: this integer may encode any arbitrary data the sampler
// might need to later on address the generated sample in the update() method.
//
FERMAT_DEVICE
uint32 sample(
const uint32 nee_vertex_id,
const float z[3],
VertexGeometryId* light_vertex,
VertexGeometry* light_vertex_geom,
float* light_pdf,
Edf* light_edf);
// map a light vertex defined by its triangle id and uv barycentric coordinates to the
// corresponding differential geometry, computing its EDF and sampling PDF, using the
// slot information computed at the previous vertex along the path; this method is called
// on emissive surface hits to figure out the PDF with which the hits would have been
// generated by NEE.
//
FERMAT_DEVICE
void map(
const uint32 prev_nee_vertex_id,
const uint32 triId,
const cugar::Vector2f uv,
const VertexGeometry light_vertex_geom,
float* light_pdf,
Edf* light_edf);
// update the internal state of the sampler with the resulting NEE sample
// information, useful to e.g. implement reinforcement-learning strategies.
//
FERMAT_DEVICE
void update(
const uint32 nee_vertex_id,
const uint32 nee_sample_id,
const cugar::Vector3f w,
const bool occluded);
};

You'll notice this is just a slight generalization of the Light interface, providing more controls for preprocessing and updating some per-vertex information. At the moment, Fermat provides two different implementations of this interface:

DirectLightingMesh : a simple wrapper around the MeshLight class;
DirectLightingRL : a more advanced sampler based on Reinforcement Learning.
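
To make the roles of nee_vertex_id and update() more concrete, here is a minimal host-side C++ sketch. It is not Fermat code: it assumes a uniform grid over the scene bounding box as the spatial hash, and keeps a running mean of the unoccluded sample weights per slot. Vec3, Bbox3, ToyDirectLightingSampler, its GRID constant and its mean() helper are all hypothetical stand-ins introduced for illustration; the real interface takes the cugar and Fermat types listed above and runs on the device.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for cugar::Vector3f and cugar::Bbox3f (not the real types).
struct Vec3  { float x, y, z; };
struct Bbox3 { Vec3 lo, hi; };

// Toy sampler covering only the preprocess_vertex() / update() half of the
// TDirectLightingSampler interface, with simplified argument lists.
struct ToyDirectLightingSampler
{
    static constexpr uint32_t GRID = 16;   // cells per axis (assumption)

    std::vector<float>    sum;     // per-slot sum of unoccluded sample weights
    std::vector<uint32_t> count;   // per-slot number of samples seen

    ToyDirectLightingSampler()
        : sum(std::size_t(GRID) * GRID * GRID, 0.0f),
          count(std::size_t(GRID) * GRID * GRID, 0u) {}

    // Map one coordinate to a cell index in [0, GRID).
    static uint32_t cell(float p, float lo, float hi)
    {
        if (hi <= lo) return 0;                  // degenerate box: single cell
        float t = (p - lo) / (hi - lo);
        if (t < 0.0f) t = 0.0f;
        if (t > 1.0f) t = 1.0f;
        const uint32_t c = uint32_t(t * GRID);
        return c < GRID ? c : GRID - 1;          // t == 1 lands in the last cell
    }

    // Hash the vertex position to a slot: this plays the role of nee_vertex_id.
    uint32_t preprocess_vertex(const Vec3 p, const Bbox3 scene_bbox) const
    {
        const uint32_t cx = cell(p.x, scene_bbox.lo.x, scene_bbox.hi.x);
        const uint32_t cy = cell(p.y, scene_bbox.lo.y, scene_bbox.hi.y);
        const uint32_t cz = cell(p.z, scene_bbox.lo.z, scene_bbox.hi.z);
        return cx + GRID * (cy + GRID * cz);
    }

    // Record the outcome of one NEE sample in the slot it was generated from.
    void update(uint32_t nee_vertex_id, uint32_t /*nee_sample_id*/,
                const Vec3 w, bool occluded)
    {
        assert(nee_vertex_id < sum.size());
        if (!occluded)
            sum[nee_vertex_id] += (w.x + w.y + w.z) / 3.0f;
        count[nee_vertex_id] += 1;
    }

    // Mean contribution observed so far at a slot (hypothetical helper).
    float mean(uint32_t nee_vertex_id) const
    {
        const uint32_t n = count[nee_vertex_id];
        return n ? sum[nee_vertex_id] / float(n) : 0.0f;
    }
};
```

A learning-based sampler could use such per-slot statistics to bias later sample() calls at vertices that hash to the same slot; how DirectLightingRL actually does this is not shown here.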

The most important functions implemented by the PTLibCore module are:

// generate a primary ray based on the given pixel index
//
// \tparam TPTContext A path tracing context
//
// \param context the path tracing context
// \param renderer the rendering context
// \param pixel the unpacked 2d pixel index
// \param U the horizontal (+X) camera frame vector
// \param V the vertical (+Y) camera frame vector
// \param W the depth (+Z) camera frame vector
template <typename TPTContext>
FERMAT_DEVICE
MaskedRay generate_primary_ray(
TPTContext& context,
RenderingContextView& renderer,
const uint2 pixel,
cugar::Vector3f U,
cugar::Vector3f V,
cugar::Vector3f W);
// processes a NEE sample, using already computed occlusion information
//
// \tparam TPTContext A path tracing context, which must adhere to the TPTContext interface
// \tparam TPTVertexProcessor A vertex processor, which must adhere to the TPTVertexProcessor interface
//
// \param context the path tracing context
// \param vertex_processor the vertex processor
// \param renderer the rendering context
// \param shadow_hit a bit indicating whether the sample is occluded or not
// \param pixel_info the packed pixel info
// \param w the total sample weight
// \param w_d the diffuse sample weight
// \param w_g the glossy sample weight
// \param vertex_info the current vertex info produced by the vertex processor
// \param nee_vertex_id the current NEE slot computed by the direct lighting sampler
// \param nee_sample_id the current NEE sample info computed by the direct lighting sampler
template <typename TPTContext, typename TPTVertexProcessor>
FERMAT_DEVICE
void solve_occlusion(
TPTContext& context,
TPTVertexProcessor& vertex_processor,
RenderingContextView& renderer,
const bool shadow_hit,
const PixelInfo pixel_info,
const cugar::Vector3f w,
const cugar::Vector3f w_d,
const cugar::Vector3f w_g,
const uint32 vertex_info = uint32(-1),
const uint32 nee_vertex_id = uint32(-1),
const uint32 nee_sample_id = uint32(-1));
// processes a path vertex, performing these three key steps:
// - sampling NEE
// - accumulating emissive surface hits
// - scattering
//
// \tparam TPTContext A path tracing context, which must adhere to the TPTContext interface
// \tparam TPTVertexProcessor A vertex processor, which must adhere to the TPTVertexProcessor interface
//
// \param context the path tracing context
// \param vertex_processor the vertex processor
// \param renderer the rendering context
// \param bounce the current path bounce
// \param pixel_info the packed pixel info
// \param pixel the unpacked 2d pixel index
// \param ray the incoming ray
// \param hit the hit information
// \param w the current path weight
// \param prev_vertex_info the vertex info produced by the vertex processor at the previous vertex
// \param prev_nee_vertex_id the NEE slot corresponding to the previous vertex
// \param cone the incoming ray cone
template <typename TPTContext, typename TPTVertexProcessor>
FERMAT_DEVICE
bool shade_vertex(
TPTContext& context,
TPTVertexProcessor& vertex_processor,
RenderingContextView& renderer,
const uint32 bounce,
const PixelInfo pixel_info,
const uint2 pixel,
const MaskedRay& ray,
const Hit hit,
const cugar::Vector4f w,
const uint32 prev_vertex_info = uint32(-1),
const uint32 prev_nee_vertex_id = uint32(-1),
const cugar::Vector2f cone = cugar::Vector2f(0));

Note that all of these are device-side functions meant to be called by individual CUDA threads. The underlying idea is that any of them might call into the TPTContext trace methods (trace_ray() and trace_shadow_ray()), while the implementation of the TPTContext class decides whether to perform the trace calls in place or to enqueue them instead. The latter approach is called "wavefront" scheduling, and is the one favored in Fermat, as so far it has proven the most efficient.
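
The difference between the two strategies can be illustrated with a small host-side C++ sketch. This is not Fermat code: Ray, QueuedRay, InPlaceContext, WavefrontContext and emit_scattering_ray() are hypothetical stand-ins with heavily simplified signatures, assumed here only to show that the calling code stays the same while the context decides what "tracing" means.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for MaskedRay (not the real type).
struct Ray { float origin[3]; float dir[3]; };

// Strategy 1: the context traces each ray as soon as it is requested.
struct InPlaceContext
{
    uint32_t traced = 0;

    void trace_ray(const Ray& /*ray*/, uint32_t /*pixel*/)
    {
        // a real implementation would intersect the scene here
        ++traced;
    }
};

// Strategy 2: the context only records the request; a later pass
// traces the whole queue as one wave.
struct QueuedRay { Ray ray; uint32_t pixel; };

struct WavefrontContext
{
    std::vector<QueuedRay> scatter_queue;

    void trace_ray(const Ray& ray, uint32_t pixel)
    {
        QueuedRay item = { ray, pixel };
        scatter_queue.push_back(item);
    }

    // Process every enqueued ray together and return how many there were.
    std::size_t flush()
    {
        const std::size_t n = scatter_queue.size();
        // a real implementation would trace all n rays in one batch here
        scatter_queue.clear();
        return n;
    }
};

// Shading-side code is written once against the interface, in the same
// spirit as the templated PTLibCore functions.
template <typename TContext>
void emit_scattering_ray(TContext& context, uint32_t pixel)
{
    const Ray ray = { {0.0f, 0.0f, 0.0f}, {0.0f, 0.0f, 1.0f} };
    context.trace_ray(ray, pixel);
}
```

With the in-place context every call does its work immediately, whereas with the queued context nothing is traced until flush() runs, which is what lets a whole wave of rays be handled together.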

Wavefront Scheduling

PTLib implements a series of kernels to execute all of the above functions in massively parallel waves, assuming their inputs can be fetched from some TPTContext-defined queues. For these wavefront kernels to work, it is sufficient to have the TPTContext implementation inherit from the prepackaged PTContextQueues class, which contains all the necessary queue storage. Together with the kernels themselves, the corresponding host dispatch functions are provided as well. These are the following:

// dispatch the kernel to generate primary rays: the number of rays is defined by the resolution
// parameters provided by the rendering context.
//
// \tparam TPTContext           A path tracing context
//
// \param context               the path tracing context
// \param renderer              the rendering context
template <typename TPTContext>
void generate_primary_rays(
    TPTContext              context,
    RenderingContextView    renderer);
// dispatch the shade hits kernel
//
// \tparam TPTContext           A path tracing context, which must adhere to the TPTContext interface
// \tparam TPTVertexProcessor   A vertex processor, which must adhere to the TPTVertexProcessor interface
//
// \param in_queue_size         the size of the input queue containing all path vertices in the current wave
// \param context               the path tracing context
// \param vertex_processor      the vertex_processor
// \param renderer              the rendering context
template <typename TPTContext, typename TPTVertexProcessor>
void shade_hits(
    const uint32            in_queue_size,
    TPTContext              context,
    TPTVertexProcessor      vertex_processor,
    RenderingContextView    renderer);
// dispatch a kernel to process NEE samples using computed occlusion information
//
// \tparam TPTContext           A path tracing context, which must adhere to the TPTContext interface
// \tparam TPTVertexProcessor   A vertex processor, which must adhere to the TPTVertexProcessor interface
//
// \param in_queue_size         the size of the input queue containing all processed NEE samples
// \param context               the path tracing context
// \param vertex_processor      the vertex_processor
// \param renderer              the rendering context
template <typename TPTContext, typename TPTVertexProcessor>
void solve_occlusion(
    const uint32            in_queue_size,
    TPTContext              context,
    TPTVertexProcessor      vertex_processor,
    RenderingContextView    renderer);
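
As a rough picture of what dispatching a kernel over a queue of in_queue_size items involves, the sketch below emulates on the CPU one common launch pattern: one thread per queue item, with the number of blocks rounded up and a bounds check in each thread. Whether PTLib's kernels use exactly this mapping is an assumption; toy_grid_size(), toy_dispatch() and the block size passed to them are hypothetical names and values chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Number of thread blocks needed to cover the queue, rounding up.
inline uint32_t toy_grid_size(uint32_t in_queue_size, uint32_t block_size)
{
    assert(block_size > 0);
    return (in_queue_size + block_size - 1) / block_size;
}

// CPU emulation of a one-thread-per-item launch: the two loops stand in
// for the block and thread indices of a device kernel.
template <typename TItem, typename TFunc>
void toy_dispatch(
    const std::vector<TItem>&   queue,
    const uint32_t              in_queue_size,
    const uint32_t              block_size,
    TFunc                       process)
{
    assert(in_queue_size <= queue.size());
    const uint32_t n_blocks = toy_grid_size(in_queue_size, block_size);

    for (uint32_t block = 0; block < n_blocks; ++block)
    {
        for (uint32_t thread = 0; thread < block_size; ++thread)
        {
            const uint32_t idx = block * block_size + thread;
            if (idx < in_queue_size)    // threads past the end of the queue do nothing
                process(queue[idx]);
        }
    }
}
```

The bounds check matters because the last block is usually only partially full: for example, 129 items with 128 threads per block need two blocks, the second of which has a single active thread.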

The last key function provided by this module is the one assembling all of the above into a single loop, or rather, into a complete pipeline that generates primary rays, traces them, shades the resulting vertices, traces any generated shadow and scattering rays, shades the results, and so on, until all generated paths are terminated:

// main path tracing loop
//
template <typename TPTContext, typename TPTVertexProcessor>
void path_trace_loop(
    TPTContext&             context,
    TPTVertexProcessor&     vertex_processor,
    RenderingContext&       renderer,
    RenderingContextView&   renderer_view,
    PTLoopStats&            stats);
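
The control flow of such a loop can be sketched on the host with queue sizes alone. The code below does not reproduce path_trace_loop(): ToyLoopStats, toy_shade_hits() and toy_path_trace_loop() are hypothetical, the tracing steps are reduced to comments, and the rule that half of the paths survive each bounce is an arbitrary assumption made so that the loop terminates deterministically.

```cpp
#include <cstdint>

// Hypothetical counters, loosely playing the role of PTLoopStats.
struct ToyLoopStats
{
    uint32_t bounces = 0;   // number of waves executed
    uint32_t shaded  = 0;   // total path vertices shaded
    uint32_t shadow  = 0;   // total NEE samples resolved
};

// Stand-in for the shading stage: every vertex emits one shadow ray,
// and (by assumption) half of the paths, rounded down, keep scattering.
inline uint32_t toy_shade_hits(uint32_t in_queue_size, uint32_t& shadow_queue_size)
{
    shadow_queue_size = in_queue_size;
    return in_queue_size / 2;
}

inline ToyLoopStats toy_path_trace_loop(uint32_t n_primary_rays, uint32_t max_bounces)
{
    ToyLoopStats stats;

    // stage 0: generate primary rays into the input queue
    uint32_t in_queue_size = n_primary_rays;

    for (uint32_t bounce = 0; bounce < max_bounces && in_queue_size > 0; ++bounce)
    {
        // stage 1: trace all rays in the input queue (omitted)

        // stage 2: shade the resulting vertices, filling the shadow
        // and scattering queues
        uint32_t shadow_queue_size = 0;
        const uint32_t scatter_queue_size = toy_shade_hits(in_queue_size, shadow_queue_size);
        stats.shaded += in_queue_size;

        // stage 3: trace the shadow rays, then resolve occlusion (omitted)
        stats.shadow += shadow_queue_size;

        // stage 4: the scattering queue becomes the next wave's input
        in_queue_size = scatter_queue_size;
        stats.bounces += 1;
    }
    return stats;
}
```

Starting from 8 primary rays, the successive waves under this assumption contain 8, 4, 2 and 1 vertices, after which the scattering queue is empty and the loop stops on its own.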

In the next chapter we'll see how all of this can be used to write a very compact path tracer.

'Modern Hall, at sunset', based on a model by NewSee2l035
