Fermat
An Overture

'Salle de bain, with coffee', based on a model by nacimus
Building a physically based renderer is a subject that has already been covered by a couple of great books (for example, the fantastic PBR). Building a high-performance, massively parallel renderer is a slightly different topic that has so far received much less attention. While Fermat doesn't pretend to be a full-featured rendering system, it tries to show how to go about writing one. Here, we will go a little bit into its details.
Like many books out there, this overture should probably start from the ground up, describing the very basics of geometry on which the renderer is built. Since many books have already covered that subject, however, and as the vector and matrix classes used in Fermat are likely not very different from any of the others, we will just skip that part and jump to the more interesting, rendering-related stuff. As a matter of fact, Fermat itself relies on a separate library for all of those ancillary classes and utilities: CUGAR.
The one thing we will need, however, is a short digression into the host and device dichotomy present throughout all of Fermat, which is, mostly, a GPU renderer. In order to understand the consequences of this dichotomy, and in particular the one between the host and device memory spaces, you should go straight to this page and come back when you are done with it: Host & Device.
So what is Fermat, exactly? Perhaps, the most concise explanation is that it is a collection of path samplers of various kinds. Like all modern physically based renderers, all of Fermat's internal rendering algorithms follow the same old recipe: throw some more or less random numbers, sample more or less interesting light paths, connecting the emitters to the eye of the virtual observer, and calculate and average the radiance that flows through them. This, plus or minus some denoising.
In fact, the only useful bit of basic geometry we need in order to proceed is the concept of differential vertex geometry needed to represent light path vertices. Fermat employs the following simple representation:
// Vertex geometry.
//
// Encodes the local differential surface geometry at a point, including its tangent, binormal
// and normal, as well as the local texture coordinates.
//
struct VertexGeometry
{
cugar::Vector3f normal_s; // shading normal
cugar::Vector3f normal_g; // geometric normal
cugar::Vector3f tangent; // local tangent
cugar::Vector3f binormal; // local binormal
cugar::Vector3f position; // local position
float padding; // some padding
cugar::Vector4f texture_coords; // local texture coordinates (2 sets)
cugar::Vector2f lightmap_coords; // local lightmap coordinates
};
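The tangent, binormal and shading normal above form an orthonormal frame used to move between world and local shading coordinates. Here is a self-contained sketch of one common way to build such a frame from the normal alone (an illustrative helper with stand-in types, not Fermat's actual code):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

inline float dot(const Vec3 a, const Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
inline Vec3 cross(const Vec3 a, const Vec3 b)
{
    return Vec3{ a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
}
inline Vec3 normalize(const Vec3 a)
{
    const float len = std::sqrt(dot(a, a));
    return Vec3{ a.x/len, a.y/len, a.z/len };
}

// build an orthonormal (tangent, binormal) pair around a unit normal n:
// pick the world axis least aligned with n to avoid a degenerate cross product
inline void build_frame(const Vec3 n, Vec3& tangent, Vec3& binormal)
{
    const Vec3 a = std::fabs(n.x) < 0.5f ? Vec3{1,0,0} : Vec3{0,1,0};
    tangent  = normalize(cross(a, n));
    binormal = cross(n, tangent);
}
```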
The next question we need to answer when writing a renderer is: how exactly do you sample light paths? That turns out to be the subject of most rendering research of the last 20 years. In fact, if you do not know the basics already, the best possible source of information is still Eric Veach's doctoral thesis from 1997, a work of rare and exceptional clarity that laid the groundwork for this entire field. Again, Fermat just tries to summarize a few of the most widespread and a few of the most recent algorithms, while focusing on doing that efficiently.
Getting to the point where we can actually describe how that is performed will require some time, but overall it all starts from four basic ingredients:
• the virtual camera,
• the Bidirectional Scattering Distribution Function, or BSDF,
• the light sources, or emitters,
• and obviously, the scene geometry, or meshes.
And this is exactly what we'll cover in the next few sections.

# The Mesh Geometry

If we want to see something in our pictures, we need some geometry for light to bounce against. In order to keep things simple, Fermat supports only one type: triangle meshes. The internal representation is the fairly typical one of indexed triangles, where the vertices together with their normals and texture coordinates are given in separate arrays, and yet another array provides the list of triangles as triplets of indices into the vertex and attribute lists.
In practice it is all encapsulated behind a few interfaces. The first two are the classes responsible for holding and owning the actual mesh storage, used on the host side of things (i.e. on the CPU): MeshStorage, to allocate and hold a mesh in host memory, and DeviceMeshStorage, to allocate and hold a mesh in device memory (i.e. on the GPU). The third interface, MeshView, is what we call a view of the mesh: a thin class used to simply access the mesh representation without actually owning any of its data - in practice, the moral equivalent of a pointer. This is what can be passed to device kernels to retrieve any of the relevant information (without the need to dereference an actual host pointer, which would require an expensive host memory access).
Without going into the details of its internals, we can list the free functions which provide access to its raw vertex data:
// helper method to fetch a vertex
//
FERMAT_HOST_DEVICE FERMAT_FORCEINLINE
const MeshView::vertex_type& fetch_vertex(const MeshView& mesh, const uint32 vertex_idx);
// helper method to fetch a normal vertex
//
FERMAT_HOST_DEVICE FERMAT_FORCEINLINE
const MeshView::normal_type& fetch_normal(const MeshView& mesh, const uint32 vertex_idx);
// helper method to fetch a texture coordinate vertex
//
FERMAT_HOST_DEVICE FERMAT_FORCEINLINE
const MeshView::texture_coord_type& fetch_tex_coord(const MeshView& mesh, const uint32 vertex_idx);
the functions to access the triangle lists:
// helper method to fetch the vertex indices of a given triangle
//
FERMAT_HOST_DEVICE FERMAT_FORCEINLINE
int3 load_vertex_triangle(const int* triangle_indices, const uint32 tri_idx);
// helper method to fetch the normal indices of a given triangle
//
FERMAT_HOST_DEVICE FERMAT_FORCEINLINE
int3 load_normal_triangle(const int* triangle_indices, const uint32 tri_idx);
// helper method to fetch the texture coordinate indices of a given triangle
//
FERMAT_HOST_DEVICE FERMAT_FORCEINLINE
int3 load_texture_triangle(const int* triangle_indices, const uint32 tri_idx);
and the utilities to compute useful information about local triangle geometry:
// return the area of a given primitive
//
FERMAT_HOST_DEVICE inline
float prim_area(const MeshView& mesh, const uint32 tri_id);
// return the differential geometry of a given point on the mesh, specified by a (prim_id, uv) pair
//
FERMAT_HOST_DEVICE inline
void setup_differential_geometry(
const MeshView& mesh,
const VertexGeometryId v,
VertexGeometry* geom,
float* pdf = 0);
// return the interpolated position at a given point on the mesh, specified by a (prim_id, uv) pair
//
FERMAT_HOST_DEVICE inline
cugar::Vector3f interpolate_position(const MeshView& mesh, const VertexGeometryId v, float* pdf = 0);
// return the interpolated normal at a given point on the mesh, specified by a (prim_id, uv) pair
//
FERMAT_HOST_DEVICE inline
cugar::Vector3f interpolate_normal(const MeshView& mesh, const VertexGeometryId v, float* pdf = 0);
A complete list of the available functions can be found in the MeshModule.
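To give a concrete idea of what the interpolation helpers compute, here is a minimal sketch of barycentric interpolation of a per-vertex attribute at a (u,v) pair, using one common weight convention (stand-in types, not Fermat's actual implementation):

```cpp
struct P3 { float x, y, z; };

// barycentric interpolation of a per-vertex attribute at coordinates (u,v):
// vertex 0 gets weight (1 - u - v), vertices 1 and 2 get weights u and v
inline P3 lerp_triangle(const P3 v0, const P3 v1, const P3 v2, const float u, const float v)
{
    const float w = 1.0f - u - v;
    return P3{
        w*v0.x + u*v1.x + v*v2.x,
        w*v0.y + u*v1.y + v*v2.y,
        w*v0.z + u*v1.z + v*v2.z };
}
```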
Notice how each point on the mesh surface is addressed by a small data structure that compactly encodes all the information needed to uniquely identify the point:
//
// Encodes the minimal amount of information needed to represent a point on a surface,
// or a <i>hit</i> in ray tracing parlance
//
struct VertexGeometryId
{
uint32 prim_id;
cugar::Vector2f uv;
FERMAT_HOST_DEVICE
VertexGeometryId() {}
FERMAT_HOST_DEVICE
VertexGeometryId(const uint32 _prim_id, const float _u, const float _v) : prim_id(_prim_id), uv(_u, _v) {}
FERMAT_HOST_DEVICE
VertexGeometryId(const uint32 _prim_id, const cugar::Vector2f _uv) : prim_id(_prim_id), uv(_uv) {}
};
In fact, one of the tricks to high-performance rendering is tightly packing all information, so as to consume as little bandwidth and on-chip memory as possible, and this is just one of the many examples you'll find in Fermat.
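For instance, a hit identifier like the one above needs just 12 bytes: a primitive index plus two barycentric coordinates. A quick static_assert on an equivalent layout confirms it (a stand-in struct for illustration, not Fermat's actual type):

```cpp
#include <cstdint>

// a 12-byte surface hit: a primitive index plus two barycentric coordinates,
// mirroring the layout of the VertexGeometryId structure above
struct PackedHit
{
    std::uint32_t prim_id;
    float         u;
    float         v;
};

static_assert(sizeof(PackedHit) == 12, "expected a tightly packed 12-byte hit");
```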

# The Camera Model

Simulating a realistic camera accurately (as needed for example to match live action film) can by itself be a rather complex subject. Currently, Fermat doesn't attempt to do that, and simply models an infinitely thin pinhole camera. So simple, in fact, that it can be described by a handful of vectors and a single scalar:
struct Camera
{
float3 eye; // the eye position
float3 aim; // the aim position
float3 up; // a vector specifying where the upwards direction is relative to the camera frame
float3 dx; // a vector specifying where the right is relative to the camera frame
float fov; // the field of view
};
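To illustrate how such a parametrization can be turned into an actual viewing frame, here is a self-contained sketch deriving an orthonormal camera basis from (eye, aim, up) (simplified stand-in types; the exact construction Fermat uses may differ in handedness and naming):

```cpp
#include <cmath>

struct V3 { float x, y, z; };

inline V3    sub(const V3 a, const V3 b)   { return V3{ a.x - b.x, a.y - b.y, a.z - b.z }; }
inline float dot(const V3 a, const V3 b)   { return a.x*b.x + a.y*b.y + a.z*b.z; }
inline V3    cross(const V3 a, const V3 b) { return V3{ a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x }; }
inline V3    normalize(const V3 a)         { const float l = std::sqrt(dot(a, a)); return V3{ a.x/l, a.y/l, a.z/l }; }

// derive an orthonormal camera basis from (eye, aim, up):
// W points from the eye towards the aim point, U points right, V points up
inline void camera_frame(const V3 eye, const V3 aim, const V3 up, V3& U, V3& V, V3& W)
{
    W = normalize(sub(aim, eye));
    U = normalize(cross(up, W));
    V = cross(W, U);
}
```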
Together with the camera itself, Fermat also provides a utility CameraSampler class:
// A sampler for the pinhole camera model
//
struct CameraSampler
{
// empty constructor
//
FERMAT_HOST_DEVICE
CameraSampler() {}
// constructor
//
FERMAT_HOST_DEVICE
CameraSampler(const Camera& camera, const float aspect_ratio);
// sample a direction from normalized device coordinates (NDC) in [0,1]^2
//
FERMAT_HOST_DEVICE
cugar::Vector3f sample_direction(const cugar::Vector2f ndc) const;
// compute the direction pdf
//
// \param dir the given direction
// \param projected whether to return the pdf in projected solid angle or solid angle measure
FERMAT_HOST_DEVICE
float pdf(const cugar::Vector3f direction, const bool projected = false) const;
// invert the camera direction sampler
//
// \param dir the given direction
// \return the NDC coordinates corresponding to the given direction
FERMAT_HOST_DEVICE
cugar::Vector2f invert(const cugar::Vector3f dir) const;
// invert the camera direction sampler and compute its projected solid angle pdf
//
// \param dir the given direction
// \param projected whether to return the pdf in projected solid angle or solid angle measure
// \return the NDC coordinates corresponding to the given direction
FERMAT_HOST_DEVICE
cugar::Vector2f invert(const cugar::Vector3f dir, float* pdf_proj) const;
cugar::Vector3f U; // camera space +X axis in world coords
cugar::Vector3f V; // camera space +Y axis in world coords
cugar::Vector3f W; // camera space +Z axis in world coords
float W_len; // precomputed length of the W vector
float square_focal_length; // square focal length
};
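The sampler and its inverse are exact inverses of one another, a property bidirectional algorithms rely on when connecting light subpaths to the eye. A toy, self-contained version of this pair (hypothetical names and types, not Fermat's actual implementation) shows the round trip:

```cpp
#include <cmath>

struct V3 { float x, y, z; };

inline float dot(const V3 a, const V3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
inline V3    normalize(const V3 a)       { const float l = std::sqrt(dot(a, a)); return V3{ a.x/l, a.y/l, a.z/l }; }

// a toy pinhole sampler: U, V are orthonormal right/up axes, W is the unit
// viewing axis, and f plays the role of the focal length
struct ToySampler
{
    V3 U, V, W;
    float f;

    // map NDC coordinates in [0,1]^2 to a world-space direction
    V3 sample_direction(const float ndc_x, const float ndc_y) const
    {
        const float x = ndc_x * 2.0f - 1.0f;
        const float y = ndc_y * 2.0f - 1.0f;
        return normalize(V3{
            x*U.x + y*V.x + f*W.x,
            x*U.y + y*V.y + f*W.y,
            x*U.z + y*V.z + f*W.z });
    }

    // invert the mapping: recover the NDC coordinates of a given direction
    void invert(const V3 dir, float& ndc_x, float& ndc_y) const
    {
        const float cz = dot(dir, W); // depth along the viewing axis
        ndc_x = (f * dot(dir, U) / cz + 1.0f) * 0.5f;
        ndc_y = (f * dot(dir, V) / cz + 1.0f) * 0.5f;
    }
};
```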

# The BSDF Model

Again, a whole book could easily be dedicated to the subject of properly simulating realistic BSDFs. Fermat does take some shortcuts there, and focuses on a single, monolithic, layered BSDF model. It is very simple, and yet expressive enough to represent a decent spectrum of the materials we see in everyday life. It contains five basic components:
• a diffuse reflection component
• a diffuse transmission component
• a glossy reflection component layered on top of the diffuse layer
• a glossy transmission component layered on top of the diffuse layer
• a clearcoat layer on top of all of the above
The diffuse components are purely Lambertian, while the glossy components are based on the GGX model with Smith's joint masking-shadowing function described in:

[Heitz 2014, "Understanding the Masking-Shadowing Function in Microfacet-Based BRDFs"]

The interaction between the top clearcoat layer and the inner layers below it is modeled approximately: only the Fresnel transmission factor for entering the clearcoat is computed, and the refraction of the incident and outgoing directions is not modeled explicitly as in the approach described in "Arbitrarily Layered Micro-facet Surfaces" by Weidlich and Wilkie, as that model results in severe energy loss due to the fact it simulates single scattering only. It is also questionable whether that would be more correct: accounting for multiple scattering through the various layers can be modeled without altering the final lobe directions, changing only the statistics of the inner layer distributions, as described in "Efficient Rendering of Layered Materials using an Atomic Decomposition with Statistical Operators", Laurent Belcour, ACM Transactions on Graphics, 2018. While we plan to adopt the full machinery of the latter in the future, we currently just crudely approximate it.
The inner layers of the BSDF use Fresnel coefficients to determine how much light undergoes glossy reflection, and how much undergoes transmission. Part of the radiance transmitted from the upper glossy layer undergoes diffuse scattering. The interaction between the glossy layer and the underlying diffuse layer is again modeled in a simplified manner, as if the layer was infinitely thin and the diffusely reflected particles were not interacting again with the upper layer.
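To make the role of these Fresnel factors concrete, here is a tiny sketch that splits unit incoming energy between a layer's reflection and the transmission into the layers below, using Schlick's approximation (an assumption made for illustration; Fermat's actual coefficients may be computed differently):

```cpp
#include <cmath>

// Schlick's approximation of the Fresnel reflectance, given the reflectance
// at normal incidence r0: F(cos) = r0 + (1 - r0) * (1 - cos)^5
inline float fresnel_schlick(const float cos_theta, const float r0)
{
    const float m = 1.0f - cos_theta;
    return r0 + (1.0f - r0) * m*m*m*m*m;
}

// split unit incoming energy between the layer's specular reflection and the
// transmission into the layers below (a single Fresnel event, no refraction bend)
inline void layer_split(const float cos_theta, const float r0, float& reflected, float& transmitted)
{
    reflected   = fresnel_schlick(cos_theta, r0);
    transmitted = 1.0f - reflected;
}
```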
Its basic interface looks like this:
struct Bsdf
{
// component type bitmasks
//
enum ComponentType
{
kAbsorption = 0u,
kDiffuseReflection = 0x1u,
kDiffuseTransmission = 0x2u,
kGlossyReflection = 0x4u,
kGlossyTransmission = 0x8u,
kClearcoatReflection = 0x10u,
kDiffuseMask = 0x3u,
kGlossyMask = 0xCu,
kReflectionMask = 0x5u,
kTransmissionMask = 0xAu,
kAllComponents = 0xFFu
};
typedef cugar::LambertTransBsdf diffuse_trans_component;
typedef cugar::LambertBsdf diffuse_component;
typedef cugar::GGXSmithBsdf glossy_component;
// evaluate the BSDF f(V,L)
//
// \param geometry the local differential geometry
// \param in the incoming direction
// \param out the outgoing direction
// \param components the components to consider
FERMAT_FORCEINLINE FERMAT_HOST_DEVICE
cugar::Vector3f f(
const cugar::DifferentialGeometry& geometry,
const cugar::Vector3f in,
const cugar::Vector3f out,
const ComponentType components) const;
// evaluate the total projected probability density p(V,L) = p(L|V)
//
// \param geometry the local differential geometry
// \param in the incoming direction
// \param out the outgoing direction
// \param measure the spherical measure to use
// \param RR indicate whether to use Russian-Roulette or not
// \param components the components to consider
FERMAT_FORCEINLINE FERMAT_HOST_DEVICE
float p(
const cugar::DifferentialGeometry& geometry,
const cugar::Vector3f in,
const cugar::Vector3f out,
const cugar::SphericalMeasure measure,
const bool RR,
const ComponentType components) const;
// sample an outgoing direction
//
// \param geometry the local differential geometry
// \param z the input random numbers
// \param in the incoming direction
// \param out_comp the output component
// \param out the outgoing direction
// \param out_p the output solid angle pdf
// \param out_p_proj the output projected solid angle pdf
// \param out_g the output sample value = f/p_proj
// \param RR indicate whether to use Russian-Roulette or not
// \param evaluate_full_bsdf indicate whether to evaluate the full BSDF, or just an unbiased estimate
// \param components the components to consider
FERMAT_HOST_DEVICE FERMAT_FORCEINLINE
bool sample(
const cugar::DifferentialGeometry& geometry,
const float z[3],
const cugar::Vector3f in,
ComponentType& out_comp,
cugar::Vector3f& out,
float& out_p,
float& out_p_proj,
cugar::Vector3f& out_g,
bool RR,
bool evaluate_full_bsdf,
const ComponentType components) const;
};
In the actual Bsdf class, a few more methods are present to calculate f() and p() at the same time, and to calculate both component-by-component, all in one go, as well as auxiliary methods to compute the Fresnel weights associated with each layer.
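To make the sample() contract concrete, consider a standalone Lambertian example (hypothetical, not Fermat's actual implementation): with cosine-hemisphere sampling the projected solid angle pdf is constant, so the returned weight g = f / p_proj collapses to the albedo:

```cpp
#include <cmath>

// cosine-hemisphere sampling of a Lambertian BRDF, illustrating the sample()
// contract: it outputs the pdf in solid angle measure (out_p), in projected
// solid angle measure (out_p_proj), and the weight g = f / p_proj
inline void sample_lambert(
    const float u1, const float u2,   // uniform random numbers in [0,1)
    const float albedo,
    float local_dir[3],               // sampled direction in the local frame
    float& out_p,
    float& out_p_proj,
    float& out_g)
{
    const float pi  = 3.14159265358979323846f;
    const float r   = std::sqrt(u1);
    const float phi = 2.0f * pi * u2;
    local_dir[0] = r * std::cos(phi);
    local_dir[1] = r * std::sin(phi);
    local_dir[2] = std::sqrt(1.0f - u1);  // cos(theta)

    const float cos_theta = local_dir[2];
    out_p      = cos_theta / pi;  // solid angle pdf
    out_p_proj = 1.0f / pi;       // projected solid angle pdf = p / cos(theta)
    out_g      = albedo;          // f / p_proj = (albedo/pi) / (1/pi)
}
```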

# The Light Source Model

In nature, light sources are just emissive geometry. Fermat supports both emissive geometry and a bunch of other primitive light sources, including some that have a singular distribution (e.g. directional lights), or others that provide good analytic sampling algorithms. All of them respond to a single Light interface that provides basic methods to point-sample the light source surface, and query the Emission Distribution Function, or EDF, at each point. The basic interface is the following:
struct Light
{
// sample a point on the light source, given 3 random numbers
//
// \param Z the input random numbers
// \param prim_id the output primitive index, in case the light is made of a mesh
// \param uv the output uv coordinates on the sampled primitive
// \param geom the output sample's differential geometry
// \param pdf the output sample's area pdf
// \param edf the output sample's EDF
//
// \return true iff the pdf is singular
FERMAT_HOST_DEVICE
bool sample(
const float* Z,
uint32_t* prim_id,
cugar::Vector2f* uv,
VertexGeometry* geom,
float* pdf,
Edf* edf) const;
// sample a point on the light source given a selected shading point (or receiver)
//
// \param p the input shading point
// \param Z the input random numbers
// \param prim_id the output primitive index, in case the light is made of a mesh
// \param uv the output uv coordinates on the sampled primitive
// \param geom the output sample's differential geometry
// \param pdf the output sample's area pdf
// \param edf the output sample's EDF
//
// \return true iff the pdf is singular
FERMAT_HOST_DEVICE
bool sample(
const cugar::Vector3f p,
const float* Z,
uint32_t* prim_id,
cugar::Vector2f* uv,
VertexGeometry* geom,
float* pdf,
Edf* edf) const;
// intersect the given ray with the light source
//
// \param ray the input ray
// \param uv the output uv coordinates on the sampled primitive
// \param t the output ray intersection distance
//
FERMAT_HOST_DEVICE
void intersect(const Ray ray, float2* uv, float* t) const;
// map a (prim,uv) pair to a surface element and compute the corresponding edf/pdf
//
// \param prim_id the input primitive index, in case the light is made of a mesh
// \param uv the input uv coordinates on the sampled primitive
// \param geom the output sample's differential geometry
// \param pdf the output sample's area pdf
// \param edf the output sample's EDF
//
FERMAT_HOST_DEVICE
void map(const uint32_t prim_id, const cugar::Vector2f& uv, VertexGeometry* geom, float* pdf, Edf* edf) const;
// map a (prim,uv) pair and a (precomputed) surface element to the corresponding edf/pdf
//
// \param prim_id the input primitive index, in case the light is made of a mesh
// \param uv the input uv coordinates on the sampled primitive
// \param geom the input sample's differential geometry
// \param pdf the output sample's area pdf
// \param edf the output sample's EDF
//
FERMAT_HOST_DEVICE
void map(const uint32_t prim_id, const cugar::Vector2f& uv, const VertexGeometry& geom, float* pdf, Edf* edf) const;
};
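One detail worth calling out: sample() returns an area pdf, while path sampling at a receiver typically needs pdfs in solid angle measure. The conversion is the usual geometric one (a generic sketch, not a Fermat function):

```cpp
#include <cmath>

// convert the area pdf of a light sample into a solid angle pdf at the
// receiver: p_omega = p_area * d^2 / |cos(theta_light)|, where d is the
// distance to the sample and theta_light the angle between the light's
// normal and the direction towards the receiver
inline float pdf_area_to_solid_angle(const float p_area, const float d, const float cos_theta_light)
{
    return p_area * d * d / std::fabs(cos_theta_light);
}
```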
Currently, the only supported EDF model is a simple Lambertian emitter, with an interface analogous to the Bsdf class (following Veach's recommended practice of treating sources and sensors as scattering events, assuming the incident direction to an EDF is a virtual source vector where all light comes from; see Chapter 8.3.2.1, pp. 235-237 in his thesis).

Next: Ray Tracing Contexts