June 2011

CPU Implementation of Grid-Subdividing Rasterization
REYES stands for 'Renders Everything You Ever Saw'. This is traditionally the primary rendering algorithm of Pixar and Dreamworks, but it seems that today these animation studios are moving further and further into raytracing to produce their films.

This is a simple renderer I created in C++, which is based on the basic idea of REYES. See below for an explanation of how it works. Here are some general points about it.

- Geometry state objects templated with parameters: Evaluator, Shader, and Displacer
- Evaluators: Revolution (lathe), Disc, Cylinder, and Sphere
- Displacers: Perlin (out from point, or in single given direction)
- Shaders: The Phong lighting model
- Depth buffer for hidden surface removal
- Arbitrary affine transformation
- Surface domain can be clipped to any rectangular area
- Resulting surfaces are extremely smooth, and the shading normal matches the geometric normal

My renderer was written in C++ and is template-based. True to the nature of the classic REYES algorithm, the code is very modular and a wide variety of functions can be used to produce any desired geometry with little-to-no limitation.

The main downside to this algorithm in comparison to raytracing or optimised scanline rasterization is the long rendering times. At least it is easily parallelised, due to the large amount of independent calculation and very low memory usage.

The templated renderable primitive class allows a great deal of flexibility and makes it very easy to modify shapes or add new types of shape. Similar to many of the standard C++ header classes, you have the option of using a function or an instance of a class which implements the function application operator. Shadow effects are not currently produced, but could be added by introducing depth-only rendering and projective texture lookup.

Here is the declaration for the renderable primitive class. The declarations and definitions within the class have been elided for brevity. As can be seen, there are defaults for each parameter. They simply do nothing.

template<typename Evaluator = NullEvaluator,
typename Shader = NullShader, typename Displacer = NullDisplacer>
class Primitive
{
...
};

One of the great things about writing the code in this way, is that the setter methods in the class for setting the instanced Evaluator, Shader, and Displacer only accept objects of the same type or of a descendant type. This means that it is more likely for a compilation failure to prevent the wrong object (from a user's perspective) from being set, without the need for a set of abstract classes and accompanying hierarchy.

The REYES algorithm takes the form of a pipeline with the following stages: Bound, Split, Dice, Shade, Bust, and Hide. They are described below.

- Bound - Determine the bounding volumes of the primitives in the scene.
- Split - Subdivide those primitives into smaller, diceable forms.
- Dice - For each primitive, create a grid of 'micropolygons', which are each smaller than a pixel in size.
- Shade - Evaluate lighting and shading at each vertex on the micropolygon grid.
- Bust - Generate the windings for micropolygons, perform visibility culling on them.
- Hide - Sample the micropolygons to produce an image.

The major difference between what I have written here, and how the algorithm should proceed is that my renderer does not sample the micropolygons, it plots pixel-sized dots onto the image. This means that sometimes the shapes being rendered are over-estimated and cover more pixels than they should. It also means that there is a higher chance for very small gaps to appear in the geometry. For a great piece of software which does do all of these things properly, see Aqsis (http://www.aqsis.org/). Aqsis is also RenderMan-compatible.

For my implementation, I simplified this pipeline to a recursive splitting and bounds checking. The split() function first dices the current parameter space region defined by min and max, creating a patch of points generated from the Evaluator and Displacer objects. It then compares the patch with the viewport and discards any patches which lie completely outside. Subsequent bounds checks are skipped if the patch is fully inside. The function requiresSplit() determines if the patch would have holes if it were drawn immediately, and returns the axis of greatest projection which is used to split the patch in half before recursing. If the patch is dense enough, it is drawn to the framebuffer.

void split(const Vec2& min, const Vec2& max, bool potentially_outside = true) const
{
assert(min.x < max.x && min.y < max.y);
static Fragment frags[PATCH_SIZE * PATCH_SIZE];
// evaluate shape for this patch
dice(frags, min, max);
if(potentially_outside)
{
// compare patch against viewport bounds
bool inside = false, outside = false;
for(uint i = 0; i < PATCH_SIZE * PATCH_SIZE; ++i)
{
const Fragment& f = frags[i];
if(f.proj_pos.x > Real(+1.0) || f.proj_pos.y > Real(+1.0) ||
f.proj_pos.x < Real(-1.0) || f.proj_pos.y < Real(-1.0))
outside = true;
if(f.proj_pos.x < Real(+1.0) && f.proj_pos.y < Real(+1.0) &&
f.proj_pos.x > Real(-1.0) && f.proj_pos.y > Real(-1.0))
inside = true;
}
if(outside && !inside)
return;
if(inside && !outside)
potentially_outside = false;
}
Axis axis = AXIS_None;
// test if patch is small enough yet
if(requiresSplit(frags, axis))
{
assert(axis != AXIS_None);
Vec2 split_vec = max;
split_vec[axis] = (min[axis] + max[axis]) * Real(0.5);
split(min, split_vec, potentially_outside);
split_vec[axis ^ 1] = min[axis ^ 1];
split(split_vec, max, potentially_outside);
}
else
drawPatch(frags, min, max);
}

The evaluators implement a function application operator which takes a reference to a vector which will become the position in the primitive's local space, and 2D coordinates from the normalized surface parameterization. The code below shows the complete implementation of the DiscEvaluator, which produces a flat circular shape formed from concentric rings of points. Clearly, the evaluated point rotates around the disc centre as coord.x increases, and moves toward the disc centre as coord.y decreases. All of the Evaluators I have written are similar to this, as they are all based on radial parameterizations. So long as a mapping from the unit square to the desired shape exists, it can be rendered by this system.

struct DiscEvaluator
{
Real radius;
void operator()(Vec3& pos, const Vec2& coord) const
{
const Real phi = coord.x * Real(M_PI) * Real(2.0);
pos.x = cos( phi ) * radius * coord.y;
pos.y = sin( phi ) * radius * coord.y;
pos.z = Real(0.0);
}
DiscEvaluator(): radius(Real(1.0)) { }
};

Below is an image of a shape created by displacing a disc. This image also demonstrates the Phong shader, which can be used to create matte materials with a dull appearance, or glossy materials with an appearance similar to plastic. The code for this shading is shown further below.

As mentioned above, the cylinder and sphere shapes are evaluated similarly. CylinderEvaluator uses a function lerp() which is used identically to the one available in HLSL and Cg. The GLSL equivalent is mix(). It takes two arbitrary values a and b to be interpolated and a bias value c in [0, 1], returning bc + a(1-c). This function (and others) along with the scalar, vector and matrix types used here come from libraries I have written. I find such things to be very useful indeed!

struct CylinderEvaluator
{
Real radius, t0, t1;
void operator()(Vec3& pos, const Vec2& coord) const
{
const Real phi = coord.x * Real(M_PI) * Real(2.0);
pos.x = cos( phi ) * radius;
pos.y = sin( phi ) * radius;
pos.z = lerp( t0, t1, coord.y );
}
CylinderEvaluator(): radius(Real(1.0)), t0(Real(-1.0)), t1(Real(+1.0)) { }
};

struct SphereEvaluator
{
Real radius;
void operator()(Vec3& pos, const Vec2& coord) const
{
const Real theta = coord.y * Real(M_PI),
phi = coord.x * Real(M_PI) * Real(2.0);
pos.x = cos( phi ) * sin( theta ) * radius;
pos.y = sin( phi ) * sin( theta ) * radius;
pos.z = cos( theta ) * radius;
}
SphereEvaluator(): radius(Real(1.0)) { }
};

The RevolutionEvaluator is a little different, as it uses an additional indirect function to evaluate a profile which is swept around a line to form the surface. It is easiest to imagine this as the cylinder again, but with each ring in the cylinder given it's own radius. The radius value comes from the profile function.

struct RevolutionEvaluator
{
typedef Real (*ProfileFunc)(Real);
Real t0, t1;
ProfileFunc profile;
void operator()(Vec3& pos, const Vec2& coord) const
{
assert(profile != NULL);
const Real phi = coord.x * Real(M_PI) * Real(2.0);
pos.x = cos( phi ) * profile( coord.y );
pos.y = sin( phi ) * profile( coord.y );
pos.z = lerp( t0, t1, coord.y );
}
RevolutionEvaluator(): t0(Real(-1.0)), t1(Real(+1.0)), profile(NULL) { }
};

The PerlinDisplacer works very simply, so it's code is not shown here. Given a frequency and amplitude, it calls a function perlin2D with the coordinate parameters multiplied by the frequency, and the returned value multiplied by the amplitude. The code I wrote to create the Perlin noise function is shown below. There are two arrays used, gradients[] and permutations[]. The first is an array of random unit-length 2D vectors, and the second is a randomly shuffled array of all integers in [0, 127]. The quintic spline is used to create a smoother output, and reduces certain artefacts which appear when differentiating the output. See http://mrl.nyu.edu/~perlin/paper445.pdf - Ken Perlin, Improving Noise. Note that when I was working on this project, I did not use all of the suggestions made in this paper. This does not mean that I disagree with them - I'd like to incorporate them in the future. There are also other opportunities to improve this function, as many values are needlessly recomputed.

// interpolant
static inline Real spline(Real t)
{
#if PERLIN_USE_QUINTIC
// quintic
const Real t2 = t * t, t3 = t2 * t, t4 = t2 * t2, t5 = t3 * t2;
return Real(6.0) * t5 - Real(15.0) * t4 + Real(10.0) * t3;
#else
// cubic
return Real(3.0) * t * t - Real(2.0) * t * t * t;
#endif
}
static Real perlin2D(const Vec2& coord)
{
const int x = int(coord.x), y = int(coord.y);
// pick a pseudo-random gradient for each corner
const Vec2 &grad0 = gradients[((x+0) & PERLIN_PERM_MASK) + permutations[(y+0) & PERLIN_PERM_MASK]];
const Vec2 &grad1 = gradients[((x+1) & PERLIN_PERM_MASK) + permutations[(y+0) & PERLIN_PERM_MASK]];
const Vec2 &grad2 = gradients[((x+0) & PERLIN_PERM_MASK) + permutations[(y+1) & PERLIN_PERM_MASK]];
const Vec2 &grad3 = gradients[((x+1) & PERLIN_PERM_MASK) + permutations[(y+1) & PERLIN_PERM_MASK]];
// calc dot products for those gradients
const float d0 = grad0.dot( coord - Vec2(Real(x+0), Real(y+0)));
const float d1 = grad1.dot( coord - Vec2(Real(x+1), Real(y+0)));
const float d2 = grad2.dot( coord - Vec2(Real(x+0), Real(y+1)));
const float d3 = grad3.dot( coord - Vec2(Real(x+1), Real(y+1)));
const float u = spline(coord.x - Real(x)), v = spline(coord.y - Real(y));
// blend weighted contributions
return (d0*(Real(1.0)-u) + d1*u)*(Real(1.0)-v) + (d2*(Real(1.0)-u) + d3*u)*v;
}

Here is the code I wrote to produce the lighting for each point on a surface. The position attribute of the ShadePoint is in viewspace, so it is the vector from the viewpoint to the point being shaded. This means it is equivalent to the direction of a ray cast from the viewpoint through the pixel projected by the shaded point, and can be reflected for the purpose of producing a specular highlight (as is done here).

struct PhongShader
{
Real spec_exponent, spec_intensity;
Vec3 light_pos, col_diff, col_spec;
Vec3 operator()(const ShadePoint& sp) const
{
const Vec3 l = normalize(light_pos - sp.pos),
r = normalize(reflect(sp.pos, sp.norm));
return col_spec * pow(saturate(r.dot(l)), spec_exponent) * spec_intensity +
col_diff * saturate(sp.norm.dot(l));
}
};

Separate colours can be set for the diffuse reflection and specular highlight. There is no distance attenuation.