Fast 2D Lights and Shadows with WebGL


Back in 2014, I wrote an article on how to implement 2D omnidirectional shadow mapping using only ThreeJS and standard WebGL resources. That technique consisted of modeling each light object with three perspective cameras with a field of view of 120° each. As in the classical shadow mapping algorithm, the scene depth was projected into these cameras, and later I would render the scene, using the shadow maps to check whether or not each fragment should be lit.

That technique had two major downsides. Firstly, the decomposition of each light into three cameras meant the need for three render calls for each light, which is not ideal in terms of performance. Secondly, the projection planes of these cameras are combined into one equilateral triangle centered at the light position. Therefore, if a light is positioned in a place where its triangle intersects with an obstacle, then its field of view will be limited to the part of this triangle that is outside of the obstacle.

The previous method was far more susceptible to a reduced light FOV when a light was positioned close to an obstacle.

If we can rely on the extension EXT_frag_depth (which became a standard feature of WebGL2), then it is possible to improve this technique both in terms of quality and performance. Furthermore, by fixing the maximum number of lights, it is also possible to greatly increase rendering performance by avoiding multiple render passes.

The purpose of this article is to describe both improvements in detail. If you are not familiar with this technique, you may find the previous article a good source for other details.

Method Overview

For each light to be drawn, we perform one render pass of the scene into a 1D circular buffer. The output of this stage is one circular depth map of the scene per light. A single final render pass then uses these circular depth maps to determine whether or not each fragment is lit.

Given the number of lights n, the new method needs n+1 render passes, compared with 4n for the previous method.

Projecting the scene into a circle

The depth map of the 2D scene with respect to the omnidirectional light can be generated by projecting the scene depth into a one-dimensional circular buffer centered at the light position.


In order for this approach to work, we must not compute the projected coordinates of the primitive at the vertex shader, otherwise we will run into a problem described in the section “A naive approach” of the previous article. This means that the vertex shader must always force the fragment shader program to run for all pixel coordinates of the circular buffer.

At the fragment shader, we must receive both points \mathbf{a} and \mathbf{b} that define the line primitive being drawn (determined in light coordinates, that is, their coordinates with respect to the light center), as well as the angular position of the current fragment in the circular buffer -\pi \leq \alpha \leq \pi.

The goal of each fragment is to answer two questions:

  • Does the line that starts at the center of the circle and passes through the current angular position intersect with the line primitive being drawn?
  • If so, what is the distance d from this intersection to the circle center?

Let’s assume that the answer to the first question is yes. Then, there is a distance d > 0 and an interpolation index 0 \leq l \leq 1 such that

d\begin{bmatrix}\cos(\alpha)\\ \sin(\alpha)\end{bmatrix}= l(\mathbf{b}-\mathbf{a})+\mathbf{a}. \ \ \ \ \ (I)

The solution to this system is given by

l=\begin{cases} \frac{\mathbf{a}_x-\cot(\alpha)\mathbf{a}_y}{\cot(\alpha)(\mathbf{b}_y-\mathbf{a}_y)+\mathbf{a}_x-\mathbf{b}_x} & \text{if } \sin(\alpha) \neq 0 \\ &\\ \frac{\mathbf{a}_y-\tan(\alpha)\mathbf{a}_x}{\tan(\alpha)(\mathbf{b}_x-\mathbf{a}_x)+\mathbf{a}_y-\mathbf{b}_y} & \text{otherwise}, \end{cases}

d=\begin{cases} \frac{l(\mathbf{b}_y-\mathbf{a}_y)+\mathbf{a}_y}{\sin(\alpha)} & \text{if } \sin(\alpha) \neq 0 \\ &\\ \frac{l(\mathbf{b}_x-\mathbf{a}_x)+\mathbf{a}_x}{\cos(\alpha)} & \text{otherwise}. \end{cases}

Note that the case split exists only to avoid division by zero. It can be dropped, keeping only the second branch, by choosing a buffer size such that no fragment ever lands exactly at 90° or -90°, where \cos(\alpha) = 0.
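One way to check a candidate buffer width is sketched below. It assumes that fragment centers sit at half-integer window coordinates, so that fragment i of a buffer of width W has NDC coordinate 2(i + 0.5)/W - 1; the function names are made up for this illustration.

```javascript
// Angle (in radians) at the center of fragment i in a circular buffer of
// `width` texels. Assumption: fragment centers sit at half-integer window
// coordinates, so the NDC coordinate of fragment i is 2 * (i + 0.5) / width - 1.
function fragmentAngle(i, width) {
  const ndc = (2 * (i + 0.5)) / width - 1; // in (-1, 1)
  return ndc * Math.PI;                    // in (-pi, pi)
}

// True if some fragment center lands exactly on +90 or -90 degrees.
// alpha = +-pi/2 means ndc = +-0.5, i.e. 2 * (2 * i + 1) equals width or
// 3 * width. Done in integer arithmetic to avoid comparing floats.
function hitsRightAngle(width) {
  for (let i = 0; i < width; i++) {
    const twice = 2 * (2 * i + 1); // equals 4 * (i + 0.5)
    if (twice === width || twice === 3 * width) return true;
  }
  return false;
}
```

Under that assumption, only widths of the form 4k + 2 place a fragment exactly at a right angle, so odd widths and multiples of 4 (including the usual power-of-two sizes) are safe.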

By looking at Equation (I) and its constraints, we can tell whether or not the current fragment intersects with the primitive being drawn. If either of the conditions d > 0 and 0 \leq l \leq 1 is not met, then the current fragment does not intersect with the line primitive and thus can be discarded.

Note that numerical imprecision may arise close to the borders of obstacles, which could lead to rendering artifacts. Therefore, we also discard fragments if d \leq 10^{-3} .
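As a rough CPU-side reference for what each fragment computes, the sketch below solves Equation (I) for one segment and one angle, with \cot(\alpha) = \cos(\alpha)/\sin(\alpha) in the first branch, and then builds a depth map by keeping the nearest hit per fragment. It is not the shader itself: the names rayDistance and buildDepthMap are hypothetical, and picking the branch by comparing |sin| and |cos| (rather than testing sin against zero) is my own choice to keep the divisions well conditioned in floating point.

```javascript
const EPSILON = 1e-3; // same threshold as in the text

// Distance from the light center to segment a-b along direction alpha, or
// null where the fragment would be discarded. Points are {x, y} in light
// coordinates (relative to the light center).
function rayDistance(a, b, alpha) {
  const s = Math.sin(alpha);
  const c = Math.cos(alpha);
  let l, d;
  if (Math.abs(s) > Math.abs(c)) {
    // First branch: divide by sin(alpha).
    const cot = c / s;
    l = (a.x - cot * a.y) / (cot * (b.y - a.y) + a.x - b.x);
    d = (l * (b.y - a.y) + a.y) / s;
  } else {
    // Second branch: divide by cos(alpha).
    const tan = s / c;
    l = (a.y - tan * a.x) / (tan * (b.x - a.x) + a.y - b.y);
    d = (l * (b.x - a.x) + a.x) / c;
  }
  // Written so that NaN (ray parallel to the segment) is also discarded.
  if (!(l >= 0 && l <= 1)) return null;
  if (!(d > EPSILON)) return null;
  return d;
}

// CPU stand-in for the depth pass of one light: for every fragment of the
// circular buffer, keep the distance to the closest segment (Infinity if none).
// Assumes fragment i is centered at NDC coordinate 2 * (i + 0.5) / width - 1.
function buildDepthMap(segments, width) {
  const depth = new Float32Array(width).fill(Infinity);
  for (let i = 0; i < width; i++) {
    const alpha = ((2 * (i + 0.5)) / width - 1) * Math.PI;
    for (const [a, b] of segments) {
      const d = rayDistance(a, b, alpha);
      if (d !== null && d < depth[i]) depth[i] = d;
    }
  }
  return depth;
}
```

For example, with a single vertical wall from (1, -1) to (1, 1), rayDistance returns 1 at \alpha = 0 and null at \alpha = \pi, since the wall is behind the ray in that direction.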

Modeling the scene

During the circular depth map generation, the whole scene is going to be processed at the fragment shader. Furthermore, all obstacles in the scene are going to be rendered as line primitives at this stage.

This means we need to force every line primitive to cover the entire NDC range [-1, 1] of the circle at the vertex shader. In addition to that, our fragment shader of the depth calculation stage needs to know the coordinates of both points \mathbf{a} and \mathbf{b}.

To address this issue, we render gl.LINES primitives with a Vertex Buffer Object with the following structure:

\begin{bmatrix} -1 & \mathbf{a_1}_x & \mathbf{a_1}_y& \mathbf{b_1}_x & \mathbf{b_1}_y \\ 1 & \mathbf{a_1}_x & \mathbf{a_1}_y& \mathbf{b_1}_x & \mathbf{b_1}_y \\ -1 & \mathbf{a_2}_x & \mathbf{a_2}_y& \mathbf{b_2}_x & \mathbf{b_2}_y \\ 1 & \mathbf{a_2}_x & \mathbf{a_2}_y& \mathbf{b_2}_x & \mathbf{b_2}_y \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ -1 & \mathbf{a_n}_x & \mathbf{a_n}_y& \mathbf{b_n}_x & \mathbf{b_n}_y \\ 1 & \mathbf{a_n}_x & \mathbf{a_n}_y& \mathbf{b_n}_x & \mathbf{b_n}_y \\ \end{bmatrix}

The first element in our VBO corresponds to the NDC coordinate of the vertex to be rendered, and therefore is either -1 or 1. Note that the angular coordinate for each fragment can be derived from this attribute by multiplying it by \pi and interpolating it.

The next two elements correspond to the coordinates of the first vertex in the line primitive, while the last two elements are the coordinates of the second vertex in the line primitive. They are repeated for every vertex so that they can be forwarded from the vertex shader to the fragment shader without any changes due to interpolation.
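Packing the obstacles into this layout could look roughly like the sketch below. The function name and the representation of a segment as a pair of {x, y} points are assumptions made for this example.

```javascript
// Packs segments into the layout described above: five floats per vertex
// (ndc, a.x, a.y, b.x, b.y) and two vertices per segment, one with
// ndc = -1 and one with ndc = 1.
function buildObstacleVertices(segments) {
  const FLOATS_PER_VERTEX = 5;
  const data = new Float32Array(segments.length * 2 * FLOATS_PER_VERTEX);
  let offset = 0;
  for (const [a, b] of segments) {
    for (const ndc of [-1, 1]) {
      // Both endpoints are repeated on each vertex so that interpolation
      // leaves them unchanged on their way to the fragment shader.
      data.set([ndc, a.x, a.y, b.x, b.y], offset);
      offset += FLOATS_PER_VERTEX;
    }
  }
  return data;
}
```

With a WebGL context gl, the resulting array could be uploaded with gl.bufferData(gl.ARRAY_BUFFER, data, gl.STATIC_DRAW) and drawn with gl.drawArrays(gl.LINES, 0, 2 * segments.length). The stride between consecutive vertices is 20 bytes (five 32-bit floats).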

Rendering the scene

For each fragment in the scene, we project its coordinates into the light circle. Given the fragment’s world coordinate \mathbf{f} and the light position \mathbf{l} , the fragment angular coordinate is given by

\alpha_f=\arctan(\mathbf{f}_y-\mathbf{l}_y, \mathbf{f}_x-\mathbf{l}_x).

Then, we normalize \alpha_f to the range [0, 1] and sample the circular depth map at the normalized coordinate. The sampled value corresponds to the closest obstacle projected at that position (or infinity, if there was none). If the fragment being currently processed is closer to the light than the sampled distance, then this fragment is lit. Otherwise, we know that it is hidden by at least one obstacle.
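The lookup described above can be sketched on the CPU as follows. The function name is hypothetical, the depth map is assumed to be a plain array of distances with Infinity where nothing was hit, and the nearest-texel lookup is a simplification of whatever filtering the texture sampler applies.

```javascript
// Returns true when a fragment at world position f is lit by a light at
// position `light`, given that light's circular depth map.
function isLit(f, light, depthMap) {
  const dx = f.x - light.x;
  const dy = f.y - light.y;
  const alpha = Math.atan2(dy, dx);       // in [-pi, pi]
  const u = alpha / (2 * Math.PI) + 0.5;  // normalized to [0, 1]
  // Nearest-texel lookup; u = 1 is clamped to the last texel.
  const i = Math.min(depthMap.length - 1, Math.floor(u * depthMap.length));
  const distance = Math.hypot(dx, dy);
  return distance < depthMap[i];
}
```

For instance, if the texel covering 22.5° stores a distance of 1, a fragment at that angle is lit at distance 0.5 from the light and shadowed at distance 2.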


Rendering multiple lights

In the previous method, it was possible to render any arbitrary number of lights, with the limit being the number of offscreen render targets allowed by the GPU. The disadvantage was a drastic performance impact, ironically proportional to the number of lights being rendered.

Since the goal of the method currently being described is high rendering performance, we no longer use ping-pong offscreen rendering to handle multiple lights. Instead, we handle all lights in a single render pass, iterating over them in a for loop in the fragment shader of the final rendering material.
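The structure of that loop can be illustrated on the CPU as below. This is only a sketch: MAX_LIGHTS, the light objects with position and depthMap fields, and the choice to simply count the lights that reach a fragment are all assumptions made for the example, and the actual color accumulation of the final material is not shown.

```javascript
const MAX_LIGHTS = 4; // hypothetical fixed upper bound on the number of lights

// Nearest-texel shadow test for one light. Texel i of the depth map is
// assumed to cover angles around ((i + 0.5) / width * 2 - 1) * pi.
function isLit(f, light) {
  const dx = f.x - light.position.x;
  const dy = f.y - light.position.y;
  const u = Math.atan2(dy, dx) / (2 * Math.PI) + 0.5;
  const n = light.depthMap.length;
  const i = Math.min(n - 1, Math.floor(u * n));
  return Math.hypot(dx, dy) < light.depthMap[i];
}

// Counts how many lights reach the fragment, visiting every light in one
// loop with a fixed upper bound instead of one pass per light.
function countVisibleLights(f, lights) {
  let visible = 0;
  for (let k = 0; k < MAX_LIGHTS; k++) {
    if (k >= lights.length) break; // unused light slots
    if (isLit(f, lights[k])) visible++;
  }
  return visible;
}
```

The loop bound is a constant rather than the actual number of lights, which mirrors the decision to fix the maximum number of lights up front.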

Final thoughts

The improved technique discussed in this article solves a few rendering artifacts present in the original approach, both those inherent to the underlying shadow mapping method and those caused by numerical issues. Furthermore, it provides a significant performance improvement: on a mobile GTX 1060, adding 10 lights made the previous method run at 40 fps, while the new method runs swiftly at 60 fps, a 50% speedup.

You can run the WebGL demo of the improved algorithm here, as well as compare it with the equivalent of the older version here.
