2D Face Modeling with JavaScript


One common algorithm for Face Morphing requires fitting a polygonal mesh on top of every input face image. This must be done in such a way that each vertex converges to the same physical spot on every face – the eye contour, nose tip, chin, and so on. Having such a mesh aligned to a face can also be useful for other purposes, from caricaturing to beautification, by performing non-rigid modifications to the original face.

In this post, I’ll describe how I implemented a simple face distortion tool using only JavaScript and HTML5.
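The core operation behind this kind of mesh-based distortion is warping each point through its enclosing triangle via barycentric coordinates. Here is a minimal sketch of that step; the plain `{x, y}` point and three-vertex triangle representations are assumptions of this sketch, not the tool's actual data structures.

```javascript
// Barycentric coordinates of point p with respect to triangle (a, b, c).
function barycentric(p, a, b, c) {
  const v0 = { x: b.x - a.x, y: b.y - a.y };
  const v1 = { x: c.x - a.x, y: c.y - a.y };
  const v2 = { x: p.x - a.x, y: p.y - a.y };
  const d00 = v0.x * v0.x + v0.y * v0.y;
  const d01 = v0.x * v1.x + v0.y * v1.y;
  const d11 = v1.x * v1.x + v1.y * v1.y;
  const d20 = v2.x * v0.x + v2.y * v0.y;
  const d21 = v2.x * v1.x + v2.y * v1.y;
  const denom = d00 * d11 - d01 * d01;
  const v = (d11 * d20 - d01 * d21) / denom;
  const w = (d00 * d21 - d01 * d20) / denom;
  return { u: 1 - v - w, v, w };
}

// Warp a point from a source triangle to the corresponding destination
// triangle: same barycentric weights, different vertices.
function warpPoint(p, srcTri, dstTri) {
  const { u, v, w } = barycentric(p, ...srcTri);
  return {
    x: u * dstTri[0].x + v * dstTri[1].x + w * dstTri[2].x,
    y: u * dstTri[0].y + v * dstTri[1].y + w * dstTri[2].y,
  };
}
```

Applying this per triangle of the aligned mesh yields a piecewise-affine warp of the whole face.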

Continue reading

Numerically Stable Homography Computation


Suppose you want to find a homography that maps one set of 2D coordinates onto another arbitrary set of 2D coordinates. This problem has many applications, such as image registration based on keypoint matching. This post describes a few steps to make sure the homography estimation is numerically stable and can be performed with 32-bit floating-point coordinates.
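A standard first step for conditioning (Hartley's isotropic normalization, which may or may not be the exact recipe the post uses) is to translate each point set so its centroid is at the origin and scale it so the mean distance from the origin is √2. The plain `{x, y}` point objects below are an assumption of this sketch.

```javascript
// Normalize a point set: centroid to origin, mean distance sqrt(2).
// Returns the normalized points and the 3x3 similarity transform T
// (row-major) that maps original points to normalized ones.
function normalizePoints(points) {
  const n = points.length;
  const cx = points.reduce((s, p) => s + p.x, 0) / n;
  const cy = points.reduce((s, p) => s + p.y, 0) / n;
  const meanDist =
    points.reduce((s, p) => s + Math.hypot(p.x - cx, p.y - cy), 0) / n;
  const scale = Math.SQRT2 / meanDist;
  return {
    T: [scale, 0, -scale * cx,
        0, scale, -scale * cy,
        0, 0, 1],
    points: points.map(p => ({
      x: scale * (p.x - cx),
      y: scale * (p.y - cy),
    })),
  };
}
```

The homography is then estimated between the two normalized sets, and the similarity transforms are folded back in afterwards, so the linear system never sees large raw pixel coordinates.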

Continue reading

Fast 2D Lights and Shadows with WebGL


Back in 2014, I wrote an article on how to implement 2D omnidirectional shadow mapping using only ThreeJS and standard WebGL resources. That technique consisted of modeling each light object with three perspective cameras, each with a 120° field of view. Similarly to the classical shadow mapping algorithm, the scene depth was projected into these cameras, and later I would render the scene, using the shadow maps to check whether or not each fragment should be lit.
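To make the decomposition concrete, here is a small sketch of how a fragment direction can be binned into one of the three 120° cameras; the choice of camera 0 looking along +x is an assumption of this sketch, not necessarily the article's convention.

```javascript
// Each light is split into three virtual cameras whose view directions
// are 120 degrees apart, together covering the full circle.
const FOV = (2 * Math.PI) / 3; // 120 degrees per camera

// Given the direction from the light to a fragment, pick which of the
// three cameras' depth maps should be sampled. Camera 0 is assumed to
// be centered on the +x axis.
function cameraIndexFor(dx, dy) {
  let angle = Math.atan2(dy, dx); // in (-pi, pi]
  if (angle < 0) angle += 2 * Math.PI;
  // Shift by half a FOV so camera 0 is centered on +x, then bin.
  return Math.floor(((angle + FOV / 2) % (2 * Math.PI)) / FOV);
}
```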

That technique had two major downsides. First, decomposing each light into three cameras required three render calls per light, which is not ideal for performance. Second, the projection planes of these cameras combine into an equilateral triangle centered at the light position; if a light is placed so that its triangle intersects an obstacle, its field of view is limited to the part of the triangle that lies outside the obstacle.

The previous method was far more susceptible to reduction of the light's FOV when the light was positioned close to an obstacle.

If we can rely on the EXT_frag_depth extension (which became a standard feature of WebGL2), then it is possible to improve this technique both in terms of quality and performance. Furthermore, by fixing the maximum number of lights, it is also possible to greatly increase rendering performance by avoiding multiple render passes.

The purpose of this article is to describe both improvements in detail. If you are not familiar with this technique, you may find the previous article a good source for other details.

Continue reading

Introduction to non-linear optimization for Image Registration

Aerial view of Belo Horizonte. The frames were taken from Google Maps.

The problem of image registration consists of aligning two images taken from different perspectives in order to obtain a larger visual representation of some large structure. For instance, one can stitch several overlapping aerial pictures to obtain the map of an entire city. Another example is the creation of panoramic pictures, stitched from shots taken while rotating the camera. The registration problem is also present in the medical field, where imagery must be registered to perform accurate diagnostics and to detect changes over the course of treatment.

With such a broad spectrum of applications, one can expect the registration problem to come with countless particularities. The signals being registered can have an arbitrary number of dimensions, and they might not even contain the same representation of the structure being mapped in the first place. Data from an MRI (representing soft tissues) can be registered with CT scans (more sensitive to high-density objects); one may also attempt to register satellite imagery with cartographic maps.

The objective of this article is to gently introduce how the registration problem can be addressed within a non-linear optimization framework, from the Taylor series to the formulation of the homography Jacobian and Hessian. In this article, we will focus on rigid registration, although the concepts herein can also be applied to the non-rigid case.
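As a taste of that framework, the standard Lucas-Kanade-style derivation (the notation here is a common convention, not necessarily the article's exact one) goes from a sum-of-squared-differences cost, through a first-order Taylor expansion, to a Gauss-Newton update:

```latex
% SSD cost between template T and image I warped by w(x; p),
% where p holds the warp (e.g. homography) parameters:
E(\mathbf{p}) = \sum_{\mathbf{x}} \left[ I(w(\mathbf{x};\mathbf{p})) - T(\mathbf{x}) \right]^2

% First-order Taylor expansion around the current estimate p:
E(\mathbf{p} + \Delta\mathbf{p}) \approx \sum_{\mathbf{x}} \left[ I(w(\mathbf{x};\mathbf{p}))
  + \nabla I \, \frac{\partial w}{\partial \mathbf{p}} \Delta\mathbf{p} - T(\mathbf{x}) \right]^2

% Setting the derivative w.r.t. \Delta p to zero gives the Gauss-Newton
% step, with J = \nabla I \, \partial w / \partial \mathbf{p}:
\Delta\mathbf{p} = -\left( \sum_{\mathbf{x}} J^\top J \right)^{-1}
  \sum_{\mathbf{x}} J^\top \left[ I(w(\mathbf{x};\mathbf{p})) - T(\mathbf{x}) \right]
```

The term \sum J^\top J is the Gauss-Newton approximation of the Hessian, which is where the homography Jacobian and Hessian mentioned above enter the picture.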

Continue reading

Configuring GDB ImageWatch to visualize custom buffer types


gdb-imagewatch is a tool used in conjunction with GDB to visualize in-memory buffers during debug sessions. In the most typical scenario, it is used to visualize the contents of an OpenCV image being processed by the debugged application. To help with the debugging task, it has many features inspired by the Visual Studio ImageWatch plugin, such as pixel value visualization, support for multiple buffer types, and dumping buffers to files, as well as other helpful features such as buffer rotation.

Since the gdb-imagewatch plugin was created with extensibility in mind, it is very easy to adapt it so that it can support as many buffer types as needed by the programmer. In this post, I will show how to adapt the plugin so that it can display the contents of a custom C++ Image class.

Continue reading

2D omnidirectional shadow mapping with Three.Js

A 2D scene is visualized by a camera placed on top of it. Black squares represent obstacles that light cannot pass through. In this scene, we have three light sources.

In this post, I will talk about how I implemented a 2D omnidirectional shadow mapping algorithm (that is, given a 2D world with obstacles defined by polygons and a set of point lights, render this world taking into account the shadows produced by the interaction between the lights and the obstacles). You can check out a live demo here (WebGL support is required to run it).

Although I’ve already implemented 3D shadow maps for spot lights in the past, the present case opens up space for exciting new ways of approaching this problem. As far as I know, Three.Js’ shadow implementations are designed specifically for 3D scenes. Moreover, I haven’t seen any references to omnidirectional shadow maps in this library (which doesn’t mean they aren’t there; they might simply not be documented). Therefore, instead of using the default implementation of shadow maps, I decided to code my own and explore the new possibilities I had in mind, including the use of a 1D depth map.
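The 1D depth map idea can be sketched on the CPU as follows: store, per angular bin around the light, the distance to the nearest obstacle, then compare each fragment's distance against its bin. Reducing obstacles to sample points on their edges is an assumption of this sketch; the actual implementation rasterizes obstacle polygons into the map on the GPU.

```javascript
// Build a 1D omnidirectional depth map for a 2D point light: one
// nearest-obstacle distance per angular bin.
function buildDepthMap(light, obstaclePoints, bins = 256) {
  const map = new Float32Array(bins).fill(Infinity);
  for (const p of obstaclePoints) {
    const angle = Math.atan2(p.y - light.y, p.x - light.x); // (-pi, pi]
    const bin = Math.floor(((angle + Math.PI) / (2 * Math.PI)) * bins) % bins;
    const dist = Math.hypot(p.x - light.x, p.y - light.y);
    if (dist < map[bin]) map[bin] = dist;
  }
  return map;
}

// A fragment is lit if it is closer to the light than the nearest
// obstacle recorded for its angular bin.
function isLit(light, frag, depthMap) {
  const bins = depthMap.length;
  const angle = Math.atan2(frag.y - light.y, frag.x - light.x);
  const bin = Math.floor(((angle + Math.PI) / (2 * Math.PI)) * bins) % bins;
  return Math.hypot(frag.x - light.x, frag.y - light.y) <= depthMap[bin];
}
```

This is exactly the classical shadow map comparison, except the "screen" is a circle of angles instead of a plane, which is why a single 1D texture per light suffices.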

Continue reading

A mathematical overview of UVA 855

Lunch in Grid City, or UVA 855, is a problem created for the Portuguese regional of the ACM-ICPC. The premise of the problem is that you are given a set of coordinates S=\{\mathbf{p}_1,\mathbf{p}_2,\dots,\mathbf{p}_n\}, n \leq 50000, on a grid, and have to find a coordinate \mathbf{c} that minimizes the sum of the Manhattan distances between all points in S and \mathbf{c}. Formally:

\mathbf{c}=\displaystyle{\arg\!\min}_{p} \displaystyle\sum_{i=1}^{n}{|\mathbf{p}_{i_x}-\mathbf{p}_x|+|\mathbf{p}_{i_y}-\mathbf{p}_y|}

A naive approach would be to compute the distances from all points in S to every possible point in the grid and pick the point with the smallest total. This has complexity \mathcal{O}(san), where s and a are the height and width of the grid, respectively. Since 1 \leq s,a \leq 1000, this approach would obviously not work.
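Spelling the naive approach out makes the objective concrete (and shows where the s·a·n factor comes from); the `{x, y}` point representation is just a convenience for this sketch.

```javascript
// Brute force: score every grid cell against every point in S and keep
// the best. O(s * a * n) -- far too slow for the problem's limits, but
// it defines exactly what the efficient solution must compute.
function naiveBestCorner(points, height, width) {
  let best = null;
  let bestCost = Infinity;
  for (let x = 0; x < height; x++) {
    for (let y = 0; y < width; y++) {
      let cost = 0;
      for (const p of points) {
        cost += Math.abs(p.x - x) + Math.abs(p.y - y);
      }
      if (cost < bestCost) {
        bestCost = cost;
        best = { x, y };
      }
    }
  }
  return { best, cost: bestCost };
}
```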

Continue reading