Suppose you want to find a homography transform that maps a set of 2D coordinates to another arbitrary set of 2D coordinates. This problem has many applications, such as image registration based on keypoint matching. This post will describe a few steps to make sure the homography estimation is numerically stable and can be performed using 32-bit floating-point coordinates.
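One standard stabilization step (a common choice, not necessarily the exact steps this post settles on) is Hartley's point normalization: translate each point set so its centroid sits at the origin, scale it so the mean distance from the origin is √2, run the DLT in the normalized space, and undo the normalization afterwards. A minimal sketch with NumPy:

```python
import numpy as np

def normalize_points(pts):
    """Similarity transform (3x3, homogeneous) that moves the centroid
    to the origin and scales the mean distance from it to sqrt(2)."""
    centroid = pts.mean(axis=0)
    scale = np.sqrt(2) / np.linalg.norm(pts - centroid, axis=1).mean()
    return np.array([[scale, 0.0, -scale * centroid[0]],
                     [0.0, scale, -scale * centroid[1]],
                     [0.0, 0.0, 1.0]])

def estimate_homography(src, dst):
    """DLT homography estimation with Hartley normalization."""
    T_src, T_dst = normalize_points(src), normalize_points(dst)
    src_h = np.c_[src, np.ones(len(src))] @ T_src.T
    dst_h = np.c_[dst, np.ones(len(dst))] @ T_dst.T
    # Each correspondence contributes two rows to the DLT system A h = 0
    A = []
    for (x, y, _), (u, v, _) in zip(src_h, dst_h):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value
    _, _, Vt = np.linalg.svd(np.array(A))
    Hn = Vt[-1].reshape(3, 3)
    # Undo the normalization: H = T_dst^-1 * Hn * T_src
    H = np.linalg.inv(T_dst) @ Hn @ T_src
    return H / H[2, 2]
```

Without the normalization step, the entries of A mix terms of wildly different magnitudes (pixel coordinates and their products), which is exactly what makes the SVD ill-conditioned in 32-bit floats.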
Back in 2014, I wrote an article on how to implement 2D omnidirectional shadow mapping using only Three.js and standard WebGL resources. That technique consisted of modeling each light object with three perspective cameras with a field of view of 120° each. As in the classical shadow mapping algorithm, the scene depth was projected into these cameras, and later I would render the scene, using the shadow maps to check whether or not each fragment should be lit.
That technique had two major downsides. Firstly, the decomposition of each light into three cameras meant three render calls were needed per light, which is not ideal in terms of performance. Secondly, the projection planes of these cameras combine into one equilateral triangle centered at the light position. Therefore, if a light is positioned so that its triangle intersects an obstacle, its field of view will be limited to the part of the triangle that lies outside the obstacle.
If we can rely on the EXT_frag_depth extension (which became a standard feature in WebGL2), then it is possible to improve this technique both in terms of quality and performance. Furthermore, by fixing the maximum number of lights, it is also possible to greatly increase rendering performance by avoiding multiple render passes.
The purpose of this article is to describe both improvements in detail. If you are not familiar with this technique, you may find the previous article a good source for other details.
The problem of image registration consists of aligning two images taken from different perspectives in order to obtain a larger visual representation of some large structure. For instance, one can stitch several overlapping aerial pictures to obtain the map of an entire city. Another example is the creation of panoramic pictures, which consists of stitching pictures taken by rotating the camera. The registration problem is also present in the medical field, where imagery must be registered in order to accurately perform diagnostics as well as detect changes during several types of treatment.
With such a broad spectrum of applications, one can expect the registration problem to be presented with countless particularities. The signals being registered can have an arbitrary number of dimensions; they might not even contain the same representation of the structure being mapped in the first place. The data from an MRI (representing soft tissues) can be registered with CT scans (more sensitive to high-density objects); one may also attempt to register satellite imagery with cartographic maps.
The objective of this article is to gently introduce how the registration problem can be addressed with a non-linear optimization framework, from the Taylor series up to the formulation of the homography Jacobian and Hessian. In this article, we will focus on rigid registration, although the concepts herein can also be applied to the non-rigid case.
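To give a taste of the framework, here is the classic intensity-based formulation (a common choice, assumed here for illustration rather than taken from the post): find warp parameters that minimize the squared intensity difference between a fixed image $I_f$ and a warped moving image $I_m$.

```latex
% Objective: warp parameters p of W minimizing the intensity error
E(\mathbf{p}) = \sum_{\mathbf{x}} \left[ I_m\!\big(W(\mathbf{x};\mathbf{p})\big) - I_f(\mathbf{x}) \right]^2

% First-order Taylor expansion around the current estimate p:
I_m\!\big(W(\mathbf{x};\mathbf{p}+\Delta\mathbf{p})\big) \approx
I_m\!\big(W(\mathbf{x};\mathbf{p})\big)
+ \underbrace{\nabla I_m \,\frac{\partial W}{\partial \mathbf{p}}}_{J(\mathbf{x})}\,\Delta\mathbf{p}

% Gauss-Newton step, with J^T J playing the role of the Hessian:
\Delta\mathbf{p} = -\Big(\sum_{\mathbf{x}} J^{\top} J\Big)^{-1}
\sum_{\mathbf{x}} J^{\top} r(\mathbf{x}),
\qquad r(\mathbf{x}) = I_m\!\big(W(\mathbf{x};\mathbf{p})\big) - I_f(\mathbf{x})
```

When $W$ is a homography, $\partial W / \partial \mathbf{p}$ is exactly the homography Jacobian mentioned above, and $\sum J^{\top} J$ is the Gauss-Newton approximation of its Hessian.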
gdb-imagewatch is a tool that can be used in conjunction with GDB to visualize in-memory buffers during debug sessions. In the most typical scenario, it’s used to visualize the contents of an OpenCV image being processed by the debugged application. To help with the debugging task, it has many features inspired by the Visual Studio ImageWatch plugin, such as pixel value visualization, support for multiple buffer types and dumping buffers to files, as well as other helpful features such as buffer rotation.
Since the gdb-imagewatch plugin was created with extensibility in mind, it is very easy to adapt it so that it can support as many buffer types as needed by the programmer. In this post, I will show how to adapt the plugin so that it can display the contents of a custom C++ Image class.
In this post, I will talk about how I implemented a 2D omnidirectional shadow mapping algorithm (that is, given a 2D world with obstacles defined by polygons and a set of point lights, render this world by taking into account the shadows produced by the interaction between the lights and the obstacles). You can check out a live demo here (WebGL support is required to run it).
Although I’ve already implemented 3D shadow maps for spot lights in the past, the present case opens up space for exciting new ways of approaching this problem. As far as I know, Three.js’s shadow implementations are specifically designed for 3D scenes. Moreover, I haven’t seen any references to omnidirectional shadow maps in this library (which doesn’t mean it’s not there; it might just not be documented). Therefore, instead of using the default implementation of shadow maps, I decided to code my own and explore the new possibilities I had in mind, including the use of a 1D depth map.
Lunch in Grid City, or UVA 855, is a problem created for the Portuguese regional of the ACM-ICPC. The premise of the problem is that you are given a set P of n coordinates in a grid, and have to find a coordinate (x, y) that minimizes the sum of the Manhattan distances between (x, y) and all points in P. Formally:

(x, y) = argmin over all grid points of Σ(xᵢ, yᵢ) ∈ P (|x − xᵢ| + |y − yᵢ|)

A naive approach would be to compute the distances of all points in P to all possible points in the grid, and pick the point with the smallest score. This has complexity O(n·s·a), where s and a are the height and width of the grid, respectively. Since the grid and the point set can both be large, this approach would obviously not work.
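The key observation is that the objective decomposes into two independent one-dimensional problems, Σ|x − xᵢ| and Σ|y − yᵢ|, and each of those is minimized at the median of the corresponding coordinates. That brings the cost down to a couple of sorts. A sketch (taking the lower median on ties, which is one reasonable tie-breaking rule; the judge's exact rule should be checked against the problem statement):

```python
def best_meeting_point(points):
    """Return an (x, y) minimizing the sum of Manhattan distances to
    the given points.

    The objective splits into sum(|x - xi|) + sum(|y - yi|); each
    1-D term is minimized at a median of that axis, so the answer
    is the coordinate-wise median. Cost: O(n log n) for the sorts.
    """
    xs = sorted(p[0] for p in points)
    ys = sorted(p[1] for p in points)
    mid = (len(points) - 1) // 2  # lower median on even-sized input
    return xs[mid], ys[mid]
```

For example, for the points (1, 1), (3, 3) and (2, 5), the x-median is 2 and the y-median is 3, so the best meeting point is (2, 3).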
When I first studied the principles behind skeletal animations, I decided to implement them by interpolating between poses on the CPU and performing the skinning in the vertex shader for performance reasons. I was thrilled to see some computational weight being transferred from CPU to GPU, where it could be handled with heavy parallelization.
As performance was my main concern, I also tried to squeeze the most out of my CPU in the pose interpolation process: avoiding conditional logic (which, in turn, avoids branch mispredictions), interpolating quaternions with LERP instead of SLERP (the animation difference is not noticeable if you have a large number of samples), normalizing vectors with the invsqrt procedure, and using function pointers to discriminate the bones that require interpolation from the ones that do not. Eventually, I realized that last idea was not such a good one: although a direct procedure call may be resolved in the fetch pipeline stage, an indirect function call depends on an address that might not be available in cache, which can stall the pipeline for a (very slow) memory fetch.
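The LERP-instead-of-SLERP trick amounts to normalized linear interpolation (NLERP): lerp the quaternion components and renormalize, skipping SLERP's trigonometry entirely. A sketch of the idea in Python (the real implementation would live in C++ hot loops, but the math is the same):

```python
import numpy as np

def quat_lerp(q0, q1, t):
    """Normalized linear interpolation (NLERP) between unit quaternions.

    Cheaper than SLERP: no trig calls, just a lerp followed by a
    renormalization. For densely sampled animations the angular
    deviation from SLERP is negligible.
    """
    q0, q1 = np.asarray(q0, float), np.asarray(q1, float)
    if np.dot(q0, q1) < 0.0:
        q1 = -q1  # flip to interpolate along the shorter arc
    q = (1.0 - t) * q0 + t * q1
    return q / np.linalg.norm(q)
```

NLERP traverses the arc at a non-constant angular speed, but it agrees exactly with SLERP at t = 0, 0.5 and 1, which is why the difference disappears once pose samples are close together.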
When I wrote my first implementation, I found out that there was a myriad of possible interpolation methods, which would have been chosen by the designer in the modelling software: linear, constant, Bézier curves and so on. It didn’t sound like a good idea to implement all those methods, as this would clutter my code with features that could fit very well in a large game engine, but not in my time-constrained demo. Also, implementing some of these techniques would defeat my purpose of having a blazing fast skeletal animation demo.