Implementing a GJK Intersection Query
So, a while ago, I needed to write some intersection queries, and a bit of research naturally led me to a GJK-based solution. Further research led me to Casey's excellent explanation of the algorithm (go watch it, it's quite good), along with some interesting insight on how to implement it simply and efficiently. Unfortunately, some of the diagrams in the posts are missing, and it's still a little daunting for a newbie to jump in and make sense of it. Hopefully, I can fix that.
Now, this isn't the "how GJK works" page. That page is over that way. This page is all about the actual implementation, including a number of high-level simplifications which naturally fall out of the implementation.
Also note that this is an intersection query. It's still GJK, but it's somewhat simplified since all we care about is whether the shapes intersect, not precisely where. If you want where, you want a variation of this algorithm which I'm not covering, as I haven't got the code for it (never needed it, never written it).
And I'm only going to deal with 3D, because that's the most useful case. It's fairly trivial to adapt the algorithm to 2D, and I'll make a note of how to do that. Don't ask me about 4-or-more-D, I'm taking a geometric approach to this, and I can't visualize that many dimensions.
Storage
GJK is an iterative algorithm, so we need to track some data from one iteration to the next. I'm going to roughly follow Casey's naming convention, with one small change for clarity. The newest point to be added will always be called \(a\). Points \(b\), \(c\), and \(d\) refer to previously considered and retained points. I'm going to call our search vector \(\vec{v}\) here (Casey calls it \(D\), but that could be confused with the point \(d\)). The last thing is a flag telling us which of the points are valid: \(n\) is the number of points retained in the simplex from the previous step (so if \(n\) is $2$, then \(b\) and \(c\) make up the simplex and \(d\) is unused).
I'm going to wrap the state of the GJK solver in a simple object, which gives the following prototype:
class gjk
{
public:
template< typename SupportMapping >
bool intersect( SupportMapping support );
template< typename SupportMappingA, typename SupportMappingB >
bool intersect( SupportMappingA support_a, SupportMappingB support_b );
private:
vec3 v;
vec3 b, c, d;
unsigned int n;
bool update( const vec3 &a );
};
The first of the intersect
overloads is the one that's going to do all of the work. It takes the composite support mapping (which I called \(S_G(\vec{v})\) in the high-level overview). The second just wraps its arguments into a single support mapping and calls the first.
The intended usage is something along the lines of the following:
shape s1, s2;
frust f = camera.frustum();
gjk g;
if( g.intersect( f.support_mapping(), s1.support_mapping() )
//render s1
;
if( g.intersect( f.support_mapping(), s2.support_mapping() )
//render s2
;
//so on
The Outer Loop
As Casey points out, a lot of important information is encoded in the process of getting our next point. One of the most important bits, which he doesn't stress in his video (it's sort of brought up on the forums) is right here.
So let's take a first stab at GJK's main loop. The algorithm is:
- Start with an empty simplex and an arbitrary search direction \(\vec{v}\).
- Loop:
- Get a new point \(a = S_G(\vec{v})\).
- See if \(a\) is further along \(\vec{v}\) than the origin. Exit returning "no intersection" if it isn't.
- Add \(a\) to the simplex.
- If the simplex now encloses the origin, return "found intersection".
- Find the feature (\(F\)) of the simplex closest to the origin.
- Remove all points from the simplex which are not part of \(F\).
- Based on \(F\), compute a new \(\vec{v}\) which is both perpendicular to \(F\) and oriented towards the origin with respect to \(F\).
Updating the simplex is fairly involved, since there are several cases to consider, so we'll push that part of the loop into another function (update
). The rest comes out to something like the following:
template< typename SupportMapping >
bool gjk::intersect( SupportMapping support )
{
v = vec3c( 1, 0, 0 ); //some arbitrary starting vector
n = 0; //flag our simplex as "empty"
for( ; ; )
{
vec3 a = support( v );
if( dot( a, v ) < 0 )
//no intersection
return false;
if( update( a ) )
//adding the new point resulted in a
//simplex which encloses the origin
return true;
}
}
See it? It's the dot product. If update
is called, it will only be with a new point \(a\) such that its dot with \(\vec{v}\) is positive (well, OK, points aren't vectors, so that's actually \(a-0\)). That little fact is going to keep popping up and it's going to make our lives a lot easier when it's time to write update
.
Updating the Simplex
OK. It's time to write update
.
Its job is to look at the union of the existing simplex (encoded by \(b\), \(c\), \(d\), and \(n\)) and the new point (\(a\)), and then decide what to keep and where to look for the next \(a\). How it does this differs based on the size of the new simplex, so let's look at each case individually.
When \(n=0\):
This is the most trivial case. The simplex was empty in the previous iteration. We add whatever point we got and update \(\vec{v}\) to point from it directly at the origin:
b = a;
v = -a;
n = 1;
//can't enclose anything with just one point
return false;
When \(n=1\):
So, in this case, \(n\) indicated that we had one point left from the last iteration. The addition of the new point \(a\) gives us two, which is a line segment.
void setup()
{
size( 500, 200 );
}
Point a = new Point( 400, 150 ); a.label = "a";
Point b = new Point( 100, 100 ); b.label = "b";
Point o = new Point( 300, 50 ); o.label = "0";
mouseable.push( a, b, o );
Halfspace no1 = new Halfspace( b, a, #808000 );
Halfspace no2 = new Halfspace( a, b, #008080 );
void draw()
{
background( #000000 );
stroke( #404040 );
stroke( #FFFFFF );
ptline( a, b );
stroke( #00FF00 );
ptline( b, o );
if( (a.X-o.X) \* (o.X-b.X) + (a.Y-o.Y) \* (o.Y-b.Y) < 0 )
stroke( #FF0000 );
else
stroke( #0080FF );
ptline( a, o );
ptxline( o, a, 25 );
no1.draw();
no2.draw();
a.draw();
b.draw();
o.draw();
}
So here we've got our old point \(B\), our new point \(A\), and the origin $0$. Our last search direction (the one that yielded \(A\)) is the vector from \(B\) to $0$, and is parallel to the line drawn in green. The other colored line represents the vector from the origin to \(A\), and extends back a little ways past $0$ for reasons which will soon be clear. This line is drawn in blue if this is a possible configuration (that is, if the check in the outer loop would have passed for this configuration), and otherwise in red.
Note the two shaded regions. These denote volumes (remember, this is an edge-on view of a 3D space - the teal and yellow lines represent planes) where we will not find the origin. But why not?
Let's start with the yellow region. To analyze this, we need to go back one iteration. Adding a point gave us a total of two, which means that we previously had one. And when we have one point, our search vector is simply the vector from that point towards the origin (the green line in the diagram). If the origin were in the yellow region, then that means that our support mapping returned a point somewhere in the opposite direction of \(\vec{v}\), which a support mapping cannot, by its very definition, do.
So what about the teal region?
Well, what do we know? We have that green line, which represents \(\vec{v}\). It was computed in the prior step as \(\vec{v}=0-b\). And then we have the line between the origin and \(a\), rendering in either red or blue (depending on how you've dragged things), which I'll call \(\vec{a}\). Its value is \(\vec{a}=a-0\).
Now these vectors meet head to tail at $0$, so the angle between them is actually the angle between the green line and the small extension past $0$ of the blue-or-red line. The thing we know about that angle is that, because \(\vec{v}\cdot\vec{a}\ge0\) (straight from the dot product I highlighted in the outer loop), it's acute (or right). Since it's acute or right, its complementary angle (the one inside the triangle \(AB0\) at $0$) must be obtuse or right. So the other two angles in that triangle, particularly the one at \(a\), must be acute. This fact excludes the yellow region as well - I've included the other proof above since it's a little simpler and ties in to Casey's video.
Go ahead and drag the points around. If the line between \(a\) and $0$ is blue, then \(\vec{v}\cdot\vec{a}\ge0\). If it turns red then it means that the dot product check in the outer loop would have failed, and that the configuration is thus impossible at this stage in the algorithm.
That actually excludes more space than the teal region covers, but that isn't important. What is important is the fact that we've just proven that the origin will never lie in two of the three regions the GJK algorithm cares about. So if we have one point in the simplex, and we add another, there's nothing to check. The result is a simplex containing both points, and we can just update our variables and compute a new \(\vec{v}\).
v = cross_aba( b - a, -a );
c = b;
b = a; //silly, yes, we'll come back to this
n = 2;
//can't possibly contain the origin unless we've
//built a tetrahedron, so just return false
return false;
Where cross_aba
is a simple helper function defined as follows:
//return a vector perpendicular to a and
//parallel to (and in the direction of) b
inline vec3 cross_aba( const vec3 &a, const vec3 &b )
{
return cross( cross( a, b ), a );
}
When \(n=2\):
This case is a little more involved, but here goes:
void setup()
{
size( 500, 200 );
}
Point a = new Point( 400, 100 ); a.label = "a";
Point b = new Point( 100, 175 ); b.label = "b";
Point c = new Point( 100, 25 ); c.label = "c";
Point m0 = new Point( 0, 0 );
Point m1 = new Point( 0, 0 );
Point m2 = new Point( 0, 0 );
mouseable.push( a, b, c );
Halfspace no1 = new Halfspace( c, b, #808000 );
Halfspace no2 = new Halfspace( b, c, #008080 );
Halfspace no3 = new Halfspace( m0, m1, #800000 );
Halfspace no4 = new Halfspace( a, m2, #008000 );
void draw()
{
background( #000000 );
float vx = c.Y - b.Y;
float vy = b.X - c.X;
int w = winding( a, b, c );
if( w > 0 )
{
vx = -vx;
vy = -vy;
}
m0.X = (b.X + c.X) / 2;
m0.Y = (b.Y + c.Y) / 2;
m1.X = m0.X + vx;
m1.Y = m0.Y + vy;
m2.X = a.X - vx;
m2.Y = a.Y - vy;
no1.draw();
no2.draw();
no3.draw();
no4.draw();
stroke( #A0A0A0 );
perpline( a, b, -w );
perpline( a, c, w );
perpline( b, c, -w );
perpline( b, a, w );
perpline( c, a, -w );
perpline( c, b, w );
stroke( #FFFFFF );
ptline( a, b );
ptline( b, c );
ptline( c, a );
a.draw();
b.draw();
c.draw();
stroke( #404040 );
}
Alright. So, what have we got here? Well, a triangle is formed by adding a point (\(a\)) to a line segment (\(b\) and \(c\)). The teal and yellow regions carry over from the line segment case (the fact that we're on a new iteration doesn't change the geometric reality of the previous one).
The red region is an impossibility because our last \(\vec{v}\) points away from it (it has to point towards whatever side \(a\) ended up on, again by the definition of the support mapping), and we picked that vector to point towards the origin.
The green region is also excluded. The dot product check in the outer loop is basically asking, "Is \(a\) as far or farther than the origin along the vector \(\vec{v}\)?" If the origin were anywhere out beyond \(a\), then we would have returned from the loop before trying to update.
(A minor but important note: the colored regions are where we know that the origin will not be found. They don't restrict \(a\)'s location.)
Now compare those regions to the Voronoi regions of the points and edges that make up this triangle. Notice something? The regions nearest to the vertices fall entirely within the exclusion zones, as does the region nearest to the segment \(\overline{bc}\).
This leaves us with only three cases that we need to handle (though one of them - the inside of the triangle - is split into two sub-cases).
vec3 ao = -a; //silly, yes, clarity is important
//compute the vectors parallel to the edges we'll test
vec3 ab = b - a;
vec3 ac = c - a;
//compute the triangle's normal
vec3 abc = cross( ab, ac );
//compute a vector within the plane of the triangle,
//pointing away from the edge ab
vec3 abp = cross( ab, abc );
if( dot( abp, ao ) > 0 )
{
//the origin lies outside the triangle,
//near the edge ab
c = b;
b = a;
v = cross_aba( ab, ao );
return false;
}
//perform a similar test for the edge ac
vec3 acp = cross( abc, ac );
if( dot( acp, ao ) > 0 )
{
b = a;
v = cross_aba( ac, ao );
return false;
}
//if we get here, then the origin must be
//within the triangle, but we care whether
//it is above or below it, so test
if( dot( abc, ao ) > 0 )
{
d = c;
c = b;
b = a;
v = abc;
}
else
{
d = b;
b = a;
v = -abc;
}
n = 3;
//again, need a tetrahedron to enclose the origin
return false;
Note that the points are stored in a different order depending on which side of the triangle the origin is found on. This is so that when we find our next point, the triangle of old points will be wound such that the origin is "above" it, which simplifies things greatly.
When \(n=3\):
Yeah, I'm not going to diagram this one out. When I was learning, I actually ended up building a little model and then folding bits of paper and standing them up beside it in order to visualize everything. (It's almost, but thankfully not quite, too silly to work.)
Anyway, let's think about this.
We've got our incoming simplex, which is a triangle, with its vertices stored in \(b\), \(c\), and \(d\). Each of its edges has been tested during its construction, and we know that the origin is not in any edge's Voronoi region. That leaves us with an infinitely long triangular prism (aligned with the triangle's face normal - remember, those diagrams above were edge-on views of 3D configurations) in which the origin may be found.
But is it even infinitely long? Well, no. We were careful to store our triangle such that it was wound with respect to which side the origin falls on - the origin isn't "below" the triangle, so half of our infinitely tall prism is gone - cut off by the plane the base triangle sits in. Further, we have our new point a, which is "above" not only the triangle but also the origin (by the dot product in the outer loop - this is the same as how we got our green region in the previous case). This leaves us a proper triangular prism.
As was the case when we built the triangle, and for entirely similar reasons, \(a\)'s Voronoi region falls entirely within the excluded regions.
That leaves us with the following cases that we have to handle: three new edges, three new faces, and the volume within the tetrahedron.
The final (and most interesting, since it lets us terminate the algorithm) case is when the point is inside the tetrahedron. The only way we're going to find it there is if we find it behind all three of the new faces, so it's natural to test the three planes first. And if it's not behind the planes, then it's in front of one or two of them (anyone who's seen the old version of this page take note: I got this wrong before - thanks to Riv for pointing it out). When the point is in front of one face, we consider it and its edges. When it's in front of two face, we have to consider them both and all three edges.
vec3 ao = -a;
vec3 ab = b - a;
vec3 ac = c - a;
vec3 ad = d - a;
vec3 abc = cross( ab, ac );
vec3 acd = cross( ac, ad );
vec3 adb = cross( ad, ab );
vec3 tmp;
const int over_abc = 0x1;
const int over_acd = 0x2;
const int over_adb = 0x4;
int plane_tests =
(dot( abc, ao ) > 0 ? over_abc : 0) |
(dot( acd, ao ) > 0 ? over_acd : 0) |
(dot( adb, ao ) > 0 ? over_adb : 0);
switch( plane_tests )
{
case 0:
//behind all three faces, thus inside the tetrahedron - we're done
return true;
case over_abc:
goto check_one_face;
case over_acd:
//rotate ACD into ABC
b = c;
c = d;
ab = ac;
ac = ad;
abc = acd;
goto check_one_face;
case over_adb:
//rotate ADB into ABC
c = b;
b = d;
ac = ab;
ab = ad;
abc = adb;
goto check_one_face;
case over_abc | over_acd:
goto check_two_faces;
case over_acd | over_adb:
//rotate ACD, ADB into ABC, ACD
tmp = b;
b = c;
c = d;
d = tmp;
tmp = ab;
ab = ac;
ac = ad;
ad = tmp;
abc = acd;
acd = adb;
goto check_two_faces;
case over_adb | over_abc:
//rotate ADB, ABC into ABC, ACD
tmp = c;
c = b;
b = d;
d = tmp;
tmp = ac;
ac = ab;
ab = ad;
ad = tmp;
acd = abc;
abc = adb;
goto check_two_faces;
default:
//we've managed to build a horribly degenerate simplex
//this could just as easily be an assert, but since this
//code was originally used to reject definite non-intersections
//as an optimization it conservatively returns true
return true;
}
check_one_face:
if( dot( cross( abc, ac ), ao ) > 0 )
{
//in the region of AC
b = a;
v = cross_aba( ac, ao );
n = 2;
return false;
}
check_one_face_part_2:
if( dot( cross( ab, abc ), ao ) > 0 )
{
//in the region of edge AB
c = b;
b = a;
v = cross_aba( ab, ao );
n = 2;
return false;
}
//in the region of ABC
d = c;
c = b;
b = a;
v = abc;
n = 3;
return false;
check_two_faces:
if( dot( cross( abc, ac ), ao ) > 0 )
{
//the origin is beyond AC from ABC's
//perspective, effectively excluding
//ACD from consideration
//we thus need test only ACD
b = c;
c = d;
ab = ac;
ac = ad;
abc = acd;
goto check_one_face;
}
//at this point we know we're either over
//ABC or over AB - all that's left is the
//second half of the one-face test
goto check_one_face_part_2;
It's a big block of code, but look closely. All we're doing in the first part is testing the planes of the three new faces. These checks are simplified because we took care in the \(n=2\) case to make sure we stored our triangle with a consistent winding (no fiddling around figuring out which side of these plans is the "inside"). Based on these checks, we find ourselves in one of the following cases:
- 1 way for the point to be in the tetrahedron
- 3 ways for the point to be in front of a single face
- 3 ways for the point to be in front of two faces
While there are seven individual cases, there are only three kinds of cases. Each of the subcases within the latter two types is just a rotation of the others, so we pick one to solve, and if we detect a rotated version we just shuffle our points around, effectively rotating the case we have into the case we've solved. That simplifies things immensely.
And then things get even simpler. Once we've rotated into a standard case and jumped to the part of our function that handles that case, we quickly notice that a lot of the subcases are highly similar. For instance, if the point is in front of a single face, then we've basically got a triangle (our \(n=2\) case), except we know the origin won't be under the triangle. If it's in front of two faces, we check the shared edge first, with respect to one of the faces. If the origin is out past that edge, then it could be anywhere in the region of the other face or its edges - but that's just a single face now, so we can treat that like the previous case where the point is only in front of one face's plane. If that test doesn't exclude the face we're testing relative to, then again we're testing a single face, only we don't have to bother retesting that shared edge.
Robustness, Precision, Performance
This code is a boolean intersection query. It doesn't tell you where the convex objects intersect. Further, it's intended to be used as a rendering optimization (is this object in view? can I skip rendering that shadow map? etc.), so it isn't even a hundred percent precise (that's the code you've been seeing in particular that's not precise, not the logic of the algorithm itself). It's designed this way for performance reasons, so let's discuss exactly what these limitations are and how to deal with them (because if you run it as is on any nontrivial inputs it will eventually get stuck in an infinite loop).
First off, since we don't care about where the intersection is, we can avoid normalizing a lot of our intermediate values since we're only interested in doing a series of plane side tests. That said, depending on the scale of your world, cross_aba
can produce some monstrously large vectors. Your support mapping functions either need to deal with this, or you can normalize \(\vec{v}\) at the end of each update and not worry about it beyond that.
Second, these are floating point numbers we're working with. You can get into situations where rounding errors leave you in an infinite loop as your support mapping bounces back and forth between a small number of nearby points. You could write a bunch of code to detect this, maybe sprinkle in some epsilons (which are not just 0.00001), but in practice it's such a vanishingly rare case that - for the simplified motivating case we're working with here - it's easier to just cap the total number of iterations through the outer loop and conservatively declare that there's an intersection if you blow past that limit (which it almost certainly is if your support mapping is reasonably well behaved). Obviously this won't work if you're doing physics calculations, but that necessitates some additional work to extract contact points anyway, and the added work of getting this 100% right probably folds together nicely with that.
Finally, take a look back over the three cases up above. The \(n=0\) case always exits with \(n\) set to $1$, and the \(n=1\) case always exits with \(n\) equal to $2$. And no case after that ever sets \(n\) to anything less than $2$. So the \(n=1, 2\) cases can be collapsed into the setup code above the loop, leaving only two cases (\(n=2\), \(n=3\)) for the actual update
function to deal with.
Putting that all together and cleaning up a bit, we get:
template< typename SupportMapping >
bool gjk::intersect( SupportMapping support )
{
v = vec3c( 1, 0, 0 ); //some arbitrary starting vector
c = support( v );
if( dot( c, v ) < 0 )
return false;
v = -c;
b = support( v );
if( dot( b, v ) < 0 )
return false;
v = cross_aba( c - b, -b );
n = 2;
for( int iterations = 0; iterations < 32; iterations++ )
{
vec3 a = support( v );
if( dot( a, v ) < 0 )
return false;
if( update( a ) )
return true;
}
//out of iterations, be conservative
return true;
}
GJK in 2D
The algorithm is fairly trivial to reduce to 2D. For one thing, the \(n=3\) case is entirely gone, since a triangle is enough to enclose the origin in 2D. The cross
and cross_aba
calls need to be replaced with the appropriate "find me a vector perpendicular to x" formula, and the end of the \(n=2\) case (where we check if we're above/below the triangle) simply becomes return true
.