When you're programming games or other 3d applications using OpenGL or DirectX, you often need affine transformation matrices (4x4 matrices and the like) to describe transformations in space. There is not a lot of documentation about this online, and most implementations I've seen rely on some sort of hack. In this tutorial, I'll describe what affine transformations are and, more importantly, how to invert them correctly and efficiently.
It is assumed that the reader knows what matrices are and how to multiply them.
The affine transformation
Imagine you have a ball lying at (1,0) in your coordinate system. You want to move this ball to (0,2) by first rotating it 90 degrees counterclockwise around the origin to (0,1) and then moving it upwards by 1. This transformation is described by a rotation and a translation. The rotation is:
$$
\left[\begin{array}{cc}
0 & -1\\
1 & 0\\
\end{array}\right]
$$
and the translation is (0,1).
To apply this transformation to a vector $\vec{x}$, we do:
$$\vec{x}^\prime = R \vec{x} + \vec{T}$$
where $R$ is a rotation matrix and $\vec{T}$ is a translation vector. This is called an affine transformation.
If you're in 2d space, there is no 2x2 matrix that will do this transformation for all points, because a 2x2 matrix always maps the origin to itself and so can never translate. However, if we go one dimension higher, to a 3x3 matrix, it can be done! That's why OpenGL uses 4x4 matrices to describe 3d transformations, as we'll see later.
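To make the formula $\vec{x}^\prime = R\vec{x} + \vec{T}$ concrete, here is a minimal Python sketch that applies it to the ball example. The function name `apply_affine` is just an illustrative choice, not something from a library.

```python
def apply_affine(R, T, x):
    """Return R x + T for a 2x2 matrix R and 2-vectors T and x."""
    return (
        R[0][0] * x[0] + R[0][1] * x[1] + T[0],
        R[1][0] * x[0] + R[1][1] * x[1] + T[1],
    )

# 90 degree counterclockwise rotation, followed by a translation of (0, 1).
R = [[0, -1],
     [1, 0]]
T = (0, 1)

ball = apply_affine(R, T, (1, 0))
print(ball)  # (0, 2)
```

The rotation and the translation are still two separate pieces of data here, which is exactly what the matrix representation avoids.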
The matrix representation
The best way to explain how to build this matrix is to give the matrix for the example above:
$$
\left[\begin{array}{ccc}
0 & -1 & 0 \\
1 & 0 & 1 \\
0 & 0 & 1 \\
\end{array}\right] $$
As you can see, the upper-left block is the rotation matrix, and the right-hand column, going downwards, holds our translation. Because it's a 3x3 matrix now, we have to apply it to a 3d vector. We take the vector we had, $(1,0)$, and set the third component to 1. See what happens:
$$ \left[\begin{array}{ccc} 0 & -1 & 0 \\ 1 & 0 & 1 \\ 0 & 0 & 1 \\ \end{array}\right]
\begin{pmatrix} 1\\0\\1 \end{pmatrix} = \begin{pmatrix} 0\\2\\1 \end{pmatrix} $$
This gives the result we wanted! If you look closer at the matrix multiplication, you can see why this works. The trick is to set the last component of the vector to 1, so that the translation column simply gets added. In a short-hand notation the matrix and vector look like this:
$$ \left[\begin{array}{c|c} R & T \\ \hline 0 & 1 \\ \end{array}\right] \begin{pmatrix} x\\ \hline 1 \end{pmatrix}=\begin{pmatrix} x^\prime\\ \hline 1 \end{pmatrix}$$
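The block layout above can be sketched in a few lines of Python. The helpers `make_affine` and `mat_vec` are hypothetical names chosen for this illustration; matrices are plain nested lists, one inner list per row.

```python
def make_affine(R, T):
    """Assemble the (n+1)x(n+1) block matrix [[R, T], [0, 1]]."""
    n = len(R)
    rows = [list(R[i]) + [T[i]] for i in range(n)]
    rows.append([0] * n + [1])  # bottom row: zeros, then a single 1
    return rows

def mat_vec(M, v):
    """Multiply matrix M by column vector v."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

# The ball example: rotation block plus translation column.
M = make_affine([[0, -1], [1, 0]], [0, 1])

# Append 1 to the point (1, 0) before multiplying.
result = mat_vec(M, [1, 0, 1])
print(M)       # [[0, -1, 0], [1, 0, 1], [0, 0, 1]]
print(result)  # [0, 2, 1]
```

The trailing 1 in the result means the output can be fed straight into the next affine matrix without any bookkeeping.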
Inverting an affine transformation matrix
Sometimes it is very important to invert an affine transformation, for example to transform back from world space to object space. A naive approach is to just write a function that inverts general 3x3 or 4x4 matrices. That does far more work than necessary, because there are some nice properties we can use.
If we think about what happens when we apply the affine transformation matrix, we first rotate over an angle $\alpha$, and then translate over $(T_x, T_y)$. So the inverse should first translate by $(-T_x, -T_y)$, and then rotate over $-\alpha$. Unfortunately, a matrix of the form above cannot express that order directly: it always applies its rotation block first and adds its translation afterwards. So we have to correct for that by modifying our translation. A derivation:
$$
\begin{array}{l}
\vec{x}^\prime = R\vec{x} + T \\
\vec{x}^\prime - T = R\vec{x} \\
R^{-1}(\vec{x}^\prime - T) = \vec{x}
\end{array}
$$
So to get back the original vector $\vec{x}$, we have a new affine transformation:
$$ \vec{x} = R^{-1}\vec{x}^\prime - (R^{-1}T) $$
What we have to do now is calculate the inverse of the rotation matrix and, using that result, calculate our new translation.
Let's recall what a general rotation matrix in 2d looks like:
$$ \left[\begin{array}{cc} \cos(\alpha) & - \sin(\alpha) \\ \sin(\alpha) & \cos(\alpha) \\ \end{array}\right] $$
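Because a rotation matrix is orthogonal, its inverse is equal to its transpose, so inverting can be done very quickly. Note that this only holds for a pure rotation; if the upper-left block also contains scaling or shear, that block needs a proper inverse instead.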
$$ \left[\begin{array}{cc} \cos(\alpha) & \sin(\alpha) \\ -\sin(\alpha) & \cos(\alpha) \\ \end{array}\right] $$
Now all we have to do is apply this to T and negate the result, to get all the components for our inverse matrix:
$$ \left[\begin{array}{c|c} R^{-1} & -R^{-1}T \\ \hline 0 & 1 \\ \end{array}\right] \begin{pmatrix} x^\prime\\ \hline 1 \end{pmatrix}=\begin{pmatrix} x\\ \hline 1 \end{pmatrix}$$
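The relation $\vec{x} = R^{-1}\vec{x}^\prime - R^{-1}T$ translates into code quite directly. The following is a sketch, assuming the upper-left block is a pure rotation so that its inverse is its transpose; `invert_rigid` and `mat_vec` are hypothetical helper names.

```python
def invert_rigid(M):
    """Invert [[R, T], [0, 1]], assuming R is a pure rotation (R^-1 = R^T)."""
    n = len(M) - 1
    # Transpose of the upper-left n x n rotation block.
    Rt = [[M[j][i] for j in range(n)] for i in range(n)]
    # Original translation column.
    T = [M[i][n] for i in range(n)]
    # New translation: -(R^T T).
    t_inv = [-sum(Rt[i][k] * T[k] for k in range(n)) for i in range(n)]
    inv = [Rt[i] + [t_inv[i]] for i in range(n)]
    inv.append([0] * n + [1])
    return inv

def mat_vec(M, v):
    """Multiply matrix M by column vector v."""
    return [sum(m * c for m, c in zip(row, v)) for row in M]

# The ball example: rotate 90 degrees, then translate by (0, 1).
M = [[0, -1, 0],
     [1, 0, 1],
     [0, 0, 1]]
M_inv = invert_rigid(M)

print(M_inv)                      # [[0, 1, -1], [-1, 0, 0], [0, 0, 1]]
print(mat_vec(M_inv, [0, 2, 1]))  # [1, 0, 1]
```

Applying the inverse to the moved ball at (0,2) brings it back to (1,0), using only a transpose and one small matrix-vector product.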
Putting it together
As we've seen, general 2d affine transformation matrices look like
$$ \left[\begin{array}{ccc} \cos(\alpha) & - \sin(\alpha) & T_x \\ \sin(\alpha) & \cos(\alpha) & T_y \\ 0 & 0 & 1 \\ \end{array}\right] $$
Applying the strategy we've derived above, the inverse is:
$$ \left[\begin{array}{ccc} \cos(\alpha) & \sin(\alpha) & -T_x \cos(\alpha) - T_y \sin(\alpha) \\ -\sin(\alpha) & \cos(\alpha) & -T_y \cos(\alpha) + T_x \sin(\alpha) \\ 0 & 0 & 1 \\ \end{array}\right] $$
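As a sanity check, the closed-form inverse can be multiplied with the original matrix to confirm the product is the identity. This is only a numerical sketch; the function names and the sample values for the angle and translation are arbitrary choices for illustration.

```python
import math

def affine_2d(alpha, tx, ty):
    """Rotation over alpha followed by a translation (tx, ty)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, -s, tx],
            [s, c, ty],
            [0.0, 0.0, 1.0]]

def affine_2d_inverse(alpha, tx, ty):
    """Closed-form inverse: transposed rotation and corrected translation."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, s, -tx * c - ty * s],
            [-s, c, -ty * c + tx * s],
            [0.0, 0.0, 1.0]]

def mat_mul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

alpha, tx, ty = 0.3, 2.0, -5.0  # arbitrary sample values
product = mat_mul(affine_2d_inverse(alpha, tx, ty), affine_2d(alpha, tx, ty))

# Compare against the identity with a small tolerance for rounding.
is_identity = all(
    math.isclose(product[i][j], 1.0 if i == j else 0.0, abs_tol=1e-12)
    for i in range(3) for j in range(3)
)
print(is_identity)  # True
```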
Expanding to 3d is straightforward, since the same rules hold. For example, let's pick a rotation of $\theta$ about the axis $(0,1,0)$ and a translation $(1,2,3)$.
The matrix becomes:
$$ \left[\begin{array}{cccc} \cos(\theta) & 0 & \sin(\theta) &1 \\ 0 & 1 & 0 & 2\\ -\sin(\theta) & 0 & \cos(\theta) & 3 \\ 0 & 0 & 0 & 1 \\ \end{array}\right] $$
And the inverse is:
$$
\left[
\begin{array}{cccc}
\cos (\theta ) & 0 & -\sin (\theta ) & 3 \sin (\theta )-\cos (\theta ) \\
0 & 1 & 0 & -2 \\
\sin (\theta ) & 0 & \cos (\theta ) & -3 \cos (\theta )-\sin (\theta ) \\
0 & 0 & 0 & 1
\end{array}
\right]
$$
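The 3d inverse above can be checked against the transpose shortcut numerically. This sketch assumes the same rotation about $(0,1,0)$ and translation $(1,2,3)$; `rigid_inverse_4x4` is a hypothetical helper and the value of `theta` is arbitrary.

```python
import math

def rigid_inverse_4x4(M):
    """Invert a 4x4 rotation-plus-translation matrix via the transpose."""
    Rt = [[M[j][i] for j in range(3)] for i in range(3)]
    T = [M[i][3] for i in range(3)]
    # Corrected translation: -(R^T T).
    t = [-sum(Rt[i][k] * T[k] for k in range(3)) for i in range(3)]
    return [Rt[0] + [t[0]],
            Rt[1] + [t[1]],
            Rt[2] + [t[2]],
            [0.0, 0.0, 0.0, 1.0]]

theta = 0.7  # arbitrary sample angle
c, s = math.cos(theta), math.sin(theta)

# Rotation about the y axis, translation (1, 2, 3).
M = [[c, 0.0, s, 1.0],
     [0.0, 1.0, 0.0, 2.0],
     [-s, 0.0, c, 3.0],
     [0.0, 0.0, 0.0, 1.0]]

# The inverse as written out in the text.
expected = [[c, 0.0, -s, 3 * s - c],
            [0.0, 1.0, 0.0, -2.0],
            [s, 0.0, c, -3 * c - s],
            [0.0, 0.0, 0.0, 1.0]]

computed = rigid_inverse_4x4(M)
matches = all(
    math.isclose(computed[i][j], expected[i][j], abs_tol=1e-12)
    for i in range(4) for j in range(4)
)
print(matches)  # True
```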
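These 4x4 matrices are the ones that OpenGL expects in functions like glMultMatrixf! Keep in mind that OpenGL reads the 16 values in column-major order, so the translation ends up in elements 12 to 14 of the array.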
In order to use this knowledge in your code, you should write a matrix class that can 1) create a rotation matrix from an angle and axis, 2) transpose a matrix, and 3) be applied to a vector.
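One possible shape for such a class is sketched below. Everything here is an assumption made for illustration: the class name `RigidTransform`, its method names, and the choice of Rodrigues' rotation formula for building the rotation from an axis and angle. It stores the rotation and translation separately and only flattens to 16 column-major values when asked.

```python
import math

class RigidTransform:
    """Hypothetical helper: a 3x3 rotation (nested lists) plus a translation."""

    def __init__(self, rotation, translation):
        self.R = [list(row) for row in rotation]
        self.T = list(translation)

    @classmethod
    def from_axis_angle(cls, axis, angle, translation=(0.0, 0.0, 0.0)):
        # Rodrigues' rotation formula; the axis is normalised first.
        length = math.sqrt(sum(a * a for a in axis))
        x, y, z = (a / length for a in axis)
        c, s = math.cos(angle), math.sin(angle)
        t = 1.0 - c
        R = [
            [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
            [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
            [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
        ]
        return cls(R, translation)

    def apply(self, v):
        """Return R v + T."""
        return [sum(self.R[i][k] * v[k] for k in range(3)) + self.T[i]
                for i in range(3)]

    def inverse(self):
        """Transpose the rotation and use -(R^T T) as the new translation."""
        Rt = [[self.R[j][i] for j in range(3)] for i in range(3)]
        t = [-sum(Rt[i][k] * self.T[k] for k in range(3)) for i in range(3)]
        return RigidTransform(Rt, t)

    def to_column_major(self):
        """Flatten to 16 values, column by column, as glMultMatrixf reads them."""
        flat = []
        for j in range(3):
            flat += [self.R[0][j], self.R[1][j], self.R[2][j], 0.0]
        flat += [self.T[0], self.T[1], self.T[2], 1.0]
        return flat

# Usage: rotate about the y axis, translate, then undo it.
xf = RigidTransform.from_axis_angle((0, 1, 0), math.pi / 2, (1, 2, 3))
p = [1.0, 0.0, 0.0]
q = xf.apply(p)
back = xf.inverse().apply(q)
```

Here `back` should agree with `p` up to floating-point rounding. Keeping the rotation and translation apart makes the inverse cheap; whether that is preferable to storing a full 4x4 array depends on how the rest of your code uses the matrix.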
Conclusion
Hopefully this tutorial has helped you better grasp the concepts of affine transformations. We've seen the definition of these transformations and how to use those to find a shortcut for a quick inversion algorithm. If you have any questions, please let me know!