RMSD algorithm

By malcolm McLean Homepage
This is code to calculate the root mean squared difference between two meshes / proteins.

Once the meshes are aligned the problem is trivial. Simply calculate the squared distance between each point, sum, divide by the number of vertices, and take the square root. However there are two problems. The first, which I do not deal with here, is producing an alignment if two proteins do not have identical sequences. Each point must be paired with exactly one in the reference structure if we are to calculate RMSD. The second problem is to find the matrix that will superimpose two meshes with the most accuracy.

The method was developed by Kabsch in 1976. His paper is rather difficult to follow and I have based the code on a simpler explanation by Lin (2004).

The algorithm works like this:
A rigid body transformation consists of a rotation matrix followed by a translation. The translation is easy. We simply take the centroids of the two meshes, and translate both so that the centroid is at the origin.

The centroid is defined as mean x, mean y, mean z.

The tricky bit if defining the rotation matrix. After translating both our meshes to the centroid, we construct a temporary matrix A.

Aij = Σ V1ki * V2kj
i, j 1 to 3 (the axes)
k 1 to N (the points)

We are stepping through both arrays of points, and multiplying the components together, summing to form the matrix A.

Now the formula is

(AtA)1/2 A-1

Or in words, multiply the transpose of A by A, take the square root, and multiply by the inverse of A.

A word of warning. If the points lie in a plane, which is likely for a trivial test set, the matrix will not have an inverse, and so the procedure breaks down. This can be fixed by adding a dummy point by taking the cross-product of three random points. You may also need special code for the cases of one point, two point, and three point (must be planar) meshes.

To calculate the square root of a matrix we use Denman-Beavers iteration.

Y0 = matrix
Z0 = identity
do
{
Yi+1 = 0.5 * Yi + Zi-1
Zi+1 = 0.5 * Zi + Yi-1
}
while Yi2 != matrix

The matrix inverse is calculated in the complicated way, using decomposition. Simple inversion by taking sub-matrices and determinants failed, because for some reason the algorithm seems to produce matrices with determiants very clse to zero.

To use the function, call


   /*
     calculate rmsd between two structures
     Params: v1 - first set of points
             v2 - second set of points
             N - number of points
             mtx - return for transform matrix used to align structures
      Returns: rmsd score
      Notes: mtx can be null. Transform will be rigid. Inputs must
             be previously aligned for sequence alignment
    */
    double rmsd(float *v1, float *v2, int N, float *mtx)
   

Source rmsd.c

References

Kabsch, Wolfgang, (1976) A solution of the best rotation to relate two sets of vectors Acta Crystallographica 32:922
Lin Ying-Hung, Chang Hsun-Chang, Lin Yaw-Ling (2004) A Study on Tools and Algorithms for 3-D Protein Structures Alignment and Comparison International Computer Symposium, Dec 15-17, Taipei, Taiwan
Last modified: Tue Jul 3 15:02:31 BST 2007