This work shows how to decrease the complexity of modeling flexibility in
proteins by reducing the number of dimensions necessary to model important
macromolecular motions such as the induced fit process. Induced fit occurs during the
binding of a protein to other proteins, nucleic acids or small molecules (ligands) and is
a critical part of protein function. It is now widely accepted that conformational
changes of proteins can affect their ability to bind other molecules and that any progress
in modeling protein motion and flexibility will contribute to the understanding of key
biological functions. However, modeling protein flexibility has proven a very difficult
task. Experimental laboratory methods such as X-ray crystallography provide only
limited information about protein motion, while computational methods such as molecular dynamics are too
slow for routine use with large systems. In this work we show how to use Principal
Component Analysis (PCA), a dimensionality reduction technique, to transform the
original high-dimensional representation of protein motion into a lower-dimensional
representation that captures the dominant modes of protein motion. For a medium-sized
protein this corresponds to reducing a problem with a few thousand degrees of
freedom to one with fewer than fifty. Although there is inevitably some loss in accuracy,
we show that we can approximate conformations that have been observed in laboratory
experiments, starting from different initial conformations and working in a drastically
reduced search space. The accuracy of the conformations approximated with this
method is comparable to the tolerance of current rigid protein docking programs
to structural variations in receptor models.
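
The dimensionality reduction described above can be illustrated with a minimal sketch. It is not the implementation used in this work: it assumes that each conformation is a flattened vector of Cartesian atom coordinates already superimposed on a common reference frame, it uses synthetic data in place of real protein structures, and the function names (pca_modes, project, reconstruct, rmsd) are hypothetical. It computes the principal components with NumPy's singular value decomposition, keeps the first few, and measures how closely a conformation can be rebuilt from its reduced coordinates.

```python
import numpy as np


def pca_modes(conformations, n_modes):
    """Return the mean structure, the top n_modes principal components,
    and the variance along every component.

    conformations: array of shape (n_frames, 3 * n_atoms), assumed to be
    pre-aligned so that rigid-body motion does not dominate the variance.
    """
    X = np.asarray(conformations, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    # Rows of vt are orthonormal directions ordered by decreasing variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variances = s ** 2 / (X.shape[0] - 1)
    return mean, vt[:n_modes], variances


def project(conformation, mean, modes):
    """Coordinates of a conformation in the reduced space."""
    return (conformation - mean) @ modes.T


def reconstruct(coords, mean, modes):
    """Approximate full-dimensional conformation from reduced coordinates."""
    return mean + coords @ modes


def rmsd(a, b):
    """Root-mean-square deviation over atoms between two conformations."""
    diff = (a - b).reshape(-1, 3)
    return np.sqrt((diff ** 2).sum(axis=1).mean())


# Synthetic example (an assumption, not real data): 50 atoms whose motion
# is driven by 2 hidden collective modes plus small per-coordinate noise.
rng = np.random.default_rng(0)
n_atoms, n_frames, n_hidden = 50, 200, 2
base = rng.normal(size=3 * n_atoms)
hidden = rng.normal(size=(n_hidden, 3 * n_atoms))
amplitudes = rng.normal(size=(n_frames, n_hidden))
frames = base + amplitudes @ hidden + 0.01 * rng.normal(size=(n_frames, 3 * n_atoms))

mean, modes, variances = pca_modes(frames, n_modes=2)
explained = variances[:2].sum() / variances.sum()

target = frames[0]
reduced = project(target, mean, modes)        # 2 numbers instead of 150
approx = reconstruct(reduced, mean, modes)
error = rmsd(target, approx)
```

In this toy setting two reduced coordinates replace 150 Cartesian coordinates, and the reconstruction error reflects only the noise that the discarded components would have captured. How many components a real protein needs depends on the system and on the ensemble of conformations used to build the model.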