<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="6.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Moll, M.</style></author><author><style face="normal" font="default" size="100%">Lydia E. Kavraki</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Matching of Structural Motifs Using Hashing on Residue Labels and Geometric Filtering for Protein Function Prediction</style></title><secondary-title><style face="normal" font="default" size="100%">The Seventh Annual International Conference on Computational Systems Bioinformatics (CSB2008)</style></secondary-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">functional annotation of proteins</style></keyword><keyword><style  face="normal" font="default" size="100%">kavrakilab</style></keyword><keyword><style  face="normal" font="default" size="100%">protein function</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2008</style></year></dates><urls><web-urls><url><style face="normal" font="default" size="100%">http://csb2008.org/csb2008papers/077Moll.pdf</style></url></web-urls></urls><pages><style face="normal" font="default" size="100%">157-168</style></pages><abstract><style face="normal" font="default" size="100%">There is an increasing number of proteins with known structure but unknown function. Determining their function would have a significant impact on understanding diseases and designing new therapeutics. However, experimental protein function determination is expensive and very time-consuming. Computational methods can facilitate function determination by identifying proteins that have high structural and chemical similarity. Our focus is on methods that determine binding site similarity.
Although several such methods exist, it still remains a challenging problem to quickly find all functionally-related matches for structural motifs in large data sets with high specificity. In this context, a structural motif is a set of 3D points annotated with physicochemical information that characterize a molecular function. We propose a new method called LabelHash that creates hash tables of $n$-tuples of residues for a set of targets. Using these hash tables, we can quickly look up partial matches to a motif and expand those matches to complete matches. We show that by applying only very mild geometric constraints we can find statistically significant matches with extremely high specificity in very large data sets and for very general structural motifs. We demonstrate that our method requires a reasonable amount of storage when employing a simple geometric filter and further improves on the specificity of our previous work while maintaining very high sensitivity. Our algorithm is evaluated on 20 homolog classes and a non-redundant version of the Protein Data Bank as our background data set. We use cluster analysis to analyze why certain classes of homologs are more difficult to classify than others. The LabelHash algorithm is implemented on a web server at http://kavrakilab.org/labelhash/.</style></abstract></record></records></xml>