SAD—a normalized structural alignment database: improving sequence–structure alignments
Marsden B., Abagyan R.
Abstract

Motivation: We present a structural alignment database specifically intended for deriving and optimizing sequence–structure alignment algorithms for homology modeling. We have taken care to ensure that fold-space is properly sampled, that the structures involved in the alignments are of high resolution (better than 2.5 Å) and that the alignments are accurate and reliable.

Results: Alignments have been taken from the HOMSTRAD, BAliBASE and SCOP-based Gerstein databases, along with alignments generated by a global structural alignment method described here. To discriminate between equivalent alignments from these different sources, we have developed a novel scoring function, the Contact Alignment Quality score, which evaluates trial alignments by their statistical significance combined with their ability to reproduce conserved three-dimensional residue contacts. The resulting non-redundant, unbiased database contains 1927 alignments from across fold-space, with high-resolution structures and a wide range of sequence identities.

Availability: The database can be queried interactively over the web at http://abagyan.scripps.edu/lab/web/sad/show.cgi or by using MySQL, and is also available for download over the web.
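
The abstract does not define the Contact Alignment Quality score, so the following is only a minimal sketch of the contact-conservation idea it mentions: given two structures and a trial alignment, count how many residue contacts in the first structure map onto contacts in the second. The function names, the 8 Å distance cutoff, the minimum sequence separation of 3, and the use of a single coordinate per residue are all assumptions made for illustration. The statistical-significance component of the score is not modeled here.

```python
import math


def contacts(coords, cutoff=8.0, min_sep=3):
    """Return index pairs (i, j), i < j, whose points lie within `cutoff`.

    `coords` holds one (x, y, z) point per residue (e.g. a C-alpha atom).
    The cutoff and minimum sequence separation are illustrative assumptions.
    """
    pairs = set()
    for i in range(len(coords)):
        for j in range(i + min_sep, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                pairs.add((i, j))
    return pairs


def conserved_contact_fraction(coords_a, coords_b, aligned_pairs, cutoff=8.0):
    """Fraction of contacts in A that map onto contacts in B under an alignment.

    `aligned_pairs` is a list of (index_in_a, index_in_b) tuples. Only contacts
    in A whose two residues are both aligned are considered.
    """
    a_to_b = dict(aligned_pairs)
    contacts_a = contacts(coords_a, cutoff)
    contacts_b = contacts(coords_b, cutoff)
    mapped = [(i, j) for (i, j) in contacts_a if i in a_to_b and j in a_to_b]
    if not mapped:
        return 0.0
    conserved = sum(
        1
        for i, j in mapped
        if tuple(sorted((a_to_b[i], a_to_b[j]))) in contacts_b
    )
    return conserved / len(mapped)


# Toy coordinates (hypothetical, not taken from the database).
folded = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (4, 4, 0), (0, 4, 0)]
extended = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (12, 0, 0), (16, 0, 0)]
identity = [(i, i) for i in range(5)]

same_fold = conserved_contact_fraction(folded, folded, identity)
different_fold = conserved_contact_fraction(folded, extended, identity)
```

In this toy case an alignment of a structure onto itself conserves every contact, whereas aligning the folded chain onto the extended one conserves none. A scheme along these lines could be used to rank competing alignments of the same pair of structures, which is the role the abstract assigns to the contact term.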