Hydrophobic cluster database


Author: Pedro J. Silva
Reference: Silva, P.J. "Assessing the reliability of sequence similarities detected through hydrophobic sequence analysis", Proteins: Structure, Function and Bioinformatics, 70, 1588-1594

Downloads and Supporting Information


  • Class a: All alpha proteins
  • Class b: All beta proteins
  • Class c: Alpha and beta proteins (a/b)
  • Class d: Alpha and beta proteins (a+b)
  • Class e: Multi-domain proteins (alpha and beta)
  • Class f: Membrane and cell surface proteins and peptides
  • Class g: Small proteins
  • Class h: Coiled coil proteins
  • Class i: Low resolution protein structures
  • Class j: Peptides
  • Class k: Designed proteins
  • SCOP folds with >5 compatible HCA patterns

    Synopsis

    Gaboriaud et al. have shown that sequences with very similar distribution patterns of the hydrophobic set of residues V, I, L, F, Y, W, M (detected in a two-dimensional helical representation of the protein sequence) are most often structural homologs, even when the overall sequence identity is as low as 7 %. This representation is obtained by writing the protein sequence on a classical alpha-helix (3.6 amino acids per turn) smoothed on a cylinder. After five turns, residues i and i + 18 have similar positions parallel to the axis of the cylinder. To make this 3D representation easier to handle, the cylinder is then cut parallel to its axis and unrolled. Because unrolling the cylinder widely separates some adjacent amino acids, the representation is duplicated, which makes the sequence easier to follow and gives a better impression of the environment of each amino acid.
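    The geometry of this representation can be sketched in a few lines of Python. This is a minimal illustration and not the code used to build the database: the function names helical_net and duplicated_net are hypothetical, the residue index stands in for the position along the cylinder axis, and only the 3.6 residues per turn (100 degrees per residue) and the hydrophobic set come from the text above.

```python
# Hydrophobic residue set named in the synopsis.
HYDROPHOBIC = set("VILFYWM")

# 360 degrees / 3.6 residues per turn = 100 degrees per residue.
DEGREES_PER_RESIDUE = 100


def helical_net(sequence):
    """Return (index, residue, angle_in_degrees, is_hydrophobic) per residue.

    The index is used as a proxy for the axial position on the cylinder,
    since the rise per residue is constant on an ideal helix.
    """
    net = []
    for i, aa in enumerate(sequence.upper()):
        angle = (i * DEGREES_PER_RESIDUE) % 360
        net.append((i, aa, angle, aa in HYDROPHOBIC))
    return net


def duplicated_net(sequence):
    """Unrolled cylinder drawn twice: every residue at angle and angle + 360."""
    base = helical_net(sequence)
    return [(i, aa, angle + shift, hyd)
            for shift in (0, 360)
            for (i, aa, angle, hyd) in base]
```

    Since 18 x 100 = 1800 = 5 x 360, residues i and i + 18 receive the same angle, which is the alignment after five turns described above.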

    Clusters of these hydrophobic amino acids are good markers of regular secondary structures and have been extensively used in the detection of similar folds or similar motifs between sequences showing very limited sequence relatedness (reviewed by Callebaut et al.).
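    As a rough illustration of what a hydrophobic cluster pattern looks like, the sketch below encodes a sequence as a binary string (1 for V, I, L, F, Y, W, M; 0 otherwise) and splits it into clusters. The splitting rule is an assumption borrowed from the classical HCA convention (a proline, or a run of at least four non-hydrophobic residues, separates two clusters); it is a one-dimensional simplification and not necessarily the procedure used to build this database. The names binary_pattern, hydrophobic_clusters and min_gap are hypothetical.

```python
HYDROPHOBIC = set("VILFYWM")


def binary_pattern(sequence):
    """Encode a sequence as '1' (hydrophobic) and '0' (any other residue)."""
    return "".join("1" if aa in HYDROPHOBIC else "0" for aa in sequence.upper())


def hydrophobic_clusters(sequence, min_gap=4):
    """Return a list of (start_index, pattern) tuples, one per cluster.

    Assumed rule (classical HCA convention, not taken from this page):
    a cluster ends at a proline or after min_gap consecutive
    non-hydrophobic residues. Each pattern starts and ends with '1'.
    """
    seq = sequence.upper()
    clusters = []
    start = None   # index of the first hydrophobic residue of the open cluster
    last = None    # index of the most recent hydrophobic residue
    gap = 0        # non-hydrophobic residues seen since `last`
    for i, aa in enumerate(seq):
        if aa in HYDROPHOBIC:
            if start is None:
                start = i
            last = i
            gap = 0
        elif start is not None:
            gap += 1
            if aa == "P" or gap >= min_gap:
                clusters.append((start, binary_pattern(seq[start:last + 1])))
                start = None
                gap = 0
    if start is not None:
        clusters.append((start, binary_pattern(seq[start:last + 1])))
    return clusters
```

    For example, hydrophobic_clusters("AVLAAIGGGGGFWPM") returns [(1, "11001"), (11, "11"), (14, "1")]: the run of glycines closes the first cluster and the proline closes the second.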
    Using a new methodology, we have shown that whereas most protein structural folds defined in the SCOP classification of protein structure (release 1.65) are very homogeneous in hydrophobic cluster composition, a large number of folds are nonetheless compatible with a wide variety of hydrophobic patterns. We have gathered in this HCA database every distinct hydrophobic cluster pattern present in each fold of that SCOP release. We hope that this information will be helpful in:
  • the design of synthetic proteins with structural homology to any given fold
  • the recognition of protein folding cores
  • the identification of suitable templates for homology modelling of very divergent sequences


    © 2007 Pedro J. Silva