
		HCA_analyze and HCA_analyze_multiple_aligns

			(c) 2007 Pedro J. Silva





HCA_analyze and HCA_analyze_multiple_aligns are used in a very similar way:

SYNTAX:		HCA_analyze input_file.pir output_file.txt
SYNTAX:		HCA_analyze_multiple_aligns input_file.pir output_file.txt


The programs must be run from the DOS command line, which is opened in Windows from the Start Menu:

 Windows>Start>Run> cmd


or from a .bat file. This means that the programs cannot be started by double-clicking them! Sample .bat files are included in the distributions.
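
As an alternative to a .bat file, the same command line can be launched from a script. The following is a minimal Python sketch and is not part of the distribution: the helper names hca_command and run_hca are hypothetical, and the program is assumed to be in the current directory or on the PATH.

```python
import subprocess

# The two program names described in this document.
PROGRAMS = ("HCA_analyze", "HCA_analyze_multiple_aligns")


def hca_command(program, input_pir, output_txt):
    """Build the command line: program input_file.pir output_file.txt."""
    if program not in PROGRAMS:
        raise ValueError("unknown program: " + program)
    return [program, input_pir, output_txt]


def run_hca(program, input_pir, output_txt):
    """Run the program; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(hca_command(program, input_pir, output_txt), check=True)
```

For example, run_hca("HCA_analyze", "input_file.pir", "output_file.txt") is equivalent to typing the first SYNTAX line at the command prompt.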

Input files must be alignments in .pir format (obtained e.g. with ClustalW). Both programs output a tab-separated list (ready for import into e.g. Excel). The columns are labelled on the first line, and the labels should be self-explanatory. HCA_analyze_multiple_aligns also creates two additional output files:
-	"Distinct_HCA_patterns.txt" lists all sequences with less than 60% HCA similarity (relative to each other)
-	"minima.txt" includes the characteristics of the most divergent sequences (based on HCA score, charged amino acid similarity and proline distribution).
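
Because the main output is a tab-separated list with the column labels on the first line, it can also be read by a script instead of Excel. The following is a minimal Python sketch under that assumption only; the function name read_hca_table is hypothetical, and no particular column labels are assumed.

```python
import csv


def read_hca_table(path):
    """Read a tab-separated file whose first line holds the column labels.

    Returns one dictionary per data line, keyed by the labels.
    All values are returned as strings.
    """
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

Calling read_hca_table("output_file.txt") returns a list of rows; the keys of the first row are the column labels written by the program.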


The hca.txt matrix file should not be modified unless you want to specify a different alignment scoring scheme. That file is meant to be used by ClustalW (instead of its built-in PAM or BLOSUM matrices) to obtain an automatic alignment optimized for hydrophobic cluster similarity. With that matrix, ClustalW automatically computes a good alignment of hydrophobic amino acids, which gives a good, unbiased starting alignment that can be adjusted manually afterwards. The alignment does not have to be generated with the hca.txt matrix for the score computed by HCA_analyze to be valid: the matrix is only a (very) convenient way of generating a reasonable hydrophobic alignment between your sequences.
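
The ClustalW step described above can be sketched as a command line built in Python. This is only an illustration under stated assumptions: the function name clustalw_command is hypothetical, the executable is assumed to be called clustalw2 and to be on the PATH, and the option spelling (-INFILE, -ALIGN, -MATRIX, -OUTPUT, -OUTFILE) follows the ClustalW command-line help. The spelling may differ between versions and platforms, so check the help of your own ClustalW installation.

```python
def clustalw_command(sequences, aligned_pir, matrix="hca.txt", exe="clustalw2"):
    """Build a ClustalW command that aligns with a user-supplied matrix file.

    sequences   : input file with the unaligned protein sequences
    aligned_pir : name of the .pir alignment to be written
    matrix      : protein weight matrix file (assumed to be hca.txt)
    exe         : name of the ClustalW executable (an assumption)
    """
    return [
        exe,
        "-INFILE=" + sequences,
        "-ALIGN",                # perform a full multiple alignment
        "-MATRIX=" + matrix,     # matrix used in the multiple alignment stage
        "-OUTPUT=PIR",           # write the alignment in .pir format
        "-OUTFILE=" + aligned_pir,
    ]
```

The resulting list can be passed to subprocess.run, and the .pir file it produces can then be adjusted manually and given to HCA_analyze as the input file.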


