abstract |
A method and system are disclosed for identifying and/or locating complex patterns in an amino acid sequence stored in a computer file or database. According to an aspect of the present invention, techniques are provided to facilitate queries of protein databases. For protein descriptions received in response to the queries, embodiments of the present invention may scan the received protein descriptions to identify and locate Replikin patterns. A Replikin pattern is defined to be a sequence of 7 to about 50 amino acids that has the following three (3) characteristics, each of which may be recognized by an embodiment of the present invention: (1) the sequence has at least one lysine residue located six to ten amino acid residues from a second lysine residue; (2) the sequence has at least one histidine residue; and (3) at least 6% of the amino acids in the sequence are lysine residues. |
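The three characteristics above can be illustrated with a minimal Python sketch. It is not the disclosed implementation. The function names `is_replikin` and `find_replikins` are hypothetical, as are the parameters `min_len` and `max_len`. The sketch assumes one-letter amino acid codes (K for lysine, H for histidine), treats "about 50" as a hard upper bound of 50, and reads "six to ten amino acid residues from a second lysine" as a difference of 6 to 10 between the two lysine positions. Other readings of the spacing rule are possible.

```python
def is_replikin(seq, min_len=7, max_len=50):
    """Return True if seq meets the three Replikin characteristics.

    Assumptions (not from the source): one-letter residue codes,
    a hard length cap of max_len, and lysine spacing measured as
    the difference between the two lysine positions.
    """
    seq = seq.upper()
    if not (min_len <= len(seq) <= max_len):
        return False

    # Positions of every lysine (K) in the sequence.
    lys = [i for i, aa in enumerate(seq) if aa == "K"]

    # (1) some pair of lysines lies 6 to 10 positions apart.
    spaced = any(6 <= j - i <= 10 for i in lys for j in lys)

    # (2) at least one histidine (H) is present.
    has_his = "H" in seq

    # (3) lysines make up at least 6% of the residues
    #     (integer arithmetic avoids floating-point rounding).
    enough_lys = 100 * len(lys) >= 6 * len(seq)

    return spaced and has_his and enough_lys


def find_replikins(seq, min_len=7, max_len=50):
    """Return (start, end) slices of every window that qualifies.

    A brute-force scan for illustration only; overlapping and
    nested matches are all reported.
    """
    hits = []
    for start in range(len(seq)):
        last = min(len(seq), start + max_len)
        for end in range(start + min_len, last + 1):
            if is_replikin(seq[start:end], min_len, max_len):
                hits.append((start, end))
    return hits
```

For example, under these assumptions `is_replikin("KAAAAAKH")` is true: the two lysines are six positions apart, a histidine is present, and lysines make up 25% of the residues. The same string without the histidine fails the second characteristic.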