abstract |
(57) [Summary] (With revision) [Problem] To develop a method for efficiently extracting GPCR sequences from the human genome sequence, and thereby to identify new GPCRs exhaustively. An automated system for discovering GPCR sequences has been independently developed. The system consists of three stages. The first stage is gene prediction, i.e. translation of the genome sequence into amino acid sequences. The second stage applies three analyses to each amino acid sequence: sequence search against databases of known GPCRs, motif and domain assignment, and transmembrane helix prediction. Candidate sequences are screened by taking the union of the results of the three analyses; the union is used at this stage to maximize the number of candidate sequences. The third stage refines the quality of the gene candidates by eliminating overlapping sequences and fusing fragment sequences that were split by misprediction. |
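The screening and refinement logic described above can be illustrated with a minimal sketch. This is not the system's actual implementation. The function names (`screen_candidates`, `merge_fragments`), the representation of candidates as sequence IDs and as `(start, end)` genomic intervals, and the `max_gap` tolerance are all assumptions introduced here for illustration only.

```python
from typing import Iterable, List, Set, Tuple

Interval = Tuple[int, int]  # hypothetical (start, end) coordinates on one strand


def screen_candidates(
    homology_hits: Iterable[str],
    motif_hits: Iterable[str],
    tm_helix_hits: Iterable[str],
) -> Set[str]:
    """Keep a sequence if ANY of the three analyses flags it (set union)."""
    return set(homology_hits) | set(motif_hits) | set(tm_helix_hits)


def merge_fragments(intervals: Iterable[Interval], max_gap: int = 0) -> List[Interval]:
    """Collapse overlapping candidates and fuse fragments lying within max_gap.

    Assumption: fragments split by misprediction sit close together on the
    genome, so a simple distance threshold stands in for the real criterion.
    """
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + max_gap:
            # Overlaps or nearly touches the previous candidate: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical example data, not taken from the source.
homology = {"seq1", "seq2"}
motif = {"seq2", "seq3"}
tm_helix = {"seq4"}

candidates = screen_candidates(homology, motif, tm_helix)
strict = homology & motif & tm_helix  # an intersection would keep far fewer

regions = merge_fragments([(100, 200), (150, 260), (300, 400), (1000, 1100)], max_gap=50)
```

With this toy data, the union retains all four sequence IDs while the intersection retains none, which is why a union suits a screening step meant to maximize candidates. The interval merge then reduces four raw regions to two.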