Bioinformatics. Group of authors.

About the work:

Author:	Group of authors
Publisher:	John Wiley & Sons Limited
Series:
Genre:	Biology
Year of publication:	0
ISBN:	9781119335955


Several studies have attempted to answer the “which method is better” question by performing systematic analyses with test datasets (Pearson 1995; Agarwal and States 1998; Chen 2003). In one such study, Brenner et al. (1998) performed tests using a dataset derived from known homologies documented in the Structural Classification of Proteins database (SCOP; Chapter 12). They found that FASTA performed better than BLAST in finding relationships between proteins having >30% sequence identity, and that the performance of all methods declined below 30%. Importantly, while the statistical values reported by BLAST slightly underestimated the true extent of errors when looking for known relationships, they found that BLAST and FASTA (with ktup = 2) were both able to detect most known relationships, calling them both “appropriate for rapid initial searches.”

Summary

The ability to perform pairwise sequence alignments and interpret the results has become commonplace for nearly all biologists; it is no longer a technique employed solely by bioinformaticians. Over time, these methods have evolved continually, keeping pace with the types and scale of data being generated both in individual laboratories and by systematic, organismal sequencing projects. As with all computational techniques, the reader should have a firm grasp of the underlying algorithm, always keeping in mind its capabilities and limitations. Intelligent use of the tools presented in this chapter can lead to powerful and interesting biological discoveries, but there are also many documented cases where improper use of the tools has led to incorrect biological conclusions. By understanding the methods, users can apply them optimally and obtain better results than if the methods were treated simply as a “black box.” As biology is increasingly undertaken in a sequence-based fashion, using sequence data to underpin the design and interpretation of experiments, it becomes increasingly important that computational results, such as those generated using BLAST and FASTA, are cross-checked in the laboratory, against the literature, and with additional computational analyses to ensure that any conclusions drawn not only make biological sense but are also actually correct.

Internet Resources

BLAST
  European Bioinformatics Institute (EBI)	www.ebi.ac.uk/blastall
  National Center for Biotechnology Information (NCBI)	blast.ncbi.nlm.nih.gov
BLAST-Like Alignment Tool (BLAT)	genome.ucsc.edu/cgi-bin/hgBlat
NCBI Conserved Domain Database (CDD)	ncbi.nlm.nih.gov/cdd
Cancer Genome Anatomy Project (CGAP)	ocg.cancer.gov/programs/cgap
FASTA
  EBI	www.ebi.ac.uk/Tools/sss/fasta
  University of Virginia	fasta.bioch.virginia.edu
RefSeq	ncbi.nlm.nih.gov/refseq
Structural Classification of Proteins (SCOP)	scop.berkeley.edu
Swiss-Prot	www.uniprot.org

References

1 Agarwal, P. and States, D.J. (1998). Comparative accuracy of methods for protein sequence similarity search. Bioinformatics. 14: 40–47.

2 Altschul, S.F. (1991). Amino acid substitution matrices from an information theoretic perspective. J. Mol. Biol. 219: 555–565.

3 Altschul, S.F. and Koonin, E.V. (1998). Iterated profile searches with PSI-BLAST: a tool for discovery in protein databases. Trends Biochem. Sci. 23: 444–447.

4 Altschul, S.F., Gish, W., Miller, W. et al. (1990). Basic local alignment search tool. J. Mol. Biol. 215: 403–410.

5 Altschul, S.F., Madden, T.L., Schäffer, A.A. et al. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25: 3389–3402.

6 Brenner, S.E., Chothia, C., and Hubbard, T.J.P. (1998). Assessing sequence comparison methods with reliable structurally identified evolutionary relationships. Proc. Natl. Acad. Sci. USA. 95: 6073–6078.

7 Bucher, P., Karplus, K., Moeri, N., and Hofmann, K. (1996). A flexible motif search technique based on generalized profiles. Comput. Chem. 20: 3–23.

8 Chen, Z. (2003). Assessing sequence comparison methods with the average precision criterion. Bioinformatics. 19: 2456–2460.

9 Dayhoff, M.O., Schwartz, R.M., and Orcutt, B.C. (1978). A model of evolutionary change in proteins. In: Atlas of Protein Sequence and Structure, vol. 5 (ed. M.O. Dayhoff), 345–352. Washington, DC: National Biomedical Research Foundation.

10 Doolittle, R.F. (1981). Similar amino acid sequences: chance or common ancestry? Science. 214: 149–159.

11 Doolittle, R.F. (1989). Similar amino acid sequences revisited. Trends Biochem. Sci. 14: 244–245.

12 Gonnet,
