2015, Number 3
Rev Mex Ing Biomed 2015; 36 (3)
From Sequencing to Hardware Acceleration of DNA Alignment Programs: A Comprehensive Review
Pacheco BD, González PM, Algredo BI
Language: Spanish
Bibliographic references: 43
Pages: 257-275
PDF file: 955.41 Kb.
ABSTRACT
In recent years there has been remarkable progress in massively parallel sequencing machines, also
known as next-generation sequencing (NGS) machines; for example, recent machines such as the Illumina HiSeq
can generate millions of reads in a single run. However, these technologies can sequence
only short fragments of genetic material (between 35 and 1100 nucleotides), so sequencing a
complete genome requires splitting the strand, sequencing the fragments, and then assembling the resulting short reads.
This work reviews and compares recent sequencing technologies, examines the process of assembling
complete genomes, and formally states the alignment problem. It also summarizes
the main alignment programs and the algorithms that underpin them. Finally, after concluding that
sequencing technologies have outpaced alignment programs in speed by a factor of more than 10,
it reviews hardware acceleration as an alternative for speeding up such programs. As a comprehensive
review, this work aims to contribute to the development of bioinformatics research in the country.