2012, Number 4
Revista Cubana de Información en Ciencias de la Salud (ACIMED) 2012; 23 (4)
Recognizing and annotating generic drug names in biomedical literature
Gálvez C
Language: Spanish
References: 37
Page:
PDF size: 326.32 Kb.
ABSTRACT
This paper proposes a system, based on finite-state models, for identifying and annotating generic drug names in biomedical texts. The procedure combines the naming rules for generic drugs recommended by the United States Adopted Names (USAN) Council, which allow drugs to be classified into families, with a linguistic engine based on finite-state techniques. Through a graphical interface, we built analyzers that identify, classify and annotate generic drug names using the affixes recommended by USAN. The evaluation corpus consists of 256 Medline abstracts. The system achieves 99.8% precision and 92% recall. The combination of USAN rules and finite-state technology is an effective procedure for detecting, classifying and tagging generic drug names.
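The affix-based idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analyzers described in the paper: the regular expression stands in for a finite-state recognizer, the suffix table is a small hand-picked subset of USAN stems with simplified family labels, and the names `USAN_SUFFIXES`, `DRUG_PATTERN` and `annotate` are hypothetical.

```python
import re

# Assumption: a tiny illustrative subset of USAN stems with simplified
# family labels; the paper's actual stem inventory is not reproduced here.
USAN_SUFFIXES = {
    "vastatin": "HMG-CoA reductase inhibitor",
    "sartan": "angiotensin II receptor antagonist",
    "cillin": "penicillin antibiotic",
    "olol": "beta-blocker",
    "pril": "ACE inhibitor",
}

# A regular expression is compiled to a finite-state matcher, so it serves
# here as a stand-in for a finite-state recognizer. The pattern matches a
# whole word made of a non-empty prefix followed by one of the stems.
DRUG_PATTERN = re.compile(
    r"\b([A-Za-z]+?)(" + "|".join(USAN_SUFFIXES) + r")\b",
    re.IGNORECASE,
)


def annotate(text):
    """Return (drug_name, stem, family) for each candidate found in text."""
    results = []
    for match in DRUG_PATTERN.finditer(text):
        stem = match.group(2).lower()
        results.append((match.group(0), stem, USAN_SUFFIXES[stem]))
    return results


if __name__ == "__main__":
    sample = "Patients received atorvastatin and propranolol, then losartan."
    for name, stem, family in annotate(sample):
        print(f"{name}\t-{stem}\t{family}")
```

A rule this simple over-matches: an ordinary word such as "April" ends in "pril" and would be tagged as a drug, so a practical recognizer would likely need additional lexical or contextual filtering on top of the stems.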