PretoxTM: a text mining system for extracting treatment-related findings from preclinical toxicology reports
Abstract
Over the last few decades, the pharmaceutical industry has generated a vast corpus of knowledge on the safety and efficacy of drugs. Much of this information is contained in toxicology reports, which summarise the results of animal studies designed to analyse the effects of the tested compound, including unintended pharmacological and toxic effects, known as treatment-related findings. Despite the potential of this knowledge, most of it is available only as unstructured text with variable degrees of digitisation, which has hampered its systematic access, use and exploitation. Text mining technologies can automatically extract, analyse and aggregate such information, providing valuable new insights into the drug discovery and development process. In the context of the eTRANSAFE project, we present PretoxTM (Preclinical Toxicology Text Mining), the first system specifically designed to detect, extract, organise and visualise treatment-related findings from toxicology reports. PretoxTM comprises three main components: the PretoxTM Corpus, the PretoxTM Pipeline and the PretoxTM Web App. The PretoxTM Corpus is a gold standard corpus of preclinical treatment-related findings annotated by toxicology experts. This corpus was used to develop, train and validate the PretoxTM Pipeline, which extracts treatment-related findings from preclinical study reports. The extracted information is then presented in the PretoxTM Web App for expert visualisation and validation.
Journal of publication
Journal of Cheminformatics, vol. 17, article number 15 (2025)
Contributors
Javier Corvi, Nicolás Díaz-Roussel, José M. Fernández, Francesco Ronzano, Emilio Centeno, Pablo Accuosto, Celine Ibrahim, Shoji Asakura, Frank Bringezu, Mirjam Fröhlicher, Annika Kreuchwig, Yoko Nogami, Jeong Rih, Raul Rodriguez-Esteban, Nicolas Sajot, Joerg Wichard, Heng-Yi Michael Wu, Philip Drew, Thomas Steger-Hartmann, Alfonso Valencia, Laura I. Furlong & Salvador Capella-Gutierrez
External sources