Research Output

Publication

SimText: a text mining framework for interactive analysis and visualization of similarities among biomedical entities

2021, Marie Macnee, Eduardo Esteban Perez Palma, Sarah Schumacher-Bass, Jarrod Dalton, Costin Leu, Daniel Blankenberg, Dennis Lal, Jonathan Wren

Abstract

Summary: Literature exploration in PubMed on a large number of biomedical entities (e.g. genes, diseases or experiments) can be time-consuming and challenging, especially when assessing associations between entities. Here, we describe SimText, a user-friendly toolset that provides customizable and systematic workflows for the analysis of similarities among a set of entities based on text. SimText can be used for (i) text collection from PubMed and extraction of words with different text mining approaches, and (ii) interactive analysis and visualization of data using unsupervised learning techniques in an interactive app.

Availability and implementation: We developed SimText as open-source R software and integrated it into Galaxy (https://usegalaxy.eu), an online data analysis platform, with supporting self-learning training material available at https://training.galaxyproject.org. A command-line version of the toolset is available for download from GitHub (https://github.com/dlal-group/simtext) or as a Docker image (https://hub.docker.com/r/dlalgroup/simtext/tags).

Supplementary information: Supplementary data are available at Bioinformatics online.