Text Mining


Text mining is the process of extracting meaning from unstructured text data. Examples of this type of data include documents, websites, and social media posts, as well as semi-structured text formats like JSON, XML, and HTML. Natural Language Processing (NLP) techniques, including topic modeling and sentiment analysis, and Machine Learning (ML) techniques can be used to explore text and better understand hard-to-see relationships in the data.
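As a rough illustration of what "extracting meaning" can look like at its simplest, the sketch below counts how often each word appears across a few short documents after dropping common filler words. It uses only the Python standard library. The sample sentences, the small stop-word list, and the function names are made up for this example; real projects typically rely on a fuller toolkit and a much larger stop-word list.

```python
import re
from collections import Counter

# Tiny, hypothetical stop-word list for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "was", "with"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def term_frequencies(documents):
    """Count how often each non-stop-word appears across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(t for t in tokenize(doc) if t not in STOP_WORDS)
    return counts

# Invented sample documents.
docs = [
    "The delivery was late and the package was damaged.",
    "Late delivery again, but support was helpful.",
]

freqs = term_frequencies(docs)
print(freqs.most_common(2))
```

Frequent terms such as "delivery" and "late" hint at a recurring theme in the text, which is the kind of signal that more advanced techniques like topic modeling build on.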


Important patient information is contained in unstructured text data such as physicians' notes and clinical histories. NLP can be used to parse this data, and text mining can then help find patterns in a patient's records that give a care team critical information for improving treatment outcomes.
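One simple way to look for patterns in notes like these is to count which terms of interest appear together in the same note. The sketch below does this with the Python standard library. The notes, the watch list of terms, and the function names are all invented for illustration; it is not a clinical tool, and a real system would need proper medical NLP and de-identified data.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical watch list of terms a care team might track.
TERMS = {"dizziness", "fatigue", "lisinopril", "metformin"}

def terms_in_note(note):
    """Return the watch-list terms found in a note, in sorted order."""
    words = set(re.findall(r"[a-z]+", note.lower()))
    return sorted(words & TERMS)

def cooccurrence(notes):
    """Count how many notes mention each pair of watch-list terms together."""
    pairs = Counter()
    for note in notes:
        pairs.update(combinations(terms_in_note(note), 2))
    return pairs

# Invented example notes; not real patient data.
notes = [
    "Patient started lisinopril; reports dizziness on standing.",
    "Dizziness persists. Continuing lisinopril and metformin.",
    "Fatigue noted. Metformin dose unchanged.",
]

pair_counts = cooccurrence(notes)
for pair, count in pair_counts.most_common():
    print(pair, count)
```

A pair that shows up repeatedly only indicates that two terms tend to be mentioned together. Whether that reflects a meaningful relationship is a question for the care team.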

Similar Terms

Text data mining
Text analysis
Natural Language Processing (NLP)

The Natural Language Toolkit (NLTK) is a popular suite of Python tools used for text analysis: Jablonski, J. (n.d.). "Natural Language Processing With Python's NLTK Package". Realpython.com https://realpython.com/nltk-nlp-python/

Further Resources

Provides an overview of text mining and some common text mining tasks: IBM (n.d.). "What is Text Mining". IBM.com https://www.ibm.com/topics/text-mining
