Data Mining


Data mining is the process of identifying patterns and relationships in large datasets and extracting that information for further use. It relies on statistical methods, machine learning techniques, or both. Data mining differs from conventional data analysis in that it is typically approached without a predefined hypothesis: rather than testing a specific question, it searches the data for previously unknown or interesting patterns. It often involves the automated collection of large quantities of data.


An example of data mining in healthcare is searching large sets of electronic health record (EHR) data for patterns that point to harmful drug interactions.
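To make the drug interaction example concrete, here is a minimal sketch in Python of hypothesis-free pattern search. It assumes a hypothetical, made-up set of records in which each entry lists the drugs a patient was prescribed together and whether an adverse event was recorded. The records, the function name rank_drug_pairs, and the min_support threshold are all illustrative assumptions, not real EHR data or an established method. The sketch enumerates every drug pair rather than testing one chosen in advance, then ranks pairs by how much higher their adverse-event rate is than the overall rate.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records (invented for illustration, not real EHR data):
# (drugs prescribed together, was an adverse event recorded?)
records = [
    ({"warfarin", "aspirin"}, True),
    ({"warfarin", "aspirin", "metformin"}, True),
    ({"warfarin", "metformin"}, False),
    ({"aspirin", "metformin"}, False),
    ({"metformin", "lisinopril"}, False),
    ({"warfarin", "aspirin", "lisinopril"}, True),
    ({"aspirin", "lisinopril"}, False),
    ({"warfarin", "lisinopril"}, False),
]


def rank_drug_pairs(records, min_support=2):
    """Rank co-prescribed drug pairs by adverse-event rate relative to the overall rate.

    Returns a list of (pair, times_seen, event_rate, lift) tuples, highest lift first.
    Pairs seen fewer than min_support times are skipped as too rare to judge.
    """
    pair_total = Counter()
    pair_events = Counter()
    for drugs, adverse in records:
        # Enumerate every pair in the record; no pair is singled out in advance.
        for pair in combinations(sorted(drugs), 2):
            pair_total[pair] += 1
            if adverse:
                pair_events[pair] += 1

    # Overall adverse-event rate across all records.
    baseline = sum(1 for _, adverse in records if adverse) / len(records)
    if baseline == 0:
        return []  # no adverse events at all, so nothing to rank

    ranked = []
    for pair, total in pair_total.items():
        if total < min_support:
            continue
        rate = pair_events[pair] / total
        ranked.append((pair, total, rate, rate / baseline))
    ranked.sort(key=lambda row: row[3], reverse=True)
    return ranked


for pair, total, rate, lift in rank_drug_pairs(records):
    print(f"{pair[0]} + {pair[1]}: seen {total}x, event rate {rate:.2f}, lift {lift:.2f}")
```

A high-ranking pair in a sketch like this is only a candidate for follow-up. A real analysis would need far more data and would have to account for confounding factors before drawing any clinical conclusion.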

Similar Terms

Text Mining

The tidyverse is a widely used, well-supported collection of R packages for data cleaning, analysis, and visualization.

Pandas is a Python library for data cleaning and analysis, with some basic data visualization functionality.


Further Resources

Sadiku, M. N. O., Shadare, A. E., & Musa, S. M. (2015). Data mining: A brief introduction. European Scientific Journal, ESJ, 11(21). Retrieved from

Gupta, S. (2022). Introduction to data mining: A complete guide. Springboard Blog. Retrieved from
