Data Mining

Definition

Data mining is the process of identifying and extracting patterns and relationships in large datasets, typically with statistical and/or machine learning techniques. It differs from conventional data analysis in that it usually begins without a specific hypothesis to test: rather than confirming an expected result, the aim is to surface previously unknown or interesting patterns. Data mining often involves the automated collection of large quantities of data.

Examples

In healthcare, for example, data mining can be used to look for patterns in large sets of electronic health record (EHR) data in order to identify harmful drug interactions.
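As a rough illustration of this kind of pattern search, the sketch below counts how often each pair of drugs appears together on a medication list and what fraction of those co-occurrences coincide with a recorded adverse event. It is a minimal sketch only: the records, the drug names (drug_a, drug_b, drug_c), the function name pair_event_rates, and the min_support threshold are all invented for illustration and do not come from any real EHR system.

```python
from collections import Counter
from itertools import combinations

# Toy, invented records: (drugs on the medication list, adverse event recorded?)
records = [
    ({"drug_a", "drug_b"}, True),
    ({"drug_a", "drug_b", "drug_c"}, True),
    ({"drug_a", "drug_c"}, False),
    ({"drug_b", "drug_c"}, False),
    ({"drug_a"}, False),
    ({"drug_b"}, False),
    ({"drug_a", "drug_b"}, True),
    ({"drug_c"}, False),
]


def pair_event_rates(records, min_support=2):
    """Rank drug pairs by the share of co-occurrences with an adverse event.

    Pairs seen fewer than min_support times are dropped, since a rate
    computed from one or two records is not informative.
    """
    pair_total = Counter()   # how often each pair is prescribed together
    pair_events = Counter()  # how often that co-prescription has an event
    for drugs, had_event in records:
        for pair in combinations(sorted(drugs), 2):
            pair_total[pair] += 1
            if had_event:
                pair_events[pair] += 1
    rates = {
        pair: pair_events[pair] / n
        for pair, n in pair_total.items()
        if n >= min_support
    }
    # Highest event rate first
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)


ranked = pair_event_rates(records)
for pair, rate in ranked:
    print(pair, round(rate, 2))
```

No hypothesis about a particular drug pair is specified in advance; every pair is scored and the ranking suggests which ones to examine. A high rate here is only a candidate for follow-up, not evidence of an interaction, and a real analysis would need far more data and adjustment for confounding factors.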

Similar Terms

Text Mining

Tools

The tidyverse is a widely used, well-supported collection of R packages with functions for data cleaning, analysis, and visualization.

Pandas is a Python library for data cleaning and analysis, with some basic data visualization functionality.

Relevant Literature

Sadiku, M. N. O., Shadare, A. E., & Musa, S. M. (2015). Data mining: A brief introduction. European Scientific Journal, ESJ, 11(21). Retrieved from https://eujournal.org/index.php/esj/article/view/6017

Gupta, S. (2022). Introduction to data mining: A complete guide. Springboard Blog. Retrieved from https://www.springboard.com/blog/data-science/data-mining
