An Introduction to Data Mining
Data mining is the methodical analysis of data and of the behavior that the data records. Its goal is to extract important information that was not previously known, such as recurring patterns or trends in the data.
The biggest challenge is not only obtaining the information but also searching through it to find connections and data points that were previously unknown. While data mining is not good at telling you "why" data behaves in a certain way, it is an excellent tool for telling you "how."
Comparing Data Mining to the Six Sigma Methodology
In comparison, the Six Sigma methodology can explain why data behaves in a certain way.
Six Sigma is famous for its data-driven approach. During the Measure and Analyze phases of this methodology, rigorous steps are followed to gather and analyze various data. These steps typically incorporate such well-known tools as root cause analysis and statistical hypothesis testing.
The Measure and Analyze phases help to identify why things are the way they are. This knowledge in turn can be used to establish linkages between inputs and outputs; these identified linkages can then help to carry out improvements.
Since the quality of the results is only as good as the quality and treatment of the data, it is highly recommended to follow the data mining approach rigorously while working on the Measure and Analyze phases.
While Six Sigma itself contains some of the data mining steps, it does not provide detailed know-how for these steps.
Data Mining Steps
Data mining consists of four steps: clustering, classification, regression and association rule learning.
However, one more important step is required before actual data mining can start: pre-processing, in which a target data set is assembled. A common source of data is an organization’s database, which often contains garbage or irrelevant data points. Therefore, the target data set has to be cleaned. Cleaning removes noisy records and records with missing information. It is also necessary to validate the integrity of the data points in the set. These steps are essential for obtaining trustworthy results.
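The cleaning step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`customer_id`, `balance`, `open_loans`), the integrity rule and the helper `clean_records` are hypothetical and do not come from any particular database or library.

```python
def clean_records(records, required_fields, key_field, is_valid=lambda rec: True):
    """Return records that are complete, pass an integrity check, and are not duplicates.

    key_field should be one of required_fields so that it is always present.
    """
    seen_keys = set()
    cleaned = []
    for rec in records:
        # Drop records with missing information points.
        if any(rec.get(field) is None for field in required_fields):
            continue
        # Drop records that fail the (assumed) integrity rule.
        if not is_valid(rec):
            continue
        # Drop repeated records for the same key.
        if rec[key_field] in seen_keys:
            continue
        seen_keys.add(rec[key_field])
        cleaned.append(rec)
    return cleaned


# Hypothetical raw extract from an organization's database.
raw = [
    {"customer_id": 1, "balance": 1200.0, "open_loans": 2},
    {"customer_id": 2, "balance": None, "open_loans": 1},    # missing value
    {"customer_id": 3, "balance": -50.0, "open_loans": 0},   # fails integrity rule
    {"customer_id": 1, "balance": 1200.0, "open_loans": 2},  # duplicate
    {"customer_id": 4, "balance": 300.0, "open_loans": 1},
]

cleaned = clean_records(
    raw,
    required_fields=["customer_id", "balance", "open_loans"],
    key_field="customer_id",
    is_valid=lambda rec: rec["balance"] >= 0 and rec["open_loans"] >= 0,
)
print([rec["customer_id"] for rec in cleaned])  # → [1, 4]
```

Whether a negative balance is really invalid depends on the business context; the rule is passed in as a parameter so that it can be changed without touching the cleaning logic.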
Once data cleansing is performed, clustering follows. This is the task of discovering structures and groups in the data which are similar in some or many ways. It may not require previous knowledge about the given data.
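One common way to discover such groups is k-means clustering. The article does not name a specific algorithm, so the following is a minimal sketch of one possible choice, with made-up two-dimensional points and a deliberately naive initialization (the first k points become the starting centers).

```python
import math


def kmeans(points, k, iterations=20):
    """Very small k-means: returns (centroids, assignments)."""
    # Naive, deterministic initialization: use the first k points as centers.
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments


# Hypothetical data: two visibly separate groups of points.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centroids, assignments = kmeans(points, k=2)
print(assignments)  # → [0, 1, 0, 1, 0, 1]
```

No labels are supplied here, which matches the point that clustering may not require previous knowledge about the data. In practice the number of groups and the starting centers have to be chosen with more care than in this sketch.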
Each data point is then assigned to a category (classification), which generalizes the data so that the same categories can be applied to new data. This often helps in narrowing down assessment points, thereby reducing the complexity of the overall data analysis.
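As an illustration of classification, the sketch below uses a k-nearest-neighbors vote: a new point receives the label that is most common among the labeled points closest to it. This technique is an assumption chosen for brevity, and the labels ("low", "high") and points are invented for the example.

```python
import math
from collections import Counter


def knn_classify(labeled_points, query, k=3):
    """Label `query` by majority vote among its k nearest labeled points."""
    nearest = sorted(labeled_points, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labeled data, for example taken from groups found earlier.
labeled_points = [
    ((1.0, 1.0), "low"), ((1.5, 2.0), "low"), ((1.0, 0.5), "low"),
    ((9.0, 9.0), "high"), ((8.0, 8.5), "high"), ((9.5, 8.0), "high"),
]

print(knn_classify(labeled_points, (2.0, 1.5)))  # → low
print(knn_classify(labeled_points, (8.5, 9.0)))  # → high
```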
The next step is regression. Regression is an attempt to find a (typically mathematical) function which models the data with the least possible errors. This further generalizes the dataset.
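The simplest instance of this idea is fitting a straight line by ordinary least squares, which picks the slope and intercept that minimize the sum of squared errors. The sketch below assumes a single input variable and uses invented numbers; real data would rarely fit this cleanly.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations that lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # → 2.0 1.0
```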
Following regression, association rule learning searches for relationships between variables. For example, a supermarket may gather data on customer buying habits. Using association rule learning, marketers can then determine which products customers frequently buy together and subsequently use this data for marketing purposes.
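The supermarket example can be sketched by counting how often pairs of products appear in the same basket. The two usual measures are support (the share of all baskets that contain the pair) and confidence (the share of baskets containing the first item that also contain the second). The baskets and thresholds below are invented, and the sketch only considers pairs, not larger item sets.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for item pairs above both thresholds."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # ignore repeats within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Check the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Hypothetical shopping baskets.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "jam"],
    ["milk", "butter", "bread"],
    ["milk", "jam"],
]

for lhs, rhs, support, confidence in pair_rules(baskets):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
# → bread -> butter: support=0.60, confidence=0.75
# → butter -> bread: support=0.60, confidence=1.00
```

Note that the two directions of a rule can have different confidence: every basket with butter also has bread, but not every basket with bread has butter.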
Data mining always ends with one final critical step: results validation. Results validation verifies that the patterns produced by the data mining algorithms also hold for the wider data set. Not all patterns found by the algorithms are necessarily valid, even though they often display some degree of correlation, strong or weak.
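A common way to carry out this check is to hold back part of the data, look for patterns only in the rest, and then measure how well each pattern holds on the part that was held back. The sketch below assumes that approach; the data, the candidate rule and the helper names are hypothetical.

```python
import random


def holdout_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and split it into (training, holdout)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def accuracy(predict, labeled):
    """Share of (features, label) pairs for which predict(features) == label."""
    correct = sum(1 for features, label in labeled if predict(features) == label)
    return correct / len(labeled)


# Hypothetical data: (number of open loans, responded to an offer?).
labeled = [(loans, loans >= 2) for loans in range(16)]
labeled += [(0, True), (1, True), (5, False), (7, False)]  # noisy cases

training, holdout = holdout_split(labeled)

# A candidate pattern, assumed to have been found on the training part only.
def candidate_rule(loans):
    return loans >= 2

print("training accuracy:", accuracy(candidate_rule, training))
print("holdout accuracy:", accuracy(candidate_rule, holdout))
```

A pattern that scores well on the training part but poorly on the holdout part is likely to be an artifact of the sample rather than a valid finding.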
A Data Mining Example in Financial Services
A popular example of data mining is the use of past behavior data to rank customers and to choose the approach for various offers. For instance, financial institutions have often used these techniques to decide what approach to take when offering new loans and credit cards to customers.
In any financial institution, the company’s internal database captures an abundant number of customer characteristics, such as card balances, number of open loans, and whether or not a customer has ever responded to a loan offer through a phone call, e-mail or direct mail (an example of clustering).
Data mining thus helps establish common characteristics within available customer data, which can subsequently establish a predictive model (an example of generalization and association rule learning). The financial institution may then use this knowledge to create a new campaign with the hope of increasing its customer base and annual revenue.
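A very small version of this idea is to group customers by a shared characteristic and rank the groups by how often their members responded to past offers. The customer records, the field names and the segmentation rule below are all hypothetical, and a real predictive model would use far more characteristics than one.

```python
from collections import defaultdict


def response_rate_by_segment(customers, segment_of):
    """Return (segment, response rate) pairs, highest rate first."""
    totals = defaultdict(int)
    responders = defaultdict(int)
    for customer in customers:
        segment = segment_of(customer)
        totals[segment] += 1
        if customer["responded"]:
            responders[segment] += 1
    rates = {segment: responders[segment] / totals[segment] for segment in totals}
    return sorted(rates.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical customer history.
customers = [
    {"open_loans": 2, "responded": True},
    {"open_loans": 1, "responded": True},
    {"open_loans": 3, "responded": False},
    {"open_loans": 0, "responded": False},
    {"open_loans": 0, "responded": True},
    {"open_loans": 0, "responded": False},
]

ranked = response_rate_by_segment(
    customers,
    segment_of=lambda c: "has_loans" if c["open_loans"] > 0 else "no_loans",
)
print(ranked[0][0])  # → has_loans
```

In this invented sample, customers who already hold loans responded more often, so a campaign might approach that segment first.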
Conclusion
With this basic insight into data mining, it is evident that the Six Sigma methodology includes almost all of the data mining steps as part of its rigor, and that it can yield better results when practitioners explicitly perform those steps in the Measure and Analyze phases.
While the stages in data mining are more or less the same as explained above, there might be some variation depending on the stage of the project and the final expectation.
Thanks for explaining what data mining is all about. Can the Six Sigma methodology (DMAIC or another approach, depending on the stage) be used for mining data?