Have a data mining project on the horizon? These six steps make up the Cross-Industry Standard Process for Data Mining (CRISP-DM) and will help make your project awesome!
- Gain an understanding of the business problem you are trying to solve. Are the business requirements well defined?
- Get to know the data. What data is available? Is it complete? What data is needed? Now is also a good time to identify any data quality problems.
- Prepare the data. Data is rarely clean or in the right format for your modeling tools. This step can be time-consuming.
- Create your model(s). Pick your modeling tool and build your model: linear regression, classification, clustering. Several techniques can be used to solve the same data mining problem. Now might also be a good time to revisit Step 3 (data preparation) if the data isn’t quite right.
- Evaluate your results. Are the results meaningful? Do they solve the problem you identified in Step 1 (business understanding)? By the end of this step, decide whether and how the results will be used.
- Deploy your model! How should the model be deployed? What steps should be taken to maximize the benefit of the model and results?
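To make the middle steps a little more concrete, here is a minimal Python sketch of data understanding, data preparation, modeling, and evaluation on a tiny data set. Everything in it is an assumption for illustration: the ad-spend and sales numbers are made up, the function names (`profile`, `prepare`, `fit_line`, `mean_abs_error`) are hypothetical, and dropping incomplete rows is just one of many possible preparation strategies. CRISP-DM itself does not prescribe any of these choices.

```python
# Hypothetical toy data: (ad_spend, sales) pairs; None marks a missing value.
raw_rows = [
    (1.0, 3.1), (2.0, 4.9), (3.0, None), (4.0, 9.2),
    (5.0, 10.8), (None, 7.0), (6.0, 13.1), (7.0, 15.0),
]


def profile(rows):
    """Data understanding: count rows and missing values per column."""
    missing = [sum(1 for r in rows if r[i] is None) for i in range(2)]
    return {"rows": len(rows), "missing_x": missing[0], "missing_y": missing[1]}


def prepare(rows):
    """Data preparation: drop incomplete rows (one simple strategy among many)."""
    return [r for r in rows if None not in r]


def fit_line(rows):
    """Modeling: ordinary least squares fit of y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(model, rows):
    """Evaluation: average absolute prediction error on held-out rows."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in rows) / len(rows)


report = profile(raw_rows)           # get to know the data
clean = prepare(raw_rows)            # prepare the data
train, test = clean[:4], clean[4:]   # hold back some rows for evaluation
model = fit_line(train)              # create the model
error = mean_abs_error(model, test)  # evaluate the results
print(report, model, error)
```

If the held-out error turns out too large to be useful for the business problem, that is the cue to loop back to data preparation or try a different modeling technique before thinking about deployment.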
Do you use a different process? I’d love to hear about it. Please leave a comment.