SEMMA and CRISP-DM are both process models used in the field of data mining and machine learning to guide the steps involved in developing predictive models and extracting useful insights from data.
While they share some similarities, they also have distinct differences. Below is a comparison of SEMMA and CRISP-DM.
CRISP-DM: Developed in the late 1990s, CRISP-DM is a comprehensive and widely recognized framework for data mining projects. It was designed to provide a structured approach to guide the entire data mining process, from understanding business objectives to deploying models.
SEMMA: SEMMA was developed by SAS (a software company) as a framework for its data mining software. It focuses primarily on the technical modeling work and is closely tied to SAS's software suite, although it has also been used more broadly in data analysis and modeling.
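CRISP-DM: CRISP-DM defines six distinct phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment.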
CRISP-DM covers the entire data mining project lifecycle, including understanding business goals, data collection and preparation, model building, evaluation, and deployment.
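As a rough illustration of how the six CRISP-DM phases follow one another, the lifecycle can be sketched as an ordered enumeration. The names `CrispDmPhase` and `next_phase` are hypothetical and chosen only for this example. The loop back from Evaluation to Business Understanding reflects the iterative nature of the CRISP-DM reference model, and the `evaluation_passed` flag is an assumed simplification of that decision.

```python
from enum import Enum


class CrispDmPhase(Enum):
    """The six CRISP-DM phases, in their nominal order."""
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


def next_phase(phase, evaluation_passed=True):
    """Return the phase that follows `phase`, or None after deployment.

    Assumption for this sketch: a failed evaluation sends the project
    back to business understanding instead of on to deployment.
    """
    if phase is CrispDmPhase.EVALUATION and not evaluation_passed:
        return CrispDmPhase.BUSINESS_UNDERSTANDING
    if phase is CrispDmPhase.DEPLOYMENT:
        return None
    return CrispDmPhase(phase.value + 1)


if __name__ == "__main__":
    # Walk the lifecycle once, assuming evaluation succeeds.
    phase = CrispDmPhase.BUSINESS_UNDERSTANDING
    while phase is not None:
        print(phase.name)
        phase = next_phase(phase)
```

In practice projects move back and forth between adjacent phases as well, so this linear walk is only the simplest reading of the process.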
SEMMA: SEMMA outlines five key phases: Sample, Explore, Modify, Model, and Assess.
SEMMA focuses primarily on the technical modeling work, offering guidance on data sampling, exploration, modification, modeling, and model assessment. It has no explicit phase for business understanding or deployment.
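To make the five SEMMA steps concrete, here is a minimal sketch that runs them in order using only the Python standard library. It does not use SAS software. The synthetic data, the function name `semma_sketch`, and the choice of a least-squares line as the model are all assumptions made for illustration.

```python
import random
import statistics


def semma_sketch(seed=0):
    """Run Sample, Explore, Modify, Model, Assess on synthetic data."""
    rng = random.Random(seed)
    # Synthetic data (an assumption for illustration): y = 3x + 2 plus noise.
    population = [(x, 3.0 * x + 2.0 + rng.gauss(0, 1.0)) for x in range(1000)]

    # Sample: draw a manageable subset, then split it into train and holdout.
    sample = rng.sample(population, 200)
    train, holdout = sample[:150], sample[150:]

    # Explore: compute simple summary statistics of the training data.
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    summary = {"x_mean": statistics.fmean(xs), "y_mean": statistics.fmean(ys)}

    # Modify: rescale x to the range [0, 1] using the training min and max.
    x_min, x_max = min(xs), max(xs)

    def scale(x):
        return (x - x_min) / (x_max - x_min)

    # Model: fit a straight line by ordinary least squares.
    x_scaled = [scale(x) for x in xs]
    x_bar = statistics.fmean(x_scaled)
    y_bar = statistics.fmean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(x_scaled, ys))
        / sum((x - x_bar) ** 2 for x in x_scaled)
    )
    intercept = y_bar - slope * x_bar

    # Assess: mean absolute error on the holdout set.
    mae = statistics.fmean(
        abs(y - (slope * scale(x) + intercept)) for x, y in holdout
    )
    return summary, (slope, intercept), mae


if __name__ == "__main__":
    summary, model, mae = semma_sketch()
    print(summary, model, mae)
```

Note that the sketch starts from data that already exists and ends at assessment, which mirrors SEMMA's scope: nothing in it addresses business goals or deployment.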
CRISP-DM: CRISP-DM is considered a more flexible and comprehensive framework, suitable for a wide range of data mining and machine learning projects.
SEMMA: SEMMA is more specific to SAS software and is often used as a companion to other, more comprehensive methodologies like CRISP-DM.
CRISP-DM is widely adopted and has extensive documentation and support from the data mining community. It is generally seen as a practical and effective methodology for data mining projects.
SEMMA, while useful for model-building within the SAS environment, may be less familiar and less widely adopted outside of the SAS user base.
CRISP-DM is a more comprehensive and widely accepted data mining process model that covers the entire project lifecycle.
SEMMA, on the other hand, is a more specialized framework that focuses primarily on the modeling phase and is closely associated with SAS software.
The choice between the two depends on the specific needs and tools of a given project, with CRISP-DM being the more general and flexible approach.