Model Selection Example

Problem Statement

Use multiple models to predict the values for gear based on the other variables, and identify which model performs best. The data for the exercise is shown in the Data tab.

How to perform analysis

Step 1: Open Sigma Magic

Click on the Sigma Magic button on the Excel toolbar.
Click on the New button to create a new project.

Step 2: Add the analysis template

Click on the Tool Wizard to add the analysis template.
Click on Analytics and then Model Selection.

Step 3: Specify analysis options

A new worksheet will be added to your workbook, and Analysis Setup will open automatically. In the Setup tab, specify the analysis options.

Click on Data to specify the data required for this analysis.

Click on Train to pick the options for training the models. Training is the step where the data is split into two groups: a training data set and a test data set.
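The split described above can be illustrated outside the tool. The following is a minimal Python sketch, not Sigma Magic's implementation: the function name, the 80% training fraction, and the 32 placeholder rows are all assumptions made for illustration.

```python
import random

def train_test_split(rows, train_fraction=0.8, seed=23):
    """Shuffle rows reproducibly, then split them into train and test lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                   # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed makes the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Placeholder data: 32 row indices standing in for the rows of the Data tab
# (the row count is an assumption, not taken from the article).
rows = list(range(32))
train, test = train_test_split(rows)
print(len(train), len(test))  # 25 7
```

The models are fitted on the training rows only, and the test rows are held back to check how well each model predicts data it has not seen.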

Click the Verify tab to ensure all the inputs are okay and marked with a green checkmark.

Step 4: Generate analysis result

Click OK and then click Compute Outputs to get the final results.

Interpretation of Results

The Parallel Random Forest model (M1) has been identified as the best-performing model.
It achieved an accuracy of 81.37% and a Kappa statistic of 72.71%, which indicates strong classification agreement.
Three models were compared:

M1 (Parallel Random Forest) → 81.37% Accuracy, 72.71% Kappa
M2 (Random Generalized Boosted Model) → 81.12% Accuracy, 54.74% Kappa
M3 (Random Classification and Regression Tree) → 80.39% Accuracy, 57.42% Kappa
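Accuracy and Kappa in the comparison above are both derived from a confusion matrix of actual versus predicted classes. The sketch below shows the standard Cohen's Kappa calculation in Python. The function name and the example matrix are made up for illustration and do not come from the Sigma Magic output.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of cases whose actual class is i
    and whose predicted class is j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of cases on the diagonal (this is accuracy).
    observed = sum(confusion[i][i] for i in range(k)) / total
    # Agreement expected by chance, from the row and column totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical 3-class confusion matrix (illustrative numbers only).
example = [
    [5, 1, 0],
    [1, 4, 0],
    [0, 1, 2],
]
acc, kappa = accuracy_and_kappa(example)
print(f"accuracy={acc:.4f} kappa={kappa:.4f}")
```

Because Kappa subtracts the agreement expected by chance, two models with nearly equal accuracy can have quite different Kappa values, which is the pattern seen in the three models listed above.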

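M1 scored highest on both accuracy and Kappa. The accuracy gap is small (81.37% versus 81.12% and 80.39%), but its Kappa is clearly higher (72.71% versus 54.74% and 57.42%), making it the most reliable choice.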
The model was trained as a classification problem.
Variables used: gear (the response being predicted), carb, vs, am, disp, qsec.
Resampling methods were used to improve model robustness.
Statistical differences between models were analyzed using hypothesis testing.
The lowest variation and the best statistical fit were found in the Parallel Random Forest model.
The faults section suggests some model assumptions were tested.
84 bootstrap resamples were used to validate model stability.
The resampling process was randomly initialized with different seeds (e.g., 23, 29).
Hyperparameters were tuned, ensuring optimized performance.
Further improvements could involve fine-tuning the number of trees, depth, and feature selection for better accuracy.
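The bootstrap resampling mentioned in the results can be sketched as follows. This is a generic illustration in Python, assuming the usual scheme of drawing rows with replacement and evaluating on the rows left out. How Sigma Magic performs its resampling internally is not stated in this article, and the function name and the row count of 32 are hypothetical.

```python
import random

def bootstrap_splits(n_rows, n_resamples, seed):
    """Build (in_bag, out_of_bag) row-index lists for each bootstrap resample."""
    rng = random.Random(seed)   # seeded so the resamples can be reproduced
    splits = []
    for _ in range(n_resamples):
        # Draw n_rows indices with replacement: some rows repeat, some are skipped.
        in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
        chosen = set(in_bag)
        # Rows never drawn form the hold-out set for this resample.
        out_of_bag = [i for i in range(n_rows) if i not in chosen]
        splits.append((in_bag, out_of_bag))
    return splits

# 84 resamples and seed 23 echo the values reported above; 32 rows is assumed.
splits = bootstrap_splits(n_rows=32, n_resamples=84, seed=23)
print(len(splits))  # 84
```

Each model is refitted on every in-bag sample and scored on the matching out-of-bag rows, and the spread of those scores indicates how stable the model is.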

