Regression Models Example

Problem Statement

Fit a regression model between the input X (advertisements) and the output Y (sales), using the data given in the Data tab. Is the model statistically significant? What is the R² value?

How to perform analysis

Step 1: Open Sigma Magic
  1. Click on the Sigma Magic button on the Excel toolbar.
  2. Click on the New button to create a new project.
Step 2: Add the analysis template
  1. Click on the Tool Wizard to add the analysis template.
  2. Click on Analytics and then Regression Models.

Step 3: Specify analysis options
A new worksheet will be added to your workbook, and Analysis Setup will open automatically. In the Setup tab, specify the analysis settings.


Click on Data to specify the data required for this analysis. 



Click on Model to specify the regression model you want to fit to the data.


Click on Options to specify the plot options and any results that should be displayed on the worksheet.


If you need to make changes to the charts, specify the optional settings in the Charts tab.

Labels:

  • Add a title for the chart.
  • Label the X-axis and Y-axis appropriately.

Appearance:

  • Adjust colors, font sizes, or other visual elements as needed.
  • Enable/disable gridlines or background shading. 

Click on the Verify tab to confirm that all the inputs are okay; each valid input is shown with a green checkmark.


Step 4: Generate analysis result
Click OK and then click Compute Outputs to get the final results.
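
If you want to cross-check the tool's output outside Excel, the calculation behind a one-predictor regression can be sketched in a few lines of Python. This is only an illustrative sketch, not how Sigma Magic computes its results: the function name fit_simple_regression and the ads/sales values below are hypothetical placeholders, not the data from the Data tab, so the numbers it prints will not match the results in this article.

```python
# Minimal ordinary least squares sketch for one predictor (standard library only).
# NOTE: the data below are hypothetical placeholders, NOT the exercise data.

def fit_simple_regression(x, y):
    """Fit y = intercept + slope * x and return the key summary statistics."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)          # unexplained variation
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)     # total variation

    r_squared = 1.0 - ss_res / ss_tot
    # Overall F statistic with 1 and n - 2 degrees of freedom.
    f_stat = (ss_tot - ss_res) / (ss_res / (n - 2))

    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "f_stat": f_stat,
        "residuals": residuals,
    }


# Hypothetical example values for illustration only.
ads = [10, 15, 20, 25, 30, 35, 40, 45]
sales = [610, 640, 655, 700, 720, 770, 780, 830]

result = fit_simple_regression(ads, sales)
print(f"Sales = {result['intercept']:.2f} + {result['slope']:.2f} x Ads")
print(f"R-squared = {result['r_squared']:.4f}, F = {result['f_stat']:.2f}")
```

To compare against the worksheet, replace the placeholder lists with the Ads and Sales columns from the Data tab; the slope, intercept, and R² should then agree with the tool's output up to rounding.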


Interpretation of Results

  • The R² value is 77.86%, indicating that approximately 78% of the variation in Sales is explained by Ads.
  • The Adjusted R² is close to the R² value, indicating that the fit is not inflated by unnecessary predictors.
  • The fitted equation is: Sales = 537.92 + 6.30 × Ads.
  • This means that for every one-unit increase in Ads, Sales are predicted to increase by 6.30 units on average.
  • The p-value for Ads is < 0.0001, indicating it is a statistically significant predictor of Sales.
  • The F-test (42.67, p < 0.0001) confirms the overall model significance.
  • The histogram and probability plot of the residuals suggest that the residuals are approximately normally distributed, which is consistent with the normality assumption of regression.
  • The plots of residuals vs. run order and residuals vs. fitted values show no obvious patterns, suggesting no major autocorrelation or heteroscedasticity issues.
  • The VIF value is 1, indicating no multicollinearity issues.
  • This means the predictor variable (Ads) is not highly correlated with other predictors; with a single predictor, as here, the VIF is always 1.
  • The model suggests that advertising has a statistically significant effect on sales.
  • Given the high R², businesses can use this model for predicting sales based on advertising budgets, though external factors (not included in the model) might also influence sales.
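
As a rough illustration of how the fitted equation could be used for prediction, the sketch below wraps Sales = 537.92 + 6.30 × Ads in a small Python function. The function name predict_sales and the Ads values passed to it are hypothetical; the article does not state the units or the range of Ads in the data, so treat any prediction outside the observed range as an extrapolation.

```python
# Coefficients taken from the fitted equation reported in this article.
INTERCEPT = 537.92
SLOPE = 6.30


def predict_sales(ads):
    """Return the predicted Sales for a given Ads value using the fitted line."""
    return INTERCEPT + SLOPE * ads


# Hypothetical Ads values, chosen only to show how the function is called.
for ads_value in (50, 100):
    print(f"Ads = {ads_value}: predicted Sales = {predict_sales(ads_value):.2f}")
```

Because the model explains about 78% of the variation in Sales, actual values will scatter around these point predictions.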