Pre-Process Data Example

Problem Statement

While collecting data on Length, Width, and Height, one value was not recorded. Pre-process this data to address the issue in each of the following ways:

a) Delete this record and make the set complete

b) Use the central value to estimate the missing value
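
Outside Sigma Magic, the two options can be sketched in a few lines of Python. This is only an illustration: the records below are hypothetical, None stands in for the missing value, and the median is assumed as the "central value" (the mean is an equally common choice). The function names are made up for this sketch and are not part of Sigma Magic.

```python
from statistics import median

# Hypothetical measurements; None marks the one missing value.
records = [
    {"Length": 20.0, "Width": 5.0, "Height": 10.0},
    {"Length": 21.0, "Width": None, "Height": 10.5},
    {"Length": 19.0, "Width": 6.0, "Height": 10.0},
    {"Length": 22.0, "Width": 7.0, "Height": 11.0},
]

def drop_incomplete(rows):
    """Option (a): keep only rows in which every field has a value."""
    return [row for row in rows if all(v is not None for v in row.values())]

def fill_with_median(rows):
    """Option (b): replace each missing value with its column median."""
    columns = list(rows[0].keys())
    # Medians are computed from the observed (non-missing) values only.
    medians = {
        col: median(r[col] for r in rows if r[col] is not None)
        for col in columns
    }
    return [
        {col: (r[col] if r[col] is not None else medians[col]) for col in columns}
        for r in rows
    ]

complete = drop_incomplete(records)   # one record fewer, no gaps
imputed = fill_with_median(records)   # same record count, gap filled
```

Deleting the record is the simpler choice but shrinks the data set; filling in a central value keeps every record at the cost of introducing an estimated number.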

How to perform analysis

Step 1: Open Sigma Magic
  1. Click on the Sigma Magic button on the Excel toolbar.
  2. Click on the New button to create a new project.
Step 2: Add the analysis template
  1. Click on the Tool Wizard to add the analysis template.
  2. Click on Analytics and then Pre-Process Data.


Step 3: Specify analysis options
A new worksheet will be added to your workbook, and Analysis Setup will open automatically. In the Setup tab, specify the options for this analysis.



Click on Data to specify the data required for this analysis.



Click the Verify tab to confirm that all the inputs are okay, as indicated by a green checkmark.



Step 4: Generate analysis result
Click OK and then click Compute Outputs to get the final results.



Interpretation of Results

1.  Presence of Missing or Invalid Data (NA Columns):

  • Some columns labeled as NA might indicate missing or placeholder values.
  • Certain rows contain irregular values like -1.5, which could be an error or outlier.
  • These missing or incorrect values may need imputation or removal.

2.  Repetitive Data Patterns:

  • Some rows have identical values in the NA columns, such as 3707553304 and 7864787372621, which appear multiple times.
  • This suggests either redundancy in data collection or duplication in entries.
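
To check whether repeated rows are true duplicates, one option is to count how often each complete row occurs. The sketch below uses hypothetical (Length, Width, Height) rows and illustrative function names; it is not how Sigma Magic detects duplicates.

```python
from collections import Counter

# Hypothetical (Length, Width, Height) rows; the first row appears twice.
rows = [
    (20.0, 5.0, 10.0),
    (21.0, 6.0, 10.5),
    (20.0, 5.0, 10.0),
    (22.0, 7.0, 11.0),
]

def find_duplicates(rows):
    """Return each row that occurs more than once, with its count."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

def drop_duplicates(rows):
    """Keep the first occurrence of each row and preserve the order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique

duplicates = find_duplicates(rows)
deduplicated = drop_duplicates(rows)
```

Whether a repeated row should be removed depends on how the data was collected: two parts can legitimately share the same measurements.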

3.  Possible Outliers in Height Column:

  • The Height column values are mostly 10, but some have 10.5 or 11.
  • This variation might be acceptable, but further analysis is needed to determine if it is significant or an anomaly.
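
One simple way to judge whether values such as 10.5 or 11 stand out is a z-score check. The sketch below assumes hypothetical Height readings and uses the sample standard deviation; the cutoff is a judgment call, and other rules (for example, one based on the interquartile range) may suit small or skewed data sets better.

```python
from statistics import mean, stdev

# Hypothetical Height readings: mostly 10, with one 10.5 and one 11.
heights = [10.0, 10.0, 10.0, 10.0, 10.5, 10.0, 11.0, 10.0]

def flag_outliers(values, threshold=3.0):
    """Return (index, value) pairs whose absolute z-score exceeds threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing can be flagged
    return [
        (i, v) for i, v in enumerate(values)
        if abs(v - mu) / sigma > threshold
    ]

strict = flag_outliers(heights)       # conventional 3-sigma cutoff
loose = flag_outliers(heights, 2.0)   # looser cutoff, for comparison
```

With these example readings, the 3-sigma cutoff flags nothing, while the looser cutoff of 2 flags only the reading of 11.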

4.  Format and Structure Issues:

  • The dataset is structured in a tabular format, but some values appear inconsistent.
  • Checking data types (numeric or categorical) is necessary to ensure proper analysis.
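
A data type check can be as simple as trying to parse every entry as a number and listing the ones that fail. The column below is hypothetical, and the helper name is chosen only for this sketch.

```python
def split_by_type(values):
    """Separate entries that parse as numbers from those that do not."""
    numeric, other = [], []
    for index, raw in enumerate(values):
        try:
            numeric.append((index, float(raw)))
        except (TypeError, ValueError):
            # Text such as "NA" raises ValueError; None raises TypeError.
            other.append((index, raw))
    return numeric, other

# Hypothetical raw column as it might arrive from a spreadsheet export.
column = ["10", "10.5", "NA", "-1.5", None, "11"]
numeric, other = split_by_type(column)
```

Note that a value such as -1.5 passes this check because it is a valid number; whether it is a plausible measurement has to be decided separately.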

5.  Potential Data Cleaning Required:

  • Data validation is needed to confirm whether NA columns contain useful information or should be removed.
  • If NA represents missing values, imputation (mean/median/mode) or deletion may be required.

6.  Need for Normalization and Transformation:

  • If numerical columns have large-scale differences (e.g., NA values appearing as large numbers), scaling techniques like Min-Max or Standardization may be necessary.
  • Encoding categorical variables (if any) should also be considered before applying machine learning models.
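
The two scaling techniques named above can be sketched as follows. The input values are hypothetical, and the population standard deviation is assumed for standardization; some tools use the sample standard deviation instead.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift and scale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant column: avoid dividing by zero
    return [(v - mu) / sigma for v in values]

# Hypothetical Height values.
heights = [10.0, 10.5, 11.0]
scaled = min_max_scale(heights)
z_scores = standardize(heights)
```

Min-Max scaling is sensitive to extreme values because the minimum and maximum define the range, so it is usually worth resolving outliers and placeholder values before scaling.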
