Project 01: Employee Attrition Prediction

1.3 - Exploratory Data Analysis - EDA

In this step we explore the data to understand its patterns and the relationships between its features.

It's time to visualize the data using the matplotlib and seaborn libraries in Python.

Techniques such as visualizations (histograms, scatter plots), summary statistics, and correlation analysis can be used to find patterns and insights in the dataset.
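As a minimal sketch of these techniques, the snippet below computes summary statistics and a correlation matrix with pandas and draws one histogram with matplotlib. The `sample` DataFrame is a hypothetical stand-in with made-up rows; in the project you would run the same calls on `employee_data`.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy rows standing in for employee_data (values are made up).
sample = pd.DataFrame({
    "Age": [31, 59, 24, 36, 56, 38],
    "Monthly Income": [5390, 5534, 8159, 3989, 4821, 9977],
    "Years at Company": [19, 4, 10, 7, 41, 3],
    "Attrition": ["Stayed", "Stayed", "Left", "Stayed", "Left", "Left"],
})

# Keep only the numeric columns for statistics and correlation.
numeric_cols = sample.select_dtypes(include="number")

# Summary statistics: count, mean, std, min, quartiles, max per column.
summary = numeric_cols.describe()
print(summary)

# Pairwise Pearson correlation between the numeric columns.
corr = numeric_cols.corr()
print(corr)

# Histogram of a single numeric feature.
fig, ax = plt.subplots()
ax.hist(sample["Age"], bins=5)
ax.set_xlabel("Age")
ax.set_ylabel("Count")
ax.set_title("Age distribution (toy data)")
```

In a notebook the figure renders on its own; in a script, call `plt.show()` or `fig.savefig(...)` to see it.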

To find unique values in our dataset

for col in employee_data.columns:
    print(f'{col}: ', employee_data[col].unique())

Output

# Employee ID:  [ 8410 64756 30257 ... 12409  9554 73042]
# Age:  [31 59 24 ... 22 32]
# Gender:  ['Male' 'Female']
# Years at Company:  [19 15 ... 50 51]
# Job Role:  ['Education' 'Media' 'Healthcare' 'Technology' 'Finance']
# Monthly Income:  [ 5390  5534  8159 ... 11854 11558 12651]
# Work-Life Balance:  ['Excellent' 'Poor' 'Good' 'Fair']
# Job Satisfaction:  ['Medium' 'High' 'Very High' 'Low']
# Performance Rating:  ['Average' 'Low' 'High' 'Below Average']
# Number of Promotions:  [2 3 0 1 4]
# Overtime:  ['No' 'Yes']
# Distance from Home:  [22 21 .... 66]
# Education Level:  ['Associate Degree' 'Master’s Degree' 'Bachelor’s Degree' 'High School'
 'PhD']
# Marital Status:  ['Married' 'Divorced' 'Single']
# Number of Dependents:  [0 3 2 4 1 5 6]
# Job Level:  ['Mid' 'Senior' 'Entry']
# Company Size:  ['Medium' 'Small' 'Large']
# Company Tenure:  [ 89  21  .... 126 128]
# Remote Work:  ['No' 'Yes']
# Leadership Opportunities:  ['No' 'Yes']
# Innovation Opportunities:  ['No' 'Yes']
# Company Reputation:  ['Excellent' 'Fair' 'Poor' 'Good']
# Employee Recognition:  ['Medium' 'Low' 'High' 'Very High']
# Attrition:  ['Stayed' 'Left']
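For columns with many distinct values (IDs, incomes), the full list of unique values is hard to read. One possible complement, sketched here on a hypothetical toy DataFrame named `sample`, is to count the distinct values per column with `nunique()` and to look at category frequencies with `value_counts()`.

```python
import pandas as pd

# Hypothetical toy rows standing in for employee_data (values are made up).
sample = pd.DataFrame({
    "Age": [31, 59, 24, 36, 56],
    "Gender": ["Male", "Female", "Female", "Male", "Male"],
    "Job Level": ["Mid", "Senior", "Entry", "Mid", "Mid"],
    "Attrition": ["Stayed", "Stayed", "Left", "Stayed", "Left"],
})

# Number of distinct values in each column.
unique_counts = sample.nunique()
print(unique_counts)

# Frequency of each category in the non-numeric columns.
for col in sample.select_dtypes(exclude="number").columns:
    print(sample[col].value_counts(), "\n")
```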

Define input features and output target variables

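# Drop the identifier and the target; columns= is required because drop() removes rows by default
X = employee_data.drop(columns=['Employee ID', 'Attrition'])
# Target variable: whether the employee stayed or left
y = employee_data['Attrition']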

X.shape # (74498, 22)
y.shape # (74498,)
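After the split it can help to confirm that the identifier and target are gone from the features and to check how balanced the target classes are. The sketch below uses a hypothetical four-row `employee_sample` with made-up values. Note that `DataFrame.drop` removes rows by label unless `columns=` (or `axis=1`) is given.

```python
import pandas as pd

# Hypothetical toy rows standing in for employee_data (values are made up).
employee_sample = pd.DataFrame({
    "Employee ID": [1001, 1002, 1003, 1004],
    "Age": [31, 59, 24, 36],
    "Overtime": ["No", "No", "Yes", "No"],
    "Attrition": ["Stayed", "Stayed", "Left", "Stayed"],
})

# Features: everything except the identifier and the target.
X = employee_sample.drop(columns=["Employee ID", "Attrition"])
# Target: the column we want to predict.
y = employee_sample["Attrition"]

# Sanity checks: same number of rows, and the dropped columns are really gone.
print(X.shape, y.shape)
print(list(X.columns))

# Share of each class in the target, to spot class imbalance early.
class_share = y.value_counts(normalize=True)
print(class_share)
```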
