Employee Attrition Analysis & Prediction using NLP
by bhoyarpurvi Updated: Jun 15, 2022
DATASET - Kaggle uploaded dataset
TABLEAU - Statistical representation of results
EMPLOYEE ATTRITION PROBLEM
ABSTRACT - Employee attrition is one of the key problems organizations face today. Attrition is the gradual reduction in the number of employees through resignation, death, and retirement. When a well-trained and well-adapted employee leaves the organization for any reason, it leaves a gap that is difficult for human resources personnel to fill. This study helps explain why attrition occurs, describes the challenges managers face in retaining employees, and suggests some measures for retaining them. This project is concerned with the problem of employee attrition in the industry. We prepared the dataset manually and analyzed employee reviews. In this capstone project we compare service-based and product-based companies, their attrition rates, and the causes of that attrition. Our final result is prepared using data mining, preprocessing, feature extraction, natural language processing on reviews, and visualization of the abstracted data. Our analysis also gives a clear idea of the main reasons for attrition in real life. Our prediction model outputs whether an employee will leave the company, depending on factors such as work-life balance, the employee's personal and professional information, company type, etc. Lastly, we used the visualization tool Tableau to visualize the data and show the insights.
Dataset sources - ambitionbox.com, indeed.com, glassdoor.com. The dataset contains more than 600 entries covering different companies and employees and their resulting attrition type.
The basic libraries being used for preprocessing the dataset
Preprocessing refers to the transformations applied to our data before feeding it to the algorithm. It is a technique used to convert raw data into a clean dataset.
pandas: flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
Python 38552 Version:v2.0.2 License: Permissive (BSD-3-Clause)
NumPy: the fundamental package for scientific computing with Python
Python 23663 Version:v1.25.0rc1 License: Permissive (BSD-3-Clause)
seaborn: statistical data visualization in Python
Python 10737 Version:v0.12.2 License: Permissive (BSD-3-Clause)
matplotlib: plotting with Python
Python 17497 Version:v3.7.1 License: No License
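As a rough illustration of the preprocessing described above, the sketch below cleans a tiny hand-made table with pandas. The column names (company_type, work_life_balance, review, attrition) and the sample rows are hypothetical, since the actual layout of the dataset is not given here; the steps (normalising text, dropping incomplete and duplicate rows, filling missing ratings, encoding the target) are only one plausible way to turn raw data into a clean dataset.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's columns are not documented here.
raw = pd.DataFrame({
    "company_type": ["Service", "Product", "service ", None, "Product"],
    "work_life_balance": [3.0, 4.0, None, 2.0, 4.0],
    "review": ["Long hours ", "Good culture", "No growth", "Low pay", "Good culture"],
    "attrition": ["Yes", "No", "Yes", "Yes", "No"],
})


def preprocess(df):
    """Return a cleaned copy of the raw review table (illustrative helper)."""
    clean = df.copy()
    # Normalise free-text columns: trim whitespace, unify letter case.
    clean["company_type"] = clean["company_type"].str.strip().str.lower()
    clean["review"] = clean["review"].str.strip()
    # Drop rows without a company type, then drop exact duplicate rows.
    clean = clean.dropna(subset=["company_type"]).drop_duplicates()
    # Fill missing ratings with the median of the remaining ratings.
    median_rating = clean["work_life_balance"].median()
    clean["work_life_balance"] = clean["work_life_balance"].fillna(median_rating)
    # Encode the target label as 1 (left) / 0 (stayed).
    clean["attrition"] = clean["attrition"].map({"Yes": 1, "No": 0})
    return clean.reset_index(drop=True)


clean = preprocess(raw)
print(clean)
```

Filling with the median rather than the mean is a choice made for this sketch; it keeps a single extreme rating from skewing the imputed value.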
ML library for model building
Includes regression and classification models under supervised learning
scikit-learn: machine learning in Python
Python 54472 Version:1.2.2 License: Permissive (BSD-3-Clause)
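The prediction step mentioned in the abstract (will an employee leave or not) is a supervised classification problem. The sketch below shows one way it could be set up with scikit-learn. The choice of logistic regression, the feature names, and the toy data are all assumptions made for illustration; the kit does not state which model or features were actually used.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: company type, a 1-5 work-life balance rating,
# and whether the employee left (1) or stayed (0).
data = pd.DataFrame({
    "company_type": ["service", "product"] * 6,
    "work_life_balance": [1, 4, 2, 5, 1, 4, 2, 5, 1, 3, 2, 4],
    "attrition": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
})
X = data[["company_type", "work_life_balance"]]
y = data["attrition"]

# One-hot encode the categorical column, pass the numeric rating through,
# then fit a logistic regression classifier (an assumed model choice).
model = Pipeline([
    ("encode", ColumnTransformer(
        [("company", OneHotEncoder(handle_unknown="ignore"), ["company_type"])],
        remainder="passthrough",
    )),
    ("classify", LogisticRegression()),
])
model.fit(X, y)

# Predict for two hypothetical employees.
candidates = pd.DataFrame({
    "company_type": ["service", "product"],
    "work_life_balance": [1, 5],
})
predictions = model.predict(candidates)
print(predictions)
```

Wrapping the encoder and the classifier in a single Pipeline means the same encoding is applied at training and prediction time, which avoids mismatched columns when new employees are scored.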
Kit Solution Source
Jupyter Notebook 0 Version:Current License: No License
Natural language processing libraries
It provides a large number of algorithms to build machine learning models. It has excellent documentation that makes it easier to learn. Natural language processing helps computers communicate with humans in their own language and scales other language-related tasks.
TextBlob: simple, Pythonic text processing. Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
Python 8586 Version:0.7.0 License: Permissive (MIT)
Chatistics: 💬 Python scripts to parse Messenger, Hangouts, WhatsApp and Telegram chat logs into DataFrames.
Python 893 Version:Current License: Permissive (MIT)
WORD FREQUENCY ANALYSIS LIBRARY