The aim of this project is to classify the sentiment of a song as positive, negative, or neutral based on its lyrics.
The song lyrics, in text form, are first pre-processed using the nltk library: punctuation removal, tokenization, stop-word removal, and similar steps. A sentiment analyzer imported from nltk is then used to detect the sentiment of each song's lyrics, and the resulting sentiments are stored as labels. The dataset is then split into training and testing data. Both sets are converted to bag-of-words vectors, from which a tf-idf sparse matrix is created. sklearn's SVM model is then fitted on the training tf-idf matrix and predicts labels for the testing tf-idf matrix. When the model was evaluated with a classification report, its accuracy was found to be 82%.
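The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the toy lyrics, their labels, the small hard-coded `STOP_WORDS` set (a stand-in for nltk's stop-word list), the `preprocess` helper, the linear kernel, and the 75/25 split are all assumptions made for the example.

```python
import string

from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical toy data; the real project labels lyrics with an nltk sentiment analyzer.
lyrics = [
    "I love the sunshine, and my heart is happy!",
    "What a wonderful, joyful day to sing.",
    "Happy smiles and love in the air.",
    "Joyful hearts sing a wonderful song.",
    "I hate the rain; my heart is broken.",
    "Lonely nights and tears of pain.",
    "Broken dreams, pain and sorrow.",
    "Tears fall in the lonely, sad night.",
    "The train leaves the station at noon.",
    "A road goes through the town.",
    "The station is in the town by the road.",
    "At noon the train goes through.",
]
labels = ["positive"] * 4 + ["negative"] * 4 + ["neutral"] * 4

# Assumed stand-in for nltk's English stop-word list.
STOP_WORDS = {"the", "a", "and", "is", "in", "of", "to", "my", "i", "at", "by"}


def preprocess(text):
    """Lower-case, strip punctuation, tokenize on whitespace, drop stop words."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(tok for tok in text.split() if tok not in STOP_WORDS)


cleaned = [preprocess(t) for t in lyrics]

# Split into training and testing data (stratified so every class appears in both).
X_train, X_test, y_train, y_test = train_test_split(
    cleaned, labels, test_size=0.25, random_state=42, stratify=labels
)

# Bag-of-words counts, then tf-idf weighting; both are fitted on training data only.
bow = CountVectorizer()
tfidf = TfidfTransformer()
X_train_tfidf = tfidf.fit_transform(bow.fit_transform(X_train))
X_test_tfidf = tfidf.transform(bow.transform(X_test))

# Fit the SVM on the training matrix and predict labels for the test matrix.
clf = SVC(kernel="linear")
clf.fit(X_train_tfidf, y_train)
y_pred = clf.predict(X_test_tfidf)

print(classification_report(y_test, y_pred, zero_division=0))
```

On a dataset this small the reported scores are not meaningful; the sketch only shows how the pieces connect.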
Two songs and their lyrics were taken, and the sentiment analysis library was used to label their actual sentiments.
The SVM model was then used to predict their sentiments.
The two outcomes were similar.
To improve model accuracy
To use different classification models, such as the Naive Bayes classifier (MultinomialNB)
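As a rough idea of what trying a Naive Bayes classifier might look like, the sketch below chains sklearn's `TfidfVectorizer` with `MultinomialNB` in a pipeline. The training lyrics, labels, and the new lyric are hypothetical placeholders, and using `TfidfVectorizer` in place of separate bag-of-words and tf-idf steps is a simplification chosen for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labeled lyrics; in the project these labels come from the nltk analyzer.
train_lyrics = [
    "love and sunshine fill my happy heart",
    "joyful wonderful day full of smiles",
    "broken heart and tears of pain",
    "lonely sad nights of sorrow",
    "the train leaves the station at noon",
    "a road goes through the town",
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

# TfidfVectorizer combines the bag-of-words and tf-idf steps; MultinomialNB
# accepts the resulting non-negative sparse features directly.
nb_model = make_pipeline(TfidfVectorizer(), MultinomialNB())
nb_model.fit(train_lyrics, train_labels)

# Predict the sentiment of an unseen (hypothetical) lyric.
new_lyrics = ["tears and sorrow on a lonely road"]
nb_predictions = nb_model.predict(new_lyrics)
print(nb_predictions)
```

Because the pipeline exposes the same `fit` and `predict` interface as the SVM, the two models can be compared with the same classification report on the same train/test split.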