Sentiment analysis (also known as opinion mining) is a Natural Language Processing technique that automatically classifies a text (a review, feedback, a conversation, etc.) by polarity (positive, negative, or neutral) or by emotion (happy, sad, etc.).
Sentiment analysis helps businesses identify customer opinion about their products, brands, or services from online reviews and feedback.
In this tutorial I will cover the following points:
- What is sentiment analysis?
- Where you can apply sentiment analysis
- Types of Sentiment Analysis
- What is VADER?
- How VADER works
- Command to install VADER
- Hands on VADER sentiment analysis in Python
Where to use Sentiment analysis
Sentiment analysis supports better decision-making in many areas, for example:
Politics: In politics, sentiment analysis is used to gauge public opinion about a particular decision made by a party, or people’s interest in a party. It can also surface what people see as a party’s strengths and weaknesses, and it may help forecast which party is likely to win an election.
Business: Many companies now apply sentiment analysis to customer feedback and reviews to improve the quality of their products and services.
Types of Sentiment Analysis
There are two main types of sentiment analysis:
1. Polarity Detection: Predict whether a review is positive, negative, or neutral.
2. Emotion Detection: Predict the reviewer’s emotion while writing the review, such as sad, happy, or angry.
Many tools are available for sentiment analysis in Python. In this post I will show you how to do it with a package called VADER, concentrating only on polarity detection.
How VADER works
VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically designed to extract sentiments expressed in social media. It is fully open-sourced under the MIT License.
A lexicon is a list of words. For example, a positive lexicon is a list of positive words (like good, amaze, appreciable, etc.). Similarly, a negative lexicon is a list of negative words (like bad, abolish, ambush, etc.).
In the simplest rule-based approach, each word of your product review is looked up in those two lexicons (the positive and negative word lists), and the number of positive and negative words in the review is counted. Based on those counts you can determine whether the review is positive or negative.
VADER uses the same technique, but in a smarter way: each word in its lexicon has a valence score, and a set of rules adjusts those scores. It also includes emojis when determining sentiment. Emojis are often used to express emotion on social media, which makes VADER especially useful for social media sentiment analysis.
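The basic word-counting approach described above can be sketched in a few lines of Python. This is only an illustration of the idea, not VADER's implementation: the two tiny word sets and the function name `count_based_sentiment` are made up for this example.

```python
# Toy lexicons for illustration only; real lexicons hold thousands of words.
POSITIVE_WORDS = {"good", "great", "amazing", "love"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "hate"}


def count_based_sentiment(text):
    """Classify text by counting positive and negative lexicon hits."""
    # Lowercase and strip basic punctuation so "Good!" matches "good".
    words = [w.strip(".,!?;:\"'").lower() for w in text.split()]
    pos_count = sum(w in POSITIVE_WORDS for w in words)
    neg_count = sum(w in NEGATIVE_WORDS for w in words)
    if pos_count > neg_count:
        return "positive"
    if neg_count > pos_count:
        return "negative"
    return "neutral"


print(count_based_sentiment("The camera is great and I love the screen"))
print(count_based_sentiment("Terrible battery and poor support"))
print(count_based_sentiment("The service is not good"))
```

Note that this naive counter labels "The service is not good" as positive, because it sees "good" and ignores the negation. Cases like this are why a plain word count is not enough and why VADER adds rules on top of its lexicon.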
Command to install vaderSentiment
$ pip install vaderSentiment
VADER Sentiment Analysis in Python
Below is the code to do sentiment analysis using VADER in Python.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

# Function to print the sentiments of a sentence.
def vader_sentiment_scores(sentence):
    # Create a SentimentIntensityAnalyzer object of VADER.
    SID_obj = SentimentIntensityAnalyzer()

    # polarity_scores() returns a sentiment dictionary
    # that contains pos, neg, neu, and compound scores.
    polarity_dict = SID_obj.polarity_scores(sentence)

    print("Raw sentiment dictionary : ", polarity_dict)
    print("polarity percentage of sentence ", polarity_dict['neg']*100, "% :: Negative")
    print("polarity percentage of sentence ", polarity_dict['neu']*100, "% :: Neutral")
    print("polarity percentage of sentence ", polarity_dict['pos']*100, "% :: Positive")
    print("Overall polarity of sentence", end = " :: ")

    # Decide the overall sentiment from the compound score.
    if polarity_dict['compound'] >= 0.05:
        print("Positive")
    elif polarity_dict['compound'] <= -0.05:
        print("Negative")
    else:
        print("Neutral")
Now let’s test how VADER performs:
# Simple text
vader_sentiment_scores('I do not like flipkart')
# Emoji test (refer to image)
# Complex text
vader_sentiment_scores('Your service has never been good')
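If you want to reuse the ±0.05 thresholds on many reviews, it can help to return a label instead of printing it. The helper below is a minimal sketch of that idea. The name `label_from_compound` is hypothetical (it is not part of vaderSentiment), and the sample scores are made-up values, not real VADER output.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a polarity label."""
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"


# Made-up compound scores, used only to show the three branches.
for score in (0.6, -0.3, 0.0):
    print(score, "->", label_from_compound(score))
```

In practice you would pass in the 'compound' value from the dictionary returned by polarity_scores().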
What is compound score in VADER
Compound Score in VADER
The ‘compound’ score is calculated by summing the valence scores of each word in the lexicon, adjusting the sum according to the rules, and then normalizing the result to lie between -1 (most extreme negative) and +1 (most extreme positive). This is the most useful metric if you want a single sentiment score for a given text or sentence.
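To make the normalization step concrete, here is a small sketch. As far as I know, the open-source vaderSentiment code squashes the summed score x with x / sqrt(x² + alpha), using alpha = 15. Treat both the formula and the constant as assumptions and check them against the library source. The input values below are made up for illustration.

```python
import math


def normalize(score, alpha=15):
    """Squash an unbounded summed valence score into the range [-1, 1].

    Assumption: mirrors the formula believed to be used in vaderSentiment.
    """
    norm = score / math.sqrt(score * score + alpha)
    # Clamp as a safety net against floating-point overshoot.
    return max(-1.0, min(1.0, norm))


# Made-up summed valence scores, not real VADER output.
for raw in (0.0, 1.5, 4.0, -10.0):
    print(raw, "->", round(normalize(raw), 4))
```

The larger the absolute summed score, the closer the result gets to -1 or +1, but it never goes beyond those limits.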
Pos, Neg, Neu Score in VADER
These are the most useful metrics if you want the individual positive, negative, and neutral scores for a given sentence. Each one is the proportion of the text that falls into that category, so the three scores sum to 1 (up to rounding).
If you don’t have training data for sentiment analysis, a rule-based approach is a practical choice. You can write your own rules or simply use a tool like VADER. Other tools for sentiment analysis include Stanford CoreNLP and TextBlob.