E-mail Sentiment Analysis using VADER

Syed Suleman Qutb
2 min read · Nov 30, 2020

Since the mid-1990s, modes of correspondence and communication have evolved tremendously to include real-time chat rooms, blogs, discussion forums and a plethora of social networking sites. Despite these newer channels, e-mail is still considered a reliable platform: it is easy to use, fast, asynchronous in nature, and keeps a searchable record of correspondence. Hence, the majority of organizations and government bodies rely heavily on e-mail for their routine business communication.

Nearly 3.9 billion users worldwide use e-mail every day at home and in the office, making it an indispensable part of daily life. While it is the most common means of workplace communication, it also provides a channel for cyber-bullying, defamatory remarks and inappropriate language, all of which are detrimental to any business. Naturally, reviewing all correspondence and sorting it into categories by hand is humanly impossible. This is where Sentiment Analysis, a machine learning technique and sub-field of Natural Language Processing (NLP), plays a pivotal role in identifying and categorizing the emotion expressed within the body of an e-mail.

With this problem in view, we have implemented Sentiment Analysis of e-mails in our Splunk app using NLP Text Analytics. We first cleaned the body content with regular expressions, then used the Beautiful Soup 4 wrapper to remove HTML/XML tags, and then lemmatized the content.
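The three preprocessing steps described above (regular-expression cleanup, tag removal with Beautiful Soup, lemmatization) can be sketched roughly as follows. This is a minimal illustration, not the code used in the app. Several things here are assumptions: the article does not say which patterns the regular expressions target, so stripping URLs and e-mail addresses is a guess; `TOY_LEMMAS` is a hypothetical lookup table standing in for a real lemmatizer such as NLTK's `WordNetLemmatizer`; and the function names `strip_markup`, `lemmatize` and `preprocess` are made up for this example. If `beautifulsoup4` is not installed, the sketch falls back to the standard library's HTML parser so that it still runs.

```python
import re
from html.parser import HTMLParser

try:
    from bs4 import BeautifulSoup  # third-party package: beautifulsoup4
except ImportError:  # fall back to the standard library so the sketch still runs
    BeautifulSoup = None

# Assumed cleanup targets; the article does not list the actual patterns.
URL = re.compile(r'https?://[^\s<>"]+')
ADDRESS = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Hypothetical toy table standing in for a real lemmatizer.
TOY_LEMMAS = {"servers": "server", "were": "be", "failing": "fail"}


class _TextOnly(HTMLParser):
    """Standard-library fallback: collect text nodes and ignore tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(text):
    """Remove HTML/XML tags and return the remaining text."""
    if BeautifulSoup is not None:
        return BeautifulSoup(text, "html.parser").get_text(separator=" ")
    parser = _TextOnly()
    parser.feed(text)
    parser.close()
    return " ".join(parser.parts)


def lemmatize(tokens):
    """Map each token to its lemma when the toy table knows it."""
    return [TOY_LEMMAS.get(token.lower(), token) for token in tokens]


def preprocess(body):
    """Regex cleanup, then tag removal, then lemmatization."""
    text = ADDRESS.sub(" ", URL.sub(" ", body))   # 1. regular-expression cleanup
    text = strip_markup(text)                     # 2. drop HTML/XML tags
    tokens = re.findall(r"[A-Za-z']+", text)      # simplistic word tokenizer
    return " ".join(lemmatize(tokens))            # 3. lemmatize
```

Calling `preprocess` on the raw HTML body of a message returns a plain, space-separated string of lemmatized words that can be handed to the scoring step.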

Finally, the content is passed through the Valence Aware Dictionary and sEntiment Reasoner (VADER), which relies on a dictionary that maps lexical features to emotion intensities, commonly referred to as sentiment scores. The scores are grouped under the headings Neutral, Positive and Negative according to their semantic orientation, and together they determine the overall sentiment of an e-mail. Common Positive and Negative words according to VADER are listed below:
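To make the lexicon idea concrete, here is a minimal sketch of a dictionary-based scorer. It is not VADER itself: the words and valence values in `TOY_LEXICON` are made up for illustration and are not taken from VADER's lexicon or from the word list referred to above, and the sketch omits VADER's heuristics for negation, intensifiers, punctuation and capitalization. What it does borrow is the shape of the computation: word valences are summed, the sum is squashed into the range -1 to 1 with the normalization x / sqrt(x² + α) with α = 15, and the resulting compound score is labeled using the commonly cited thresholds of ±0.05. The function names are hypothetical.

```python
import math

# Made-up valence scores for illustration only. VADER's real lexicon holds
# thousands of human-rated entries on a scale from -4 to +4.
TOY_LEXICON = {
    "great": 3.0,
    "thanks": 2.0,
    "helpful": 2.0,
    "late": -1.0,
    "useless": -2.0,
    "terrible": -3.0,
}


def compound_score(text, alpha=15.0):
    """Sum the valences of known words and squash the total into [-1, 1]."""
    total = sum(
        TOY_LEXICON.get(word.strip(".,!?").lower(), 0.0)
        for word in text.split()
    )
    return total / math.sqrt(total * total + alpha)


def label(compound, threshold=0.05):
    """Turn a compound score into one of the three headings."""
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"
```

In practice one would more likely call the `vaderSentiment` package, whose `SentimentIntensityAnalyzer().polarity_scores(text)` returns a dictionary with `neg`, `neu`, `pos` and `compound` values, and apply the same kind of threshold to `compound`.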

The SPL query used for Sentiment Analysis in our EUNOMATIX MLDETECT app is given below:

For further details on the functionality of our ML-based detection framework and SPL queries, please contact EUNOMATIX at info@eunomatix.com.

Syed Suleman Qutb

Cybersecurity Solutions Architect @ EUNOMATIX, USA. EUNOMATIX specializes in out-of-the-box Cyber Detection & Preemption.