Table 5 Statistical features

From: Text based personality prediction from multiple social media data sources using pre-trained language model and model averaging

| Feature name | Description | Feature count |
| --- | --- | --- |
| TF-IGM | Statistical measure of how important a word is in a document, weighted by the document's class label. It was chosen based on a performance comparison between TF-IDF and TF-IGM in text classification [8]. | 100 |
| Sentiment analysis | The percentages of positive, negative, and neutral sentiment in the social media status. A polarity-based sentiment analysis approach [35] was used to extract the weights for the positive, negative, and neutral classes. | 3 |
| NRC Lexicon Database | Contains 14,000 English words and the association of each word with eight common emotions: anger, fear, anticipation, trust, surprise, sadness, joy, and disgust [15]. | 8 |
| Total statistical features | | 111 |
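
The counts in the table add up as 100 + 3 + 8 = 111. As a rough illustration of how such a vector might be assembled, the sketch below computes a TF-IGM-style term weight and concatenates the three feature groups. It assumes the commonly cited formulation, weight = tf × (1 + λ · igm), where igm is the largest class-specific document frequency divided by the rank-weighted sum of all class frequencies sorted in descending order. The value λ = 7, the function names `igm_factor` and `build_statistical_features`, and the input layout are assumptions for illustration; the table does not state which variant or parameters were used, nor how the 100 TF-IGM features were selected.

```python
import numpy as np


def igm_factor(class_doc_freq, lam=7.0):
    """Return (1 + lam * igm) for each term.

    class_doc_freq: array of shape (n_terms, n_classes) holding, for each
    term, the number of documents in each class that contain it.
    lam is an assumed scaling coefficient, not a value taken from the table.
    """
    # Sort each term's class frequencies in descending order.
    f = np.sort(np.asarray(class_doc_freq, dtype=float), axis=1)[:, ::-1]
    ranks = np.arange(1, f.shape[1] + 1)
    denom = (f * ranks).sum(axis=1)
    # Terms that occur in no class get igm = 0 instead of a division by zero.
    igm = np.divide(f[:, 0], denom, out=np.zeros(f.shape[0]), where=denom > 0)
    return 1.0 + lam * igm


def build_statistical_features(tf_igm, sentiment, emotions):
    """Concatenate the three feature groups into one 111-dimensional vector.

    tf_igm:    100 TF-IGM weights (hypothetical, precomputed per document)
    sentiment: 3 values for positive, negative, neutral
    emotions:  8 values, one per NRC emotion
    """
    tf_igm = np.asarray(tf_igm, dtype=float)
    sentiment = np.asarray(sentiment, dtype=float)
    emotions = np.asarray(emotions, dtype=float)
    if tf_igm.shape != (100,) or sentiment.shape != (3,) or emotions.shape != (8,):
        raise ValueError("expected feature groups of length 100, 3, and 8")
    return np.concatenate([tf_igm, sentiment, emotions])


if __name__ == "__main__":
    # Two classes: the first term occurs in one class only, the second evenly.
    factors = igm_factor([[10, 0], [5, 5]])
    print(factors)

    features = build_statistical_features(
        np.zeros(100), [0.5, 0.3, 0.2], np.zeros(8)
    )
    print(features.shape)
```

Under this formulation, a term concentrated in a single class receives the largest factor, while a term spread evenly across classes is weighted down, which is the class-label influence the TF-IGM row describes.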