
Table 5 Statistical features

From: Text based personality prediction from multiple social media data sources using pre-trained language model and model averaging

| Feature name | Description | Feature count |
| --- | --- | --- |
| TF-IGM | Statistical method that measures how important a word is in a document, taking the document's class label into account. It was chosen based on a performance comparison between TF-IDF and TF-IGM in text classification [8]. | 100 |
| Sentiment analysis | The percentages of positive, negative, and neutral sentiment in a social media status. The researchers used a polarity-based sentiment analysis approach [35] to extract the weights for the positive, negative, and neutral classes. | 3 |
| NRC Lexicon Database | Contains 14,000 English words and the association of each word with eight common emotions: anger, fear, anticipation, trust, surprise, sadness, joy, and disgust [15]. | 8 |
| Total statistical features | | 111 |
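
To make the feature counts concrete, the following is a minimal Python sketch of how a TF-IGM weight could be computed and how the three feature groups could be concatenated into one 111-dimensional vector. The table does not give the TF-IGM formula, so the sketch assumes the commonly cited formulation, in which a term's per-class frequencies are ranked in descending order and the weight is tf · (1 + λ · igm). The function names and the value λ = 7 are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence


def igm(class_freqs: Sequence[float]) -> float:
    """Inverse gravity moment of a term (assumed formulation).

    class_freqs holds the term's frequency in each class. The frequencies
    are ranked in descending order, and the largest one is divided by the
    rank-weighted sum of all of them.
    """
    ranked = sorted(class_freqs, reverse=True)
    moment = sum(f * rank for rank, f in enumerate(ranked, start=1))
    return ranked[0] / moment if moment > 0 else 0.0


def tf_igm(tf: float, class_freqs: Sequence[float], lam: float = 7.0) -> float:
    """TF-IGM weight; lam is an assumed scaling coefficient."""
    return tf * (1.0 + lam * igm(class_freqs))


def build_feature_vector(
    tf_igm_weights: Sequence[float],  # 100 TF-IGM features
    sentiment: Sequence[float],       # positive, negative, neutral
    emotions: Sequence[float],        # eight NRC emotion scores
) -> List[float]:
    """Concatenate the three groups into one statistical feature vector."""
    if (len(tf_igm_weights), len(sentiment), len(emotions)) != (100, 3, 8):
        raise ValueError("expected 100 TF-IGM, 3 sentiment and 8 emotion features")
    return list(tf_igm_weights) + list(sentiment) + list(emotions)
```

A term that occurs in only one class gets an igm of 1, the maximum, while a term spread evenly over the classes gets a lower value, which is how the class label influences the weight.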