Table 3 Twitter dataset distribution

From: Text based personality prediction from multiple social media data sources using pre-trained language model and model averaging

Dataset: Twitter (manually collected)

Trait                Train            Test             Validation
                     No      Yes      No      Yes      No      Yes
Openness             14600   17511    3128    3753     3129    3753
Conscientiousness    23666    8445    5072    1809     5072    1810
Extraversion          9210   22901    1974    4907     1974    4908
Agreeableness        14348   17763    3074    3807     3075    3807
Neuroticism          17712   14399    3796    3085     3796    3086
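
The per-trait counts can be checked for consistency and class balance with a few lines of Python. This is a minimal sketch: the numbers are transcribed from the table, while the names `COUNTS`, `split_totals`, and `positive_share` are illustrative and do not come from the paper.

```python
# Counts transcribed from Table 3: trait -> split -> (No, Yes).
COUNTS = {
    "Openness":          {"train": (14600, 17511), "test": (3128, 3753), "validation": (3129, 3753)},
    "Conscientiousness": {"train": (23666, 8445),  "test": (5072, 1809), "validation": (5072, 1810)},
    "Extraversion":      {"train": (9210, 22901),  "test": (1974, 4907), "validation": (1974, 4908)},
    "Agreeableness":     {"train": (14348, 17763), "test": (3074, 3807), "validation": (3075, 3807)},
    "Neuroticism":       {"train": (17712, 14399), "test": (3796, 3085), "validation": (3796, 3086)},
}


def split_totals(counts):
    """Return {trait: {split: No + Yes}} so split sizes can be compared across traits."""
    return {
        trait: {split: no + yes for split, (no, yes) in splits.items()}
        for trait, splits in counts.items()
    }


def positive_share(counts, split="train"):
    """Return {trait: fraction of samples labelled Yes} for the given split."""
    return {
        trait: splits[split][1] / sum(splits[split])
        for trait, splits in counts.items()
    }


if __name__ == "__main__":
    totals = split_totals(COUNTS)
    for trait, share in positive_share(COUNTS).items():
        print(f"{trait}: train size {totals[trait]['train']}, {share:.1%} labelled Yes")
```

Every trait sums to the same split sizes (32111 train, 6881 test, 6882 validation), which is what one would expect if each sample carries all five labels. The Yes share in the training split ranges from about 26% for Conscientiousness to about 71% for Extraversion, so the labels are noticeably imbalanced for some traits.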