Multiclass emotion prediction using heart rate and virtual reality stimuli

Emotion prediction is a method that recognizes human emotion from the subject’s physiological data. The problem in question is the limited use of heart rate (HR) as the prediction feature with common classifiers such as Support Vector Machine (SVM), K-Nearest Neighbor (KNN) and Random Forest (RF) in emotion prediction. This paper aims to investigate whether HR signals can be utilized to classify four-class emotions using Russell’s emotion model in a virtual reality (VR) environment using machine learning. An experiment was conducted using the Empatica E4 wristband to acquire the participant’s HR, a VR headset as the display device for participants to view the 360° emotional videos, and the Empatica E4 real-time application to extract and process the participant's recorded heart rate during the experiment. For intra-subject classification, all three classifiers (SVM, KNN, and RF) achieved 100% as the highest accuracy, while inter-subject classification achieved 46.7% for SVM, 42.9% for KNN and 43.3% for RF. The results demonstrate the potential of SVM, KNN and RF classifiers to use HR as a feature for emotion prediction in four distinct emotion classes in a virtual reality environment. The potential applications include interactive gaming, affective entertainment, and VR health rehabilitation.

long-term accumulation of acquired negative emotions [2]. Physiological impulses in the human brain can be derived from the autonomic nervous system (ANS) and are not knowingly or deliberately triggered [8]. Baig and Kavakli [3] discuss and examine the classification of emotions using electrocardiography (ECG) and electrodermography (EDG) signals. The "Introduction" section offers an overview of the emotion models and emotion classification methods that researchers have used in their studies. Although a previous study using electroencephalography (EEG) signals showed high classification accuracies of over 80%, emotion classification studies using EDG and ECG have also shown competitively high classification accuracies.
Valence and arousal can be classified and considered as emotions [1]. Valence represents the quality of an emotion, ranging from unpleasant (negative) to pleasant (positive), whereas arousal denotes the quantitative activation level, ranging from not aroused (low) to aroused (high). For example, happiness has positive valence and disgust has negative valence, while boredom is related to low arousal and surprise induces high arousal [7].
Combining valence and arousal forms a two-dimensional perspective. This bipolar model from Russell, known as the Circumplex Model of Emotions, is a widely used classification model [1]. Russell's two-dimensional model based on arousal and valence is presented in Fig. 1.
Russell's valence-arousal scale is arguably the most widely used model in emotion-related research. In particular, the Database for Emotion Analysis using Physiological Signals (DEAP) dataset is used in emotion classification [5]. Russell's model is good at recognizing whether an emotion is positive or negative and whether arousal is low or high within a 2D emotion space, but a problem arises when attempting to distinguish emotions within the same quadrant. For example, anger and fear both lie in the quadrant of high arousal and negative valence in the 2D model, which is why higher-dimensional emotion models were proposed [4].
The HR signal is valuable for studying physiological changes of the heart in different scenarios. It has been used widely in research on treating heart disease, epilepsy, and arrhythmia, as well as in the evaluation of psychological and mental conditions. Furthermore, HR signal information can be used to recognize emotional stress in humans. HR can therefore play an important part in studying human emotions when certain types of emotional stimuli are presented. The reliable recognition of emotions has multiple applications in neuromarketing, personalized entertainment, affective computing, virtual rehabilitation, and virtual learning. Hence, the objective of this paper is to investigate whether HR signals alone, with virtual reality (VR) as the stimulus, are sufficient as a machine learning feature for classifying emotions into four different classes.
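The combination of the two axes into four quadrants can be illustrated with a short sketch. It is not taken from the paper: the function name `quadrant`, the assumption that both axes are centred on 0, and the treatment of 0 as "positive"/"high" are illustrative choices. The quadrant abbreviations (HA/PV, HA/NV, LA/NV, LA/PV) are the ones used in the confusion matrix discussion of this paper.

```python
def quadrant(valence: float, arousal: float) -> str:
    """Map a (valence, arousal) pair to one of the four quadrant labels.

    Assumption: both values are centred on 0, so the sign of each value
    decides which half of the axis it falls on (0 counts as positive/high).
    """
    arousal_part = "HA" if arousal >= 0 else "LA"   # high vs. low arousal
    valence_part = "PV" if valence >= 0 else "NV"   # positive vs. negative valence
    return f"{arousal_part}/{valence_part}"


# Hypothetical coordinates chosen only to land in each quadrant once.
for valence, arousal in [(0.7, 0.8), (-0.6, 0.9), (-0.5, -0.7), (0.6, -0.4)]:
    print(valence, arousal, quadrant(valence, arousal))
```

Two emotions that fall in the same quadrant receive the same label under this mapping, which is the limitation of the 2D model noted in the text.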
To the best of our knowledge, there has been no attempt to investigate HR for emotion prediction in four distinct emotion classes. Next, we briefly present two related studies that utilized HR for emotion prediction, both of which predicted only two distinct emotion classes plus a neutral label. Table 1 compares the HR signals used for emotion prediction in these two studies. In [11], 25 subjects participated, and the stimuli were taken from the CEVS video dataset, which shows a range of emotions to the participants. A number of features were extracted from the original HR signals, including amplitude, slope, and information entropy, among others. Using a Gradient Boosting Decision Tree (GBDT) classifier, a best accuracy of 84% was obtained. However, this prediction was conducted for two emotions only, happy and sad, which cover only the valence axis, with one neutral label. The study in [9] included only 5 participants, who were asked to record their emotions via an Android application whenever they experienced an emotion. Their experiments showed that a best accuracy of 79% can be obtained by using the Discrete Wavelet Transform (DWT) to extract features from the original HR signals with a time window of length 180 and using SVM as the classifier. Again, the study only conducted prediction along the valence axis, predicting either positive or negative emotions (with one neutral label). Neither of these related studies presented results for four-class emotion prediction, which is the focus of this study. Moreover, neither stated whether the classification was intra-subject or inter-subject. This study clearly differentiates the two, since the choice of experimental setup has a very significant effect on classification accuracy, as will be demonstrated in the discussion of results in the experimental section.
The remainder of the paper is organized as follows. The first section introduces the aims and background of this research brief. The second section presents the methodology, including the emotional stimuli, the demography of the test group, the experiment hardware, and the experiment setup. The third section presents the analysis and results of the experiment, reviewing the findings obtained using HR as the prediction feature. Finally, the fourth section concludes this research brief.

Overall experimental flow
The experimental methodology reflects the setup procedures for the in-lab experiment. The experiment framework was prepared based on the four general steps in the signal processing activity, as shown in Fig. 2.
The experiment starts with the data acquisition process, for which a total of 20 healthy participants with no history of heart problems volunteered. During data acquisition, each participant wears a wearable device called the Empatica E4, which is used to detect and record their HR signals and is further explained in Subsection 2.3. It is paired with a virtual reality (VR) headset through which the participant views 360° videos, which are shown according to the four classes of emotions from Russell's model presented earlier in Fig. 1. The setup, hardware, and demography are further discussed in the "Experimental setup, hardware and demography" section. After the data acquisition phase is completed, the recorded data are uploaded to the cloud through the Empatica Connect smartphone application. The smartphone application is used to view, record and process the HR reading in real time.
In the pre-processing phase, the recorded data are downloaded from the cloud, and each participant's data are labeled synchronously with the timing of the videos to match the emotional responses to the HR signals. This is done for the data sets of all 20 participants. Next is the classification phase, where three classifiers were compared to determine which achieved the best performance in terms of accuracy. Two types of classification were executed during the experiment: intra-subject classification, which involves classification within a specific individual, and inter-subject classification, which classifies the combined data across all participants. The next subsection discusses the VR content selection in this experiment.
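The labeling step described above can be sketched as a lookup from the time since the video started to the quadrant shown at that moment. This is only a minimal illustration under stated assumptions: the segment boundaries in `TIMELINE`, the value of `END`, the order of the quadrants, and the choice to drop samples recorded during rest periods are all hypothetical and do not reproduce the actual timings of the experiment.

```python
from bisect import bisect_right

# Hypothetical timeline: (start second, label). "REST" marks a baseline period.
# These boundaries are placeholders, not the timings used in the experiment.
TIMELINE = [
    (0, "HA/PV"),
    (80, "REST"),
    (90, "LA/PV"),
    (170, "REST"),
    (180, "LA/NV"),
    (260, "REST"),
    (270, "HA/NV"),
]
END = 350  # hypothetical end of the stimulus, in seconds
STARTS = [start for start, _ in TIMELINE]


def label_at(t_seconds):
    """Return the label of the segment containing t_seconds, or None if outside the video."""
    if not 0 <= t_seconds < END:
        return None
    index = bisect_right(STARTS, t_seconds) - 1
    return TIMELINE[index][1]


def label_samples(hr_samples, video_start):
    """Attach a quadrant label to each (timestamp, bpm) sample.

    Samples outside the video or inside a rest period are dropped (an assumption).
    """
    labelled = []
    for timestamp, bpm in hr_samples:
        label = label_at(timestamp - video_start)
        if label is not None and label != "REST":
            labelled.append((bpm, label))
    return labelled


# Hypothetical readings: (timestamp in seconds, BPM); the video starts at t = 1000 s.
samples = [(1000.0, 72.0), (1085.0, 70.0), (1095.0, 68.0)]
print(label_samples(samples, video_start=1000.0))
```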

VR content selection
VR was used in this experimental setup to present the emotional stimuli to the subjects. The emotional content takes the form of 360° videos, which are used to evoke emotional responses from the subjects. The content chosen for each of the four quadrants previously shown in Fig. 1 evokes a specific emotion. Four short videos were presented to the subjects for each quadrant to evoke a strong and sustained response for the targeted emotion. After each completed quadrant, subjects are given a 10-s rest period in which no visual stimuli are present, which allows them to reset their emotional state to a baseline in preparation for the following quadrant. Hence, 16 videos in total, four for each emotional quadrant, were stitched and compiled together for the overall experiment. The experiment flow presented in Fig. 3 shows the video times and the quadrants shown to the subjects. The total duration of the compiled content shown to each subject is 6 min and 5 s, which includes the baseline periods for resetting the subject's emotional state.
[Fig. 2 placeholder. Steps shown: Data Acquisition, Pre-Processing, Classification, Result]
Prior to starting the experiment, the subject was given a thorough explanation of the experiment flow and, once they had understood it, was asked to sign a consent form. Next, the VR headset was mounted on the subject's head and the straps were adjusted to ensure the subject's comfort. Then, earphones were placed in the subject's ears to provide a more immersive experience.

Experimental setup, hardware and demography
A total of 20 subjects (12 males and 8 females) aged 20 to 28 years, all either working or studying, participated in this experiment. Before the experiment started, all subjects were briefed about the side effects that they might experience, such as motion sickness, headache, dizziness, and nausea, which VR usage is known to cause.
The Empatica E4 wristband is the hardware used to record the subject's heart rate. It is a medical-grade wearable device used for acquiring physiological data. Being wearable, it is non-invasive: it obtains the HR via a photoplethysmography sensor, which optically detects changes in blood volume under the skin that indicate the user's pulse. The HR is measured at a fixed frequency of 64 Hz, which is the device default and is not customizable. The device is usually placed on the right or left wrist of the subject. After the wristband, the earphones, and the VR headset are in place, the subject is seated; when the 360° video starts, the heart rate is simultaneously captured wirelessly via the E4 real-time application.
Fig. 4a shows the Empatica E4 wearable device placed on the subject's wrist, and Fig. 4b shows the Empatica E4 real-time application used during the experiment to acquire the participant's heart activity. The application enables a real-time view of the HR and EDG signals via smartphone. Four types of data can be streamed via the application: heart rate, skin activity, acceleration, and temperature. The application automatically uploads the recorded data to the server, from which they can be downloaded via the E4 Connect page. It is also used to view the session and battery level of the E4 wearable device. The subject's HR data are acquired using the application on the smartphone shown in Fig. 5 during data acquisition with the Empatica E4.
The VR headset shown in Fig. 6 was used by the participants to watch the 360° video content for each quadrant. The HTC Vive VR headset allows the subject to be completely immersed in the surroundings of the content being played. Within the 360° video environment, subjects can freely look around by changing the direction of their focus.

Average BPM, max, and min
After the completion of the experiment, the data acquired through the real-time application from the Empatica E4 wearable were transferred to the cloud, from which they can be downloaded in Comma-Separated Values (CSV) format for analysis. The data of the 20 subjects acquired from the experiment are shown in Fig. 7, which presents the average, maximum, and minimum heart rate for each individual subject (intra-subject). The lowest heart rate, 54.6 beats per minute (BPM), came from subject 3, and the highest, 110.6 BPM, came from subject 18, while the overall average across the 20 subjects was 76.1 BPM.
Figure 8 shows the maximum, minimum, and average BPM of all 20 subjects' data collectively. The average for the 20 subjects is 76.1 BPM, while the maximum is 110.6 BPM and the minimum is 54.6 BPM.
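The per-subject and collective statistics reported above can be computed as in the following sketch. The readings are hypothetical, the helper name `bpm_summary` is illustrative, and the collective average is assumed here to be taken over all pooled samples; the paper does not state whether it was computed that way or as a mean of per-subject means.

```python
from statistics import mean


def bpm_summary(bpm_values):
    """Return the average, maximum, and minimum of a list of BPM readings."""
    return {"mean": mean(bpm_values), "max": max(bpm_values), "min": min(bpm_values)}


# Hypothetical readings for two subjects (not data from the experiment).
readings = {
    "S01": [72.0, 75.5, 80.1],
    "S02": [60.2, 64.8, 61.0],
}

# Intra-subject view: one summary per subject.
per_subject = {subject: bpm_summary(values) for subject, values in readings.items()}

# Collective view: pool every reading before summarising (an assumption).
pooled = [value for values in readings.values() for value in values]
overall = bpm_summary(pooled)

print(per_subject)
print(overall)
```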

SVM, KNN and Random Forest classifiers on individual and overall-subject classification
Intra-subject classification means that the data collected from one subject are used for both training and testing, whereas inter-subject classification means that the data collected from all 20 subjects are pooled and used for training and testing across all subjects. Python was used to classify the intra-subject data individually, while the inter-subject data from the 20 subjects were grouped into a single CSV file. The experiment used ten-fold cross-validation.
The input feature for the machine learning classifiers is the HR data collected from the subjects, in an attempt to use HR alone as the feature for classifying emotions into four classes. SVM, RF, and KNN were implemented in Python to recognize emotions from the heart rate activity recorded as the subjects' direct response to the 360° video stimuli shown in the experiment.
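The difference between the two evaluation setups can be sketched with a small, dependency-free example. It is not the implementation used in the experiment: it uses a plain-Python KNN on a single HR value as a stand-in for the three classifiers, and the synthetic subjects, the helper names, `k=3`, and the fixed `seed` are all assumptions made for illustration.

```python
import random
from collections import Counter

LABELS = ["HA/PV", "LA/PV", "LA/NV", "HA/NV"]


def knn_predict(train, x, k=3):
    """Predict the label of HR value x by majority vote among its k nearest training samples."""
    nearest = sorted(train, key=lambda sample: abs(sample[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(samples, n_folds=10, k=3, seed=0):
    """Mean accuracy (%) over n_folds folds for a list of (bpm, label) samples."""
    if len(samples) < n_folds:
        raise ValueError("need at least one sample per fold")
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    accuracies = []
    for i, test_fold in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        correct = sum(knn_predict(train, bpm, k) == label for bpm, label in test_fold)
        accuracies.append(100.0 * correct / len(test_fold))
    return sum(accuracies) / len(accuracies)


def synthetic_subject(baseline, n_per_class=10):
    """Hypothetical subject whose HR shifts by 5 BPM per class around a personal baseline."""
    return [(baseline + 5 * c + 0.1 * i, LABELS[c])
            for c in range(4) for i in range(n_per_class)]


subjects = {"S01": synthetic_subject(60.0), "S02": synthetic_subject(70.0)}

# Intra-subject: train and test within each subject's own data.
intra = {name: cross_val_accuracy(data) for name, data in subjects.items()}

# Inter-subject: pool all subjects, then train and test across the pooled data.
pooled = [sample for data in subjects.values() for sample in data]
inter = cross_val_accuracy(pooled)

print(intra, inter)
```

In this synthetic data each subject's classes are well separated, but the two subjects have different baselines, so the same HR value can carry different labels once the data are pooled. This mirrors the reason inter-subject classification is harder than intra-subject classification.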
The classification accuracy of the three classifiers (SVM, KNN, and RF) is calculated as the number of correctly classified samples divided by the total number of classified samples, expressed as a percentage. The inter-subject and intra-subject accuracy results of the experiments using the SVM, KNN, and RF classifiers are presented in the following figures. Figure 9 shows the accuracy results of the SVM, KNN, and RF classifiers for intra-subject classification. SVM's accuracy ranges from a lowest of 45.4% for participant 16 to a highest of 100% for participant 17; KNN's accuracy ranges from a lowest of 54.5% for participant 20 to a highest of 100% for participant 17; and RF's accuracy ranges from a lowest of 36.3% for participant 4 to a highest of 100% for participant 17. All three classifiers achieved 100% accuracy for participant 17, while the lowest accuracy overall was 36.3%, obtained by the RF classifier for participant 4.
The inter-subject classification results using SVM, KNN, and Random Forest are shown in Fig. 10. Inter-subject classification generated accuracies of 46.7% for SVM, 42.9% for KNN and 43.3% for Random Forest, which shows that inter-subject classification of emotions using HR is significantly harder than intra-subject classification. This is to be expected, since HR varies considerably across different individuals when they are exposed to emotional stimuli.

Confusion matrix analysis for inter-subject classification
Table 2 below shows the confusion matrix for inter-subject classification over the 20 participants across the four emotional quadrants. The principal diagonal represents the percentage of successful recognition for each class.
The result shows the SVM predictions for the four emotion classes: low arousal/positive valence (LA/PV), low arousal/negative valence (LA/NV), high arousal/negative valence (HA/NV) and high arousal/positive valence (HA/PV). Table 2 shows that, across participants, the most difficult quadrant to predict was LA/PV at only 40%, while the most successful prediction came from the HA/NV quadrant at 96%. Overall, it also appears that classifying negative valence emotions was easier than classifying positive valence emotions.
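A confusion matrix whose principal diagonal gives the per-class recognition percentage can be built as in the sketch below. The labels and predictions are hypothetical, and the layout (rows as true classes, columns as predicted classes, each row normalised to 100%) is an assumption, since the paper does not specify how Table 2 was normalised.

```python
LABELS = ["LA/PV", "LA/NV", "HA/NV", "HA/PV"]


def confusion_matrix(true_labels, predicted_labels, labels=LABELS):
    """Count matrix with rows = true class and columns = predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, prediction in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[prediction]] += 1
    return matrix


def row_percentages(matrix):
    """Convert counts to row-wise percentages; the diagonal is the per-class recognition rate."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([100.0 * count / total if total else 0.0 for count in row])
    return result


def accuracy(matrix):
    """Overall accuracy (%): correctly classified samples over all classified samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


# Hypothetical true and predicted labels (not results from the experiment).
true_labels = ["LA/PV", "LA/PV", "LA/NV", "HA/NV", "HA/NV", "HA/PV"]
predicted = ["LA/PV", "LA/NV", "LA/NV", "HA/NV", "HA/NV", "HA/NV"]

counts = confusion_matrix(true_labels, predicted)
percentages = row_percentages(counts)
for label, row in zip(LABELS, percentages):
    print(label, [round(value, 1) for value in row])
print("accuracy:", round(accuracy(counts), 1))
```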

Results comparison against related studies
Table 3 compares the results obtained using HR signals.
The table lists the results of other studies alongside the result obtained from this experiment. The highest accuracy among the other studies was 84%, while this experiment achieved between 46.7% and 100%, depending on whether the task was treated as an intra-subject or an inter-subject classification problem. As previously explained, neither of the other two published studies explained whether their experimental protocol addressed intra-subject or inter-subject classification. Furthermore, both of these studies classified only two distinct emotions along the valence axis, which is clearly a far simpler classification task than our more challenging and arguably more useful approach of classifying into four distinct classes across both the valence and arousal dimensions of Russell's model of emotions.

Conclusion
The primary objective of this paper was to investigate whether HR signals could be used as the sole feature for classifying human emotions into four distinct quadrants according to Russell's Circumplex Model of Emotions when emotional stimuli are presented to participants via a VR environment. The experimental results show that four-class intra-subject emotion classification yielded accuracies ranging from 45.4% to 100% for SVM, 54.5% to 100% for KNN, and 36.3% to 100% for RF, while four-class inter-subject emotion classification yielded accuracies of 46.7% for SVM, 42.9% for KNN, and 43.3% for RF. These results show that HR

Fig. 1 Valence and arousal in Russell's two-dimensional model

Fig. 3 Overall flow of the VR presentation used to evoke the subject's emotional responses

Fig. 4 a The Empatica E4 worn on the right wrist; b the first-party Empatica E4 real-time application

Fig. 6 Subject with the HTC Vive VR headset mounted

Fig. 7 Average, maximum, and minimum heart rate of each subject

Fig. 8 Maximum, average, and minimum HR activity for all subjects collectively