RPf-GCNs: reciprocal perspective driven fused GCNs for rumor detection on social media

The early detection of rumors on social media is the need of the hour in today's global village. Users are seamlessly connected in an unstructured network, which leads to a rapid flow of information. Users with malign intent may share defamatory content to contribute to fifth-generation media warfare. The ingress of such content into society can result in panic, uncertainty, and demoralization of the public. Because of the huge volume of content on social platforms, detecting malicious content is hard. Earlier research focuses on content profiling and the flow of information; however, the reciprocal perspective of the source post and the content that follows it is missing. In this research, a novel Reciprocal Perspective fused Graph Convolutional Neural Network (RPf-GCN) is proposed. The proposed framework incorporates twin GCNs to encode both the bottom-up and top-down perspectives, enhancing the understanding of rumor propagation. Moreover, a convolutional operation is employed to fuse the reciprocal perspectives, providing a holistic view of the conversations. To validate the efficacy of the proposed framework, we conducted a series of experiments on real-world datasets, namely PHEME and SemEval. The experiments show that the proposed framework outperforms various baselines on two evaluation metrics: Macro-F1 (0.736 for PHEME, 0.461 for SemEval) and Accuracy (0.748 for PHEME, 0.658 for SemEval).


Introduction
Content available on social media platforms can effectively exploit the opinions of readers. Such content can be cleverly manipulated to spread propaganda, rumors, and other misinformation. The spread of such propagandist content in society can lead to fear, uncertainty, panic, or even financial losses in trading markets. Psychological research shows that human beings are only 55-58% capable of identifying malicious content [1], which is a clear indicator of how easily the public can be deceived. Detection of such content can protect society from being jeopardized by such misinformation campaigns. Rumors on social media can be classified as true, false, or unverified based on the authenticity of the facts presented [2][3][4][5]. Social media platforms lack an effective verification mechanism for the content shared by users, allowing them to spread defamatory content without hindrance. Thus, devising an automated framework for the early detection of such content is the dire need of the hour in this digital era.
Typically, users start commenting on a post made by a source user. Their responses, in the form of comments, show their consent, emotions, and viewpoints. Such comments or retweeted posts lead to conversation threads of different lengths, depending on the users involved in the comments. A rumor is called false (true) when the veracity of its claim is false (true) [4]. Any conversation thread consists of the root node (source post) and the threads (comments) linked to the root node. The datasets show that false rumors exhibit shallow propagation, whereas true rumors show longer multi-branch and multi-point propagation, as shown in Fig. 1. False rumors need to be attractive enough to grab people's attention; thus, they are more likely to break out at the roots. True rumors, on the other hand, have no intention of spreading and therefore show a scattered pattern of spreading.
Existing studies have analyzed the sequential, structural, and temporal aspects of conversation threads under the assumption that such threads are tree-structured or undirected graphs. However, the direction of conversation threads is often ignored. The direction of a conversation carries patterns of rumor content flow and comprehensiveness. The source node and follower nodes are related from two different views, i.e., top-down and bottom-up. The top-down perspective involves rumors flowing from the source post to followers' comments, while the bottom-up perspective involves the opposite. Therefore, it is necessary to exploit both views to synchronously encode both top-down and bottom-up flows.
This research proposes a reciprocal perspective-aware graph convolutional neural network (GCN). The inspiration for the proposed framework is traced back to the field of computer vision, where a colored image is a seamless fusion of its three channels, with each channel representing an individual perspective of the image. By analogy, each perspective of a conversation thread is treated as one channel of the thread. The proposed framework consists of twin GCNs that encode both the bottom-up and top-down perspectives. To effectively fuse these views, a convolutional operation is employed to capture the reciprocal perspective. The key contribution of this research is a reciprocal perspective-driven GCN that effectively learns and fuses the reciprocal perspectives of a conversation. Moreover, a series of experiments performed on two real-world datasets, PHEME and SemEval, shows that the proposed framework outperforms various baselines.
The remainder of the paper is structured as follows: literature review (Sect. Literature review), formulation of the problem statement (Sect. Problem statement formulation), detailed methodology (Sect. Proposed framework), experiments and results (Sect. Experimentation results), and conclusions followed by suggested future work (Sect. Conclusion).

Literature review
Online social media (OSM) is an ocean of information in the form of users and the content shared by them. The presence of fake news, rumors, and propagandist content on OSM platforms is no surprise in today's era of global digitization. Thus, detecting such malicious content is considered prudent to safeguard our society against the possible spread of panic, fear, or even economic loss. For years, academic researchers have been focusing on this domain, and ample research work can be found in the literature.
Conventional techniques for identifying rumors typically involve extracting features from text, user profiles, and retweet propagation [31][32][33][34][35][36]. Ma et al. [13] utilized time series models to capture social context changes and kernel methods to create tree structures to represent propagation patterns. Nonetheless, these methods require significant feature engineering, which is both laborious and constrained.
To detect rumors or propaganda content, researchers have explored various domains with the aim of testing the performance of their frameworks. Ozbay and Alatas [8] implemented fake news detection in two steps: they first converted the unstructured data to a structured format and then applied various supervised learning algorithms using text mining methods. Kaliyar et al. [9] proposed a convolutional neural network based framework that automatically differentiates between rumorous and normal content on social media. Their deep neural network based approach detects fake news from the shared content, its context, and related temporal information. Faustini et al. [10] suggested a framework for propaganda detection that is independent of the source platform and language. It is entirely based on text feature extraction techniques. The framework was examined on three different language groups with optimal results. Four learning-based algorithms, namely random forest, KNN, SVM, and Naive Bayes, were implemented, and the best results were obtained by random forest and SVM. Wang et al. [11] proposed a GNN model for early detection of fake news that is based on an enhanced representation of textual content. The representation is achieved by integrating semantic relations and the sequential ordering of the textual content. Liu and Wu [12] implemented deep learning based feature extraction, a CNN classifier, and a learning framework to detect fake news at an early stage, with the aim of stopping its spread on social media. They used datasets generated from Twitter and Weibo.
The use of deep neural networks for rumor detection has been investigated by researchers. In particular, Ma et al. [14] used an RNN to capture the sequential representations of textual posts over time intervals. Liu et al. [15] combined an RNN and a CNN to extract user profile features and deduce the veracity of posts. Lu et al. [16] suggested a hybrid model that incorporates user profiles and source tweets. Yu et al. [24] employed a hierarchical transformer framework to learn local and global interactions among shorter subthreads of longer conversation threads. However, these methods do not take into account the structural characteristics of conversation threads depicted in Fig. 1, which provide insights into how posts spread on social media.
Ma et al. [4] introduced a model based on Recursive Neural Network (RvNN) that employs deep learning techniques to capture significant patterns from textual content and propagation structures. The model acquires latent representations of tweets within propagation trees through learning. Likewise, in their work, Lin et al. [25] employed undirected graph neural networks alongside multiple attention mechanisms to improve the learning of representations for individual posts and their interactions. However, considering the diverse relationships between nodes, the methods used encoded tree-structured graphs with only a single edge type. Consequently, in order to address this concern, Bian et al. [6] directed their attention towards both the top-down and bottom-up propagation relationships among nodes. Building upon Bian's work, Wei et al. [7] made further enhancements by eliminating unreliable relationships between nodes within rumor conversation threads. However, despite these advancements, these methods still face a limitation in effectively integrating multiple reciprocal views within rumor conversations to distinguish between false and true rumors from a global perspective.
Graph Neural Networks have gained popularity in recent years due to their ability to learn representations of structured data with high performance. They have been applied in various tasks, such as text classification [26] and recommendation systems [27]. Representative examples of graph neural networks include GCN [28] and GAT [29]. The authors of [37] utilized GCNs to highlight the ubiquitous presence of social circles in online social networks and their potential to reveal users' behavioral preferences. Drawing upon insights from information diffusion studies, they explored the substantial impact of social circles on the dynamics of rumor propagation, including its speed, reach, and content. Lin et al. [38] introduced a zero-shot framework to identify rumors spanning diverse domains and languages. The approach begins by representing social media rumors as a collection of diverse propagation threads. Using a GCN, it incorporates domain-invariant structural features extracted from the propagation threads, which involves capturing structural position representations within influential community responses. The article [39] introduces a rumor detection model named "graph contrastive learning with feature augmentation" (FAGCL). This model aims to enhance rumor detection by introducing noise into the feature space and facilitating contrastive learning through the construction of asymmetric structures. FAGCL starts by using user preferences and news embeddings as the initial features of the rumor propagation tree. It then employs a graph attention network to iteratively update node representations. Sun et al. [40] introduced an approach called the "Knowledge-guided Dual-consistency Network" to detect rumors that incorporate multimedia content; it focuses on capturing inconsistencies at two distinct levels: the cross-modal level and the content-knowledge level. It enables the robust learning of multi-modal representations, even when visual modality information is missing. To facilitate this, a unique token is introduced to differentiate between posts that contain visual modality and those that do not.
These methods are designed for single-view network representation, whereas in reality, there exist multi or reciprocal view networks, where each view corresponds to a different perspective of the conversation thread. Consequently, considerable research efforts [30] have been devoted to the exploration of multi-view graph learning, with a specific focus on integrating node representations from each view into a comprehensive global node representation. In contrast, the current study investigates the fusion of features from reciprocal perspective graphs into a unified graph feature representation vector, aiming to detect rumors.

Problem statement formulation
The rumor detection task can be formulated as follows. Social media is full of conversation threads, which can be represented as $\mathcal{T} = \{T_1, T_2, \ldots, T_p\}$, where $T_i$ represents the $i$-th conversation thread and $p$ is the total number of threads in the dataset. Each $T_i$ is composed of a source post $s_i$ and various responses $r^i_j$, with $1 \le j \le n_i - 1$. Thus, the overall structure of the $i$-th thread can be written as $T_i = \{s_i, r^i_1, \ldots, r^i_{n_i - 1}, G_i\}$. Here the term $G_i = \{N_i, E_i\}$ is a tree structure formed by the source post and its responses, wherein $N_i$ is the set of nodes (posts) and $E_i = \{e^i_{jk}\}$ is the set of edges connecting nodes $j$ and $k$ that are part of the conversation thread. In this article, the rumor detection problem is tackled as learning a supervised mapping function $f: T_i \rightarrow L$, where $T_i$ is a conversation thread with label $L$.

Proposed framework
The proposed framework is based upon reciprocal perspective-driven GCNs that classify the source post as either "rumorous" or not. Graph Convolutional Neural Networks (GCNs) have emerged as a powerful tool for analyzing structured data represented in the form of graphs or networks. Initially developed for tasks related to semi-supervised node classification and link prediction, GCNs have found application in various domains, including social network analysis, recommendation systems, and, more recently, rumor detection in social media.
GCNs build upon the concept of convolutional neural networks, originally designed for regular grids such as images, and extend it to irregular data structures like graphs. The core idea is to learn node representations by aggregating information from neighboring nodes, enabling the network to capture complex relationships within the graph.
In a typical GCN, a node's representation is updated by combining features from its neighbors. This process can be mathematically defined through a propagation rule that considers both the node's own features and its adjacent nodes' features. The depth of these layers or iterations of propagation can be adjusted to control the model's capacity and ability to capture higher-order dependencies. The success of GCNs in graph-related tasks stems from their capacity to capture local and global information efficiently. This is especially relevant when analyzing social media data, where conversations and information propagation occur in a complex network structure.
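As a rough illustration of the propagation rule described above, the following sketch implements one graph-convolution step of the form ReLU(Â H W) in plain Python lists, so that it runs without third-party libraries. Adding self-loops and dividing each row by its degree is a common convention assumed here so that a node's own features take part in the update; it is not a detail stated in this paper. The function names and the toy graph are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_with_self_loops(adj):
    """Add self-loops, then divide every row by its degree (assumed convention)."""
    n = len(adj)
    normalized = []
    for i in range(n):
        row = [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        degree = sum(row)
        normalized.append([v / degree for v in row])
    return normalized


def gcn_layer(adj, features, weights):
    """One propagation step: ReLU(A_hat @ features @ weights)."""
    a_hat = normalize_with_self_loops(adj)
    z = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, v) for v in row] for row in z]


# Toy graph: node 0 is linked to nodes 1 and 2 (undirected for this illustration).
adj = [[0, 1, 1],
       [1, 0, 0],
       [1, 0, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # one 2-d feature vector per node
weights = [[1.0, -1.0], [0.5, 2.0]]               # hand-picked, untrained weights
hidden = gcn_layer(adj, features, weights)
```

Stacking two such calls, with the output of the first used as the `features` of the second, lets each node see information from nodes two hops away, which is the sense in which depth controls higher-order dependencies.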
An overview of the proposed framework is depicted in Fig. 2. It learns features from the reciprocal view of both the source and response posts. The proposed model undertakes three sequential tasks: reciprocal perspective embedding generation, fusion, and content classification. We now describe how the proposed framework determines the veracity of a source post $s_i$ in a conversation thread $T_i$. For the sake of simplicity, the subscript $i$ is dropped in the following paragraphs.
Consider the graph $G = \{N, E\}$ of any conversation thread. The graph can be represented by its adjacency matrix $A_{p\to q} \in \mathbb{R}^{n \times n}$, where $A_{p\to q} = 1$ if node $p$ has responded or commented to the post of node $q$. The point to note is that the adjacency matrix $A_{p\to q}$ reflects only the view of node $p$, who has responded to node $q$, and leaves out the stance of node $q$ towards node $p$. The proposed methodology attempts to fill this gap by including the reciprocal perspective of both nodes towards each other. The inclusion of the reciprocal perspective in our model is motivated by the need for a more comprehensive understanding of social media rumors. By analyzing conversations from both the source and follower viewpoints, we gain a holistic view of how rumors propagate and evolve. This perspective-driven approach allows us to capture a broader range of indicators and enhances the model's performance in detecting malicious content. It provides a more robust and nuanced analysis of rumor dynamics, making it a valuable addition to the proposed framework.
Fig. 2 Proposed framework
Thus, the adjacency matrix $A_{p\to q}$ can be assumed to hold the descending perspective of the graph, from parent to child nodes. Similarly, another adjacency matrix $A_{q\to p}$ can be constructed from the same graph $G = \{N, E\}$, holding the ascending perspective of the same graph, from child to parent nodes. Following earlier researchers, we randomly drop a percentage of the edges of $G$ [20] to avoid overfitting of the proposed model. Both the adjacency matrix of the descending perspective $A_{p\to q}$ and that of the ascending perspective $A_{q\to p}$ share the same feature matrix $F$ for each node. The feature matrix for each node is prepared using the top-3000 word embeddings generated by a bi-LSTM. We use a bi-LSTM because it generates the textual embedding while considering the text in both directions. Subsequently, the node embeddings are updated using two concurrent graph convolutional neural networks (GCNs). The GCN's convolutional operations in each layer can be represented by Eqs. 1 and 2.
$$H^{(l+1)}_{p\to q} = \sigma\left(A_{p\to q} F W^{(l)}\right) \tag{1}$$
$$H^{(l+2)}_{p\to q} = \sigma\left(A_{p\to q} H^{(l+1)}_{p\to q} W^{(l+1)}\right) \tag{2}$$
Here $\sigma$ is the non-linear activation function, typically ReLU, $A_{p\to q}$ is the $(n \times n)$ adjacency matrix, $F$ is the feature matrix, and $H^{(l+1)}_{p\to q}$ and $H^{(l+2)}_{p\to q} \in \mathbb{R}^{n \times D_2}$ are the hidden feature representations at layers $l+1$ and $l+2$ respectively. Lastly, $W^{(l)} \in \mathbb{R}^{D_1 \times D_2}$ and $W^{(l+1)} \in \mathbb{R}^{D_1 \times D_2}$ are the trainable weight matrices of layers $l$ and $l+1$ respectively. Similar to the descending perspective, the ascending perspective calculations can also be replicated by Eqs. 1 and 2, which results in $H^{(l+2)}_{q\to p}$. After obtaining these reciprocal perspectives of the conversation threads for each node, we combine them for further processing. As discussed earlier, the proposed framework is inspired by computer vision, and we consider these reciprocal perspectives as two channels of an RGB image, with each node corresponding to a pixel. Therefore, the feature representation of conversation threads can be expressed in the form:
$$H = \left[H^{(l+2)}_{p\to q};\ H^{(l+2)}_{q\to p}\right] \tag{3}$$
Influenced by the convolution operations on image channels in computer vision, $H$ in (3) can be considered as a bi-channel input to a convolution operation that involves various filters $W_f$. The convolution operation is applied to a window of $y$ nodes, which generates feature maps according to the following operation:
$$\partial_{f,j} = \sigma\left(W_f \otimes H_{j:j+y-1} + b\right) \tag{4}$$
where $W_f$ and $b$ are optimized parameters, $\otimes$ indicates the bi-channel convolutional operation, and $\sigma$ is the ReLU activation function applied to the output. Thereafter, the following feature map is obtained:
$$\partial_f = \left[\partial_{f,1}, \partial_{f,2}, \ldots, \partial_{f,n-y+1}\right] \tag{5}$$
A max-over pooling operation, $\max(\partial_f)$, is applied to capture the most important feature, i.e., the highest value in the feature map: $\hat{\partial}_f = \max(\partial_f)$. The maximum feature values produced by all filters are concatenated, and the final representation of the conversation thread is obtained as shown in Eq. 6, where $F$ is the total number of filters applied.
$$O = \left[\hat{\partial}_1, \hat{\partial}_2, \ldots, \hat{\partial}_F\right] \tag{6}$$
After obtaining the final feature representation of the conversation thread, the feature map is fed into a fully connected layer with a softmax activation function. The purpose of this layer is to predict the probability that the source post is a rumor or not. The functionality is represented as:
$$\hat{y}_i = \mathrm{softmax}\left(\mathrm{FC}(O_i)\right)$$
The aim of the loss function used during the training of the proposed framework is to minimize the cross-entropy between the predicted and ground-truth values:
$$\mathcal{L} = -\sum_{i} y_i \log \hat{y}_i$$
where $y_i$ is the ground-truth label and $O_i$ is the feature representation of the source post $s_i$ present in conversation thread $T_i$.
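The steps of this section can be sketched end to end on a toy thread. The code below is a minimal illustration in plain Python, not the authors' implementation: it builds the descending and ascending adjacency matrices (optionally dropping edges at random), applies one layer per view in the unnormalized form of Eq. 1, fuses the two views with a bi-channel convolution followed by max-over pooling, and passes the result through a softmax. It assumes that the same surviving edges are shared by both views and that the descending matrix stores a 1 at [parent][child]; all function names, weights, and feature values are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def reciprocal_adjacency(parents, drop_rate=0.0, seed=0):
    """Return (descending, ascending) adjacency matrices of one thread.

    parents[i] is the index of the post that post i replies to (None for the source).
    Assumption: edges are dropped once and both views share the surviving edges.
    """
    rng = random.Random(seed)
    n = len(parents)
    descending = [[0.0] * n for _ in range(n)]
    for child, parent in enumerate(parents):
        if parent is not None and rng.random() >= drop_rate:
            descending[parent][child] = 1.0
    ascending = [list(col) for col in zip(*descending)]  # transpose
    return descending, ascending


def gcn_view(adj, features, weights):
    """One layer in the form of Eq. 1: ReLU(A F W), without normalization."""
    return [[max(0.0, v) for v in row] for row in matmul(matmul(adj, features), weights)]


def fuse_views(h_desc, h_asc, filters, biases, window):
    """Bi-channel convolution over windows of `window` nodes, then max-over pooling.

    filters[f][c][k] is the weight vector of filter f for channel c at offset k.
    Requires window <= number of nodes.
    """
    channels = (h_desc, h_asc)
    pooled = []
    for w_f, b in zip(filters, biases):
        feature_map = []
        for start in range(len(h_desc) - window + 1):
            total = b
            for c, h in enumerate(channels):
                for k in range(window):
                    total += sum(w * x for w, x in zip(w_f[c][k], h[start + k]))
            feature_map.append(max(0.0, total))   # ReLU
        pooled.append(max(feature_map))           # keep the strongest response
    return pooled


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy thread: post 0 is the source, posts 1 and 2 reply to it, post 3 replies to post 1.
parents = [None, 0, 0, 1]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
a_desc, a_asc = reciprocal_adjacency(parents)
h_desc = gcn_view(a_desc, features, identity)
h_asc = gcn_view(a_asc, features, identity)
filters = [
    [[[1.0, 1.0]], [[0.0, 0.0]]],   # filter 0 reads only the descending channel
    [[[0.0, 0.0]], [[1.0, 1.0]]],   # filter 1 reads only the ascending channel
]
thread_vector = fuse_views(h_desc, h_asc, filters, biases=[0.0, 0.0], window=1)
fc_weights = [[0.5, -0.5], [-0.5, 0.5]]   # untrained stand-in for the FC layer
logits = [sum(w * o for w, o in zip(row, thread_vector)) for row in fc_weights]
probs = softmax(logits)
```

Without self-loops or normalization, leaf posts receive an all-zero row in the descending view and the source post receives one in the ascending view; a practical implementation would likely normalize the adjacency matrices, which this sketch omits to stay close to Eq. 1 as written.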

Experimentation results
This section elaborates on the experimental setup and results obtained from benchmark datasets, namely PHEME and SemEval. It also includes a comparative analysis of the proposed framework against its baselines. Furthermore, we conducted various ablation studies exploring the effects of different factors on the proposed framework.

Experimentation details
The proposed framework was implemented on a desktop system running Ubuntu 18.04 LTS (Bionic), with 16 GB of RAM, an AMD Ryzen 7 3700X 8-core processor, and an NVIDIA GeForce RTX 3080 Ti GPU. During the training process, we set the dropout rate to 0.5, $F$ to 64, and $D_1$ and $D_2$ to 64 as well. Additionally, the learning rate was set to $1 \times 10^{-4}$, and the batch size was set to 64. The framework was trained for 100 epochs with L2-regularization and a weight penalty of 0.001. The Adam optimizer was used during the training process.
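For reproducibility, the reported training settings can be gathered in one place. The sketch below records them in a small Python dataclass; the values are those stated above, while the class and field names are hypothetical and do not come from the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters reported in this section (field names are illustrative)."""
    dropout: float = 0.5
    num_filters: int = 64          # F, number of convolution filters
    hidden_dim_1: int = 64         # D1
    hidden_dim_2: int = 64         # D2
    learning_rate: float = 1e-4
    batch_size: int = 64
    epochs: int = 100
    weight_penalty: float = 1e-3   # L2-regularization weight
    optimizer: str = "adam"


config = TrainingConfig()
```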

Dataset description
In order to examine the effectiveness and validate the performance of the proposed model, two benchmark datasets were selected. The main reason for selecting these datasets is that they are versatile in nature and contain the requisite details of users and responses to source posts. Such details are primarily required to create conversation threads with a reciprocal effect between source and responses. The statistics of both benchmark datasets, post-preprocessing, are given in Table 1.
PHEME is a dataset of rumors and non-rumors, consisting of nine real-time incidents that occurred between 2012 and 2015. The original incidents comprise tweets from a source user, to which various followers responded. The tweets are provided in a JSON file with 19 features corresponding to each tweet. To avoid over-fitting and ensure the convergence of our proposed model for robust outcomes, we used k-fold cross-validation. In this method, k-1 folds are used for training, while 1 fold is used for testing. For the PHEME dataset, we set k to 9. This led us to use one event of the PHEME dataset for testing, while the remaining events were used for training the model. Similarly, SemEval has tweets covering 10 events in 325 conversation threads. Following the same methodology as for the PHEME dataset, we set k to 10, but this time, we used 2 events for testing and the remaining 8 for training the model. Upon a detailed analysis of both datasets, we concluded that both have an issue of class imbalance. Therefore, to have a fair analysis and a wholesome comparison, we chose the Macro-F1 score and accuracy as evaluation metrics.
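The evaluation protocol described above can be sketched in a few lines. The code below shows an event-wise split in which each fold holds out one event (the PHEME setting) and plain-Python versions of accuracy and Macro-F1; it is a minimal illustration under these assumptions, with hypothetical function names and toy labels rather than data from either dataset.

```python
def event_folds(events):
    """Yield (held_out_event, train_indices, test_indices), one fold per event."""
    for held_out in sorted(set(events)):
        train = [i for i, e in enumerate(events) if e != held_out]
        test = [i for i, e in enumerate(events) if e == held_out]
        yield held_out, train, test


def accuracy(y_true, y_pred):
    """Fraction of threads whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, imbalanced example: a classifier that always predicts the majority class.
y_true = ["rumor", "rumor", "rumor", "non-rumor"]
y_pred = ["rumor", "rumor", "rumor", "rumor"]
acc = accuracy(y_true, y_pred)       # 0.75
f1 = macro_f1(y_true, y_pred)        # (6/7 + 0) / 2 = 3/7

# Toy event labels, one per thread.
folds = list(event_folds(["a", "a", "b", "c"]))
```

In the toy example, accuracy stays at 0.75 even though the minority class is never predicted, while Macro-F1 falls to about 0.43. This is why Macro-F1 is the more informative metric when classes are imbalanced.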

Comparative analysis
In this section, we explore the primary reasons for the outperformance of the proposed model. To conduct a comparative analysis with other state-of-the-art baselines, we have selected the following frameworks proposed in the literature:
• BranchLSTM [21]: It considers successive branches in a discussion thread and utilizes an LSTM-based framework for classifying the stance of rumors.
• TD-RvNN [4]: A recursive neural network framework driven by top-down propagation, used for rumor detection on social media.
• Hierarchical GCN-RNN [22]: A combination of graph convolutional and recurrent neural networks that leverages the sequential and structural properties of conversational threads.
• PLAN [23]: A model based on a randomly initialized transformer, used to encode conversational threads for rumor detection.
• Hierarchical Transformer [24]: An extended BERT-based framework that learns the sub-thread interactions and then encodes the global interactions of all posts. The model captures these interactions with a Transformer layer.
• Bi-GCN [6]: A GCN-based model that formulates high-level representations on the basis of the bottom-up and top-down views of conversation threads.
• ClaHi-GAT [25]: A model that employs undirected graph neural networks alongside multiple attention mechanisms to improve the learning of representations for individual posts and their interactions.
• EBGCN [7]: A Bi-GCN variant that adjusts the weights of unreliable relations through a Bayesian method.
The models used for detecting rumors broadly examine conversation threads in two different ways: structure-wise and branch-wise. Based on experimentation, it can be concluded that structure-wise exploitation of conversation threads gives better outcomes than branch-wise exploitation. We further validated this assumption by conducting a detailed comparative analysis of BranchLSTM with other models. BranchLSTM decomposes the conversation thread into branches of the tree and then encodes each branch to learn its feature representation. However, since LSTM is designed for sequential data processing, it misses the abstract-level representation of rumors that is embedded in the structure of the thread. On the other hand, frameworks like Hierarchical Transformer, PLAN, and Bi-GCN evaluate the structural information of the conversation threads and perform better than BranchLSTM. Such models learn structural representations of conversation threads, which are critical for rumor detection because the propagation of information on social media platforms follows specific patterns. Detailed analysis of Table 2 leads us to the conclusion that the performance of deep learning models is also affected by the perspective from which the conversation thread is analyzed. It is evident from Table 2 that EBGCN, Bi-GCN, and ClaHi-GAT perform better among the frameworks that analyze the structural information of the conversation threads. These frameworks consider only a single perspective of the conversation threads, thus learning only a singleton view of the conversation and leaving out the reciprocal viewpoint. Despite the single-perspective analysis, ClaHi-GAT outperforms its two competitors. The probable reason for this could be the attention heads (post-based, graph-based, and event-based) employed in ClaHi-GAT. The complex attention mechanism can extract meaningful information from conversation threads that can be helpful in detecting rumors in real-world scenarios.
RPf-GCN outperforms its baseline frameworks. The core reason for its better performance is that during training, it learns the reciprocal perspectives of conversation threads. This enables the proposed framework to learn indicators from multiple views of both the source and the respondents. By fusing these reciprocal perspectives, the proposed model can obtain a comprehensive view of rumors from a global standpoint, leading to a significant enhancement in the performance of our model. These observations indicate that by incorporating the reciprocal perspective structural data of conversation threads, our suggested model can adeptly identify rumors in real scenarios.

Modular analysis
In this subsection, we conduct an ablation study to examine the effectiveness of each component in RPf-GCN. We removed each component from the entire model and assessed its impact on the overall performance of the proposed framework. The term "Combined" denotes the complete model with all of its sub-modules. The ablated models include: (1) "-Conv", which is RPf-GCN without the CNN. This approach is similar to the ones used in Bi-GCN and EBGCN, where the mean-pooling operation is applied to the $p \to q$ and $q \to p$ GCNs to get their representations, followed by concatenation of both features for prediction. (2) "-($q \to p$)", which is RPf-GCN without the response-source perspective of the conversation thread. (3) "-($p \to q$)", which is the variant of the proposed framework without the source-response perspective of the thread. (4) "-Dir", where the conversation thread is modeled as an undirected tree structure encoded by a two-layer GCN followed by a CNN submodule. (5) "GCN", which is the basic version of GCN, i.e., RPf-GCN without considering the reciprocal perspective.
Several conclusions can be drawn from Table 3. First, RPf-GCN$_{-(\mathrm{Dir})}$ performs better than the two variants that consider only a single perspective of conversation threads, i.e., RPf-GCN$_{-(p\to q)}$ and RPf-GCN$_{-(q\to p)}$. However, the proposed RPf-GCN with all modules combined outperforms all three of RPf-GCN$_{-(\mathrm{Dir})}$, RPf-GCN$_{-(p\to q)}$, and RPf-GCN$_{-(q\to p)}$. This validates that taking into account both the $p \to q$ and $q \to p$ reciprocal views leads to superior performance of the model.
Secondly, RPf-GCN$_{-(p\to q)}$ experiences a significant drop in performance compared to RPf-GCN$_{-(q\to p)}$, indicating that the source-to-respondent ($p \to q$) propagation perspective is better at reflecting the characteristics of rumors than the respondent-to-source ($q \to p$) dispersion view.
Thirdly, when the Conv component is removed, both RPf-GCN and RPf-GCN$_{-(\mathrm{Dir})}$ experience severe drops in Macro-F1 and Acc on both datasets. This demonstrates the effectiveness of the convolutional module in feature representation for rumor detection. It not only fuses the reciprocal perspective information effectively but also captures enriched features for identifying rumors when only one perspective is considered.

Filter size analysis
We conducted an experiment to analyze the impact of varying the filter size in the Conv sub-module on rumor detection. Figure 3b displays the plot of macro-F1 score against various filter sizes, revealing that our suggested model achieved optimal performance on both datasets with a window size of 1. As the filter size increased, performance initially dropped, followed by marginal improvement with an increase in filter size. This aligns with our intuition that unlike the relationship between adjacent pixels in an image, there may not be a direct correlation between posts in chronological order. Thus, a larger window size led the model to learn more noise that hindered its performance. However, increasing the window size slightly enhanced the correlation between users, resulting in some improvement in the model's performance. Additionally, since there are few participating users and contents in the early stages of rumor propagation, a smaller window size was more effective for early stages of rumor detection. Consequently, the proposed framework holds good for early rumor detection.

Drop rates effect
In Fig. 3a, we tested the performance of RPf-GCN by varying the DropEdge rate from 0 to 0.6. The performance showed a gradual increase, peaking at 0.5 before subsequently declining. Conversation threads often contain unreliable relationships, resulting in significant error accumulation that decreases model robustness [7]. Increasing the rate of dropping edges reduces the number of unreliable edges, improving model performance and robustness by enabling the model to learn more compelling features. However, clipping too many edges ultimately leads to a decline in performance. Based on our experimentation, we determined that a drop rate of 0.5 produces the best results.

Conclusion
Social media is exploited by anti-state agents and state-sponsored groups for disinformation campaigns with political and strategic objectives. Detecting malicious content on social media is crucial. This paper introduces a novel deep learning framework based on a reciprocal perspective-driven graph convolutional neural network to effectively detect social media rumors. It treats rumor conversation threads as color images, integrating source and follower perspectives as channels and graph nodes as pixels. The model uses two concurrent GCNs to capture discriminating features from each perspective. A convolutional operation captures consistent and complementary information, resulting in a comprehensive conversation representation. Experimental results on real-world datasets show that our RPf-GCN significantly outperforms existing methods. The core reason for its superior performance is its ability to learn reciprocal perspectives, providing a comprehensive view of rumors and enhancing overall model performance.

Fig. 1 Propagation pattern of a: False rumor, b: True rumor

Fig. 3 Performance of RPf-GCN a Effect of Dropedge rate, b Effect of Filter size

Table 1
Statistics of datasets post pre-processing

Table 2
Performance comparison

Table 3
Performance comparison with and without different sub-modules