A novel multi-source information-fusion predictive framework based on deep neural networks for accuracy enhancement in stock market prediction

The stock market is highly unstable and volatile owing to several factors, such as public sentiment and economic conditions. Several petabytes of data that affect the stock market are generated every second from different sources. A fair and efficient fusion of these data sources (factors) into intelligence is expected to offer better prediction accuracy on the stock market. However, integrating these factors from different data sources into one dataset for market analysis is challenging because they come in different formats (numerical or text). In this study, we propose a novel multi-source information-fusion stock price prediction framework based on a hybrid deep neural network architecture (Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM)) named IKN-ConvLSTM. First, we design a predictive framework to integrate stock-related information from six (6) heterogeneous sources. Secondly, we construct a base model using a CNN and a random search algorithm as a feature selector to optimise our initial training parameters. Finally, a stacked LSTM network is fine-tuned using the tuned parameters (features) from the base model to enhance prediction accuracy. The empirical evaluation of our approach was carried out with stock data (January 3, 2017, to January 31, 2020) from the Ghana Stock Exchange (GSE). The results show a good prediction accuracy of 98.31%, specificity (0.9975), sensitivity (0.8939) and F-score (0.9672) for the amalgamated dataset compared with the distinct datasets. Based on the study outcome, it can be concluded that efficient information fusion of different stock price indicators as a single data source for market prediction offers higher prediction accuracy than individual data sources.

email and social networking sites are growing exponentially [4], and the stock market is one place where several terabytes and petabytes of information are generated daily from these sources.
However, the ubiquity of stock market data makes effective information fusion in market analysis a challenging task [1,5]. Nevertheless, stock market information is multi-layered and interconnected [5]; hence, the ability to make sense of the data by fusing them into new knowledge would offer distinct advantages in predicting the stock market [6]. Therefore, Multi-source Data-Fusion (MDF) has become a key area of interest in recent studies in this field [7]. MDF aims to attain a global view of all factors affecting stock price movement in order to make the best investment decision. Nonetheless, the ability to fuse all these factors (stock price indicators) into useful information is hindered by the fact that they are generated from several sources in different formats (numerical or text).
Primarily, stock-related information can be grouped into two categories, namely quantitative (numerical) and qualitative (textual) datasets. The quantitative dataset includes historical stock prices and economic data, on the basis of which the analyst predicts stock price movement [2]. Nevertheless, Zhang et al. [8] argued that quantitative stock market data cannot convey complete information concerning companies' financial standing. Hence, qualitative information embedded in textual descriptions from various data sources, such as the economic standing of the firm, the board of directors, employees, financial status, balance sheets, the firm's yearly income reports, regional and political data, and climatic circumstances like natural or unnatural disasters, can be effectively used to predict stock price movement or to complement quantitative data [8]. However, Nti et al. [9] pointed out that the amount of qualitative information on stock markets in developing countries is limited; hence, it is inadequate to depend solely on qualitative information to predict future stock prices in undeveloped and developing countries.
Thus, both quantitative and qualitative information sources are vital in developing better and highly accurate predictive models for the stock market [1,10]. It is therefore reasonable to acquire comprehensive textual and numerical data to predict the future stock price of a firm. However, a previous report [2] shows that few studies (11% out of 122 studies) in stock market prediction attempted to fuse both (quantitative and qualitative) to predict future stock price movement. Moreover, as indicated earlier, the stock market is influenced by several factors; therefore, relying on a single data source might not be adequate to make accurate predictions.
To examine in totality and quantify the effects of these factors on stock price movement, we propose a multi-source information-fusion stock market prediction framework. The framework is based on a deep hybrid neural network architecture (CNN and stacked LSTM), named IKN-ConvLSTM. Specifically, we present an information-fusion technique for amalgamating three quantitative and three qualitative stock-related information sources. To the best of our knowledge, it is the first study to put forward such a comprehensive information fusion framework for stock market prediction. Additionally, we examine the effect of each individual information source on stock price movement. Thus, we detect the significant factors that have decisive impacts. These factors may be collective sentiments, economic variables, vital features in the trading data, the Google trends index or essential Web news. Finally, we integrate CNN and stacked LSTM architectures for efficient feature selection, detection of unique features in terms of specificity and accurate stock price movement prediction.
In this study, we adopted the CNN and LSTM. The two were adopted because studies show that a CNN can automatically detect and extract the appropriate internal structure from a time series dataset to create in-depth input features, using convolution and pooling operations [11,12]. Additionally, CNN and LSTM algorithms are reported to outperform state-of-the-art techniques regarding noise tolerance and accuracy for time-series classification [11,13,14,15]. Furthermore, the combination of LSTM and CNN has previously achieved highly accurate results in areas like speech recognition, where sequential modelling of information is required [16][17][18]. Lastly, CNN and LSTM algorithms are capable of learning dependencies within time series without the need for substantial historical time series data, and they require less time and effort to implement [13,14,19].
The contributions of the current study to the literature can be summarised as follows:
1. A hybrid deep neural network predictive framework built on CNN and stacked LSTM machine learning algorithms (named IKN-ConvLSTM) that fuses six heterogeneous stock price indicators (users' sentiments (tweets), Web news, forum discussions, Google trends, historical macroeconomic variables, and past stock data) to predict future stock price movement.
2. We propose a reduction in data sparsity problems and exploit the harmonies among stock-related information by exploring the associations among these information sources with deep neural networks. As an alternative to a simple linear combination of the stock-related information, we consider the combined effects among information sources to capture their associations.
3. We explored the idea that traditional technical analysis combined with the sentiments or opinions of investors and experts (fundamental analysis) will give better stock price prediction accuracy.
4. We evaluated the effectiveness of the proposed framework experimentally with real-world stock data from the Ghana stock market and compared it with three baseline techniques. The results show that the prediction performance of machine learning models can be significantly improved by merging several stock-related information sources.
We organised the remainder of this paper as follows. In the "Related works" section, we present pertinent literature on stock market analysis. The "Methodology" section shows the procedures and techniques applied for combining six heterogeneous stock-related information sources and analysing their impact on predicting the stock market. We summarise the results and discussion of this study in the "Empirical results and discussions" section. Finally, the "Conclusions" section presents the conclusions from this work.

Related works
Recently, countless studies on stock market analysis have been reported in journals, conferences, magazines and other outlets. In brief, 66% of these studies used historical stock prices (quantitative) and 23% used qualitative (textual) datasets to predict future stock prices with various models [2]. The following section presents some recent and relevant literature, categorised by dataset type (quantitative, qualitative and both).

Studies based on quantitative dataset
A predictive model based on Deep Neural Networks (DNN) for predicting stock price movement using historical stock data was presented in [17]. The proposed technique performed favourably compared with traditional methods in terms of prediction accuracy. In the same way, Stoean et al. [20] applied an LSTM-based predictive model to predict the closing price of twenty-five (25) firms listed on the Bucharest Stock Exchange, using historical stock prices. Notwithstanding the achievement recorded, the authors acknowledged in their conclusion that the fusion of multiple stock price indicators can improve prediction accuracy. Also, a deep learning predictive framework using CNN and Recurrent Neural Networks (RNN) for predicting future stock prices was proposed in [21]. The study reported some improvement in prediction accuracy when compared with analogous earlier studies.
Selvin et al. [22] implemented an LSTM, RNN and CNN based predictive framework for stock price prediction using historical stock prices as input parameters. The proposed system successfully identified the relations within a given stock dataset. Yang et al. [23] proposed a multi-indicator feature selection for CNN-driven stock index prediction based on technical indicators computed from historical stock data. The study outcome showed a higher performance of the proposed deep learning technique than the benchmark algorithms in trading simulations. Additionally, Hiransha et al. [23] proposed a stock market predictive framework based on deep-learning models, namely Multilayer Perceptron (MLP), RNN, LSTM and CNN, using past stock data as input features. Their results, compared with the AutoRegressive Integrated Moving Average (ARIMA) model, showed a higher performance of DNN over ARIMA. The reported outcomes of DNN in market analysis create an excellent platform for additional studies in a wide range of financial time-series prediction based on deep learning approaches. An enhanced SVM ensemble with a genetic algorithm predictive model based on historical stock prices was presented in [24]. The study outcome revealed that ensemble techniques offer higher prediction accuracy.
However, as mentioned earlier, the historical stock price is limited in disclosing all information about a firm's financial status. Also, as indicated in Zhou et al. [25], stock prices are highly unstable; hence, technical indicators alone cannot fully capture the precariousness of price movements. Furthermore, the theory of behavioural finance shows that the emotions of investors can affect their investment decision-making [26]. Hence, unstructured stock market data embedded in traditional news and social networking sites can complement quantitative data to enhance predictive models, specifically in this age of social media and information technology.

Studies based on qualitative dataset
The effects of sentiments on stock market volatility have received recent attention in the literature [27][28][29][30][31][32]. One core source of information for sentiment analysis is news articles [27,28], and the other commonly used data source is social media [33][34][35][36]. Using a Support Vector Machine (SVM) and Particle Swarm Optimisation (PSO), Chiong et al. [31] proposed a stock market predictive model based on sentiment analysis. The study recorded a positive association between stock volume and public sentiment.
Similarly, Ren et al. [37] predicted the SSE 50 Index with public sentiment and achieved an accuracy of 89.93% using SVM. Likewise, Yifan et al. [38] examined the predictability of stock volatility based on public sentiment from an online stock forum using RNN. They reported a high positive correlation between public sentiments and stock price movement. A combination of three predictive models, namely SVM, adaptive neuro-fuzzy inference systems and Artificial Neural Networks (ANN), was proposed for stock price prediction using public sentiments [39]. Evaluation of the proposed model with the historical stock index of the Istanbul BIST 100 Index yielded promising results. Maqsood et al. [40] examined the predictability of stock price movement in four countries based on sentiments in tweets and reported a high association between stock price and tweets.
The quest for improvement in prediction accuracy has lately led to the examination of additional data sources. Several studies [9,41,42,43] probed the effect of web search queries on stock market volatility and reported that web search queries could effectively predict stock price volatility. However, search queries are limited to the territory from which the user is searching; hence, their effects on stock price movement cannot be generalised.
The limitation of previous studies discussed above is that they relied only on a single stock-related data source, which, according to [8] limits predictive power.

Studies based on both qualitative and quantitative datasets
The combination of different data sources to enhance the prediction accuracy of predictive models has increased in recent studies. The combined effect of users' sentiments from social media and Web news on stock price movement was examined in [1]. The study achieved a prediction accuracy between 55 and 63%, and the authors reported a high association between stock price movement and public sentiments. Zhang et al. [7] proposed an extended coupled hidden Markov stock price prediction framework based on Web news and historical stock data. In [8], the authors proposed a multi-source multiple instance learning framework based on three different data sources. The study recorded an increase in accuracy with the multiple data sources compared with distinct sources. Table 1 shows a summary of pertinent works that sought to examine the collective influence of different stock-related information sources on stock price volatility. We examine these studies based on the number of data sources, the technique used, the origin of the stock data and the reported results. Table 1 affirms an earlier report [2] that, out of 122 studies on stock market prediction, 89.38% use a single data source, while 8.2% and 2.46% use two and three data sources, respectively. Then again, as pointed out in the same report, a comprehensive stock market prediction framework should capture all possible stock price indicators that influence the market. Also, a review paper [10] on the responses of the stock market to information diffusion explicitly acknowledged that the accuracy of predictive models in stock market analysis has improved significantly in recent years. Despite that, there is room for further enhancement by discovering newer sources of information on the Internet to complement the existing ones. Additionally, Pandurang et al. [46] pointed out that different data amalgamation strategies are future directions for better stock market predictions.
Therefore, a holistic fusion of several quantitative and qualitative stock-related data sources to predict the future stock price is a potential way to improve prediction accuracy [2,10,46,47,48], which remains an open research area. Hence, this study puts forward a novel multi-source data-fusion stock market predictive framework built on a deep hybrid neural network architecture (CNN and stacked LSTM), named IKN-ConvLSTM, to produce a more reliable and accurate stock price prediction. Unlike previous works, which commonly exploit one, two or three data sources, our proposed framework effectively integrates six (6) heterogeneous stock-related information sources.

Methodology
Our objective is to enhance prediction accuracy, using both quantitative and qualitative stock-related information as input features to a hybrid DNN architecture. In this section, we present in detail the methods and techniques used in this study. Figure 1 shows the process flow of our proposed IKN-ConvLSTM framework for predicting stock price movement. The framework follows five (5) steps: dataset download, data preparation, data fusion, machine learning model, and model evaluation. Details of our framework are explained below. Figure 2 shows the data sources used in this study. All datasets for this study were downloaded for the period January 3, 2017, to January 31, 2020.

Quantitative dataset
As shown in Fig. 2, three quantitative datasets were used in this study; namely, historical stock data (HSD), macroeconomic data (MD) and Google trends index (GTI).
The historical stock price data of two companies listed on the GSE were downloaded from (https://gse.com.gh). We selected these companies because they had minimal missing values in their dataset (HSD). Also, these companies were more widely discussed in the news and on social media platforms, which gave the researchers adequate qualitative information on them. Each dataset had ten (10) features (Table 2). Similar to several studies [29,32,49,50,51], we aimed at stock returns $R^{sk}_{d}$ as defined in Eq. (1). Therefore, we normalised $R^{sk}_{d}$ to reflect the stock-price change compared with the day-before price. We denormalised our model output to get the real-world stock price as expressed by Eq. (2). If $R^{sk}_{d} > 0$, it implies a rise in day (d)'s closing price (denoted as 1), and if $R^{sk}_{d} < 0$, it represents a fall in day (d)'s closing price (denoted as 0), as defined in Eq. (3).
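Because Eqs. (1) to (3) are described here only in words, the following Python sketch illustrates one common way to compute daily returns, recover a price from a predicted return, and derive the binary movement label. It assumes a simple relative return with respect to the previous day's closing price; the function names and sample prices are hypothetical, and a return of exactly zero is left unlabelled because the text does not define that case.

```python
from typing import List, Optional


def daily_returns(closing_prices: List[float]) -> List[float]:
    """Relative change of each closing price versus the previous day's close."""
    return [
        (today - yesterday) / yesterday
        for yesterday, today in zip(closing_prices, closing_prices[1:])
    ]


def denormalise(previous_close: float, predicted_return: float) -> float:
    """Map a predicted return back to a price (assumed inverse of the return)."""
    return previous_close * (1.0 + predicted_return)


def movement_label(stock_return: float) -> Optional[int]:
    """1 for a rise, 0 for a fall; None for an unchanged price (assumption)."""
    if stock_return > 0:
        return 1
    if stock_return < 0:
        return 0
    return None


# Hypothetical closing prices for four consecutive trading days.
prices = [2.00, 2.10, 2.10, 1.89]
returns = daily_returns(prices)
labels = [movement_label(r) for r in returns]
```

Under these assumptions, the first day-to-day change is labelled as a rise, the unchanged day is left unlabelled, and the last change is labelled as a fall.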
In the return definition of Eq. (1), stock_price(d) is the closing price on day (d).
Previous studies have shown that macroeconomic fundamentals such as inflation, price level, interest rate, exchange rate and composite consumer price are good indicators of stock price movement. Therefore, similar to these studies [44,52], we downloaded forty-four (44) economic indicators from the official website of the Bank of Ghana (www.bog.gov.gh) for 744 trading days. Table 2 shows the details of the macroeconomic variables used in this study. Studies show that the accuracy of deep learning algorithms is strongly affected by data quality [3,12,53,54]. Therefore, for better performance of our model, we replaced any missing value of a specific MD feature on a day (d) with $x_{d_i}$ as defined in Eq. (4). The dataset was normalised in the range of [−1, 1], using Eq. (5). We saved each qualitative dataset separately in a CSV file.
In Eq. (4), $x_{(d-1)}$ is the value of the specific MD feature on the previous day and $x_{(d+1)}$ is its value on the day after the missing day. In Eq. (5), $x_{new_i}$ is the normalised feature, $x_{i_{original}}$ is the original value of feature (x), and $\bar{x}_i$ and $\sigma$ are the mean and standard deviation of the dataset (x).
Google Trends is a service provided by Google that enables anyone to find out the volume of searches on any topic. The search volumes are usually scaled within [0, 100], where 100 represents the highest search volume for any given day and 0 the lowest. A total of 221 records were obtained from Google Trends, that is, a 221 × 1 matrix, and we normalised the dataset in the range of [−1, 1] as defined in Eq. (5). The trend search for this study was restricted to the two companies of focus. Google Trends was considered as a potential input because studies show that it can effectively signal the future volatility of the stock price [41][42][43].
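The missing-value replacement and normalisation steps described for the macroeconomic and Google Trends data can be sketched as follows. This is a minimal illustration under two assumptions that the text does not spell out: a missing value is replaced by the mean of the previous and next day's values, and normalisation is a z-score using the population standard deviation. A z-score does not by itself confine values to [−1, 1], so any additional scaling the authors applied is not reproduced here. The function names and sample values are hypothetical.

```python
import math
from typing import List, Optional


def fill_missing(series: List[Optional[float]]) -> List[Optional[float]]:
    """Replace each missing value with the mean of its available neighbours."""
    filled = list(series)
    for d, value in enumerate(series):
        if value is None:
            previous_day = series[d - 1] if d > 0 else None
            next_day = series[d + 1] if d + 1 < len(series) else None
            neighbours = [v for v in (previous_day, next_day) if v is not None]
            if neighbours:  # left as None when both neighbours are missing
                filled[d] = sum(neighbours) / len(neighbours)
    return filled


def standardise(values: List[float]) -> List[float]:
    """Z-score: subtract the mean and divide by the standard deviation."""
    mean = sum(values) / len(values)
    sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sigma for v in values]


# Hypothetical indicator values with one missing day.
raw = [10.0, None, 14.0, 16.0]
filled = fill_missing(raw)
scaled = standardise(filled)
```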

Qualitative (textual) dataset
Three qualitative datasets, as shown in Fig. 2, were used in this study, namely tweets (SM), web financial news (W) and forum discussions (FD). The tweets were downloaded from Twitter using Tweepy, a Python client for the Twitter API [55]. Like many works in the literature [33][34][35][36], we used the dollar ($) sign to obtain 1,101 stock market-related tweets and all other tweets concerning our selected companies. Business news, financial news and event headlines concerning our selected companies were downloaded from three popular news sites in Ghana, namely ghanaweb.com, myjoyonline.com and graphic.com.gh, using the BeautifulSoup library. A total of 251 news articles were downloaded. However, unlike previous works [8,27,28], which considered only the sentiments in news titles, this study also considered the spread of the news among the public and the count of comments made by the public on a news article on the same day. Thus, we excluded comment and share counts made on any day after the day the news article was published. The reason is that using the number of comments and shares on an article days after its publication could lead to the use of information occurring after the stock price movement has already taken place. We extracted the sentiments in the news titles using the Natural Language Toolkit (NLTK) [56].
We obtained our forum discussions dataset from sikasem.org. We used the sentiment analyser [56] to obtain the collective sentiments from the forum messages. All our qualitative datasets were tokenised, segmented, normalised and cleaned of noise. That is, texts were chopped into smaller pieces, called tokens, while certain characters such as punctuation, symbols (URLs, /, ?, #, @), extra spaces and stop words like "and", "a" and "the" were discarded, using the NLTK. We assessed the sentiments in the textual datasets (tweets, news and forum discussions) in two dimensions: a polarity score within the range [−1.0, 1.0] and a subjectivity score within the range [0.0, 1.0], where 0.0 is considered very objective and 1.0 very subjective [9]. We also considered the diffusion of a tweet or forum message by taking into account the retweets of a tweet and the number of comments made on a forum post. We stored each processed textual dataset in a separate comma-separated values file for further processing.
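The cleaning and tokenisation step described above can be illustrated with a short, dependency-free Python sketch. It is a stand-in for the NLTK-based pipeline used in the study, not a reproduction of it: the regular expressions, the tiny stop-word set, the ticker `$ABC` and the sample tweet are all hypothetical.

```python
import re
from typing import List

# A deliberately tiny stop-word set for illustration; NLTK ships a fuller list.
STOP_WORDS = {"and", "a", "the", "of", "to", "in", "is", "at", "on", "for"}

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
NON_WORD_PATTERN = re.compile(r"[^a-z0-9\s]")


def clean_and_tokenise(text: str) -> List[str]:
    """Lower-case, drop URLs and symbols, split on whitespace, drop stop words."""
    text = text.lower()
    text = URL_PATTERN.sub(" ", text)       # remove links first
    text = NON_WORD_PATTERN.sub(" ", text)  # remove punctuation, $, #, @, ...
    return [token for token in text.split() if token not in STOP_WORDS]


# Hypothetical tweet about a hypothetical ticker.
tweet = "The $ABC shares rose 5% today! Read more at https://example.com/news #GSE @trader"
tokens = clean_and_tokenise(tweet)
```

The resulting token list would then be passed to a sentiment analyser to obtain the polarity and subjectivity scores.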

Data fusion
The fusion stage aims to integrate the six (6) datasets discussed above, based on stock ID and stock price date. To make use of these six heterogeneous sources, we put forward a feature fusion framework (Fig. 3) to combine all features using Algorithm 2 (Appendix A). We considered each data source as independent of the others. The stock return labels are denoted by $y = y_d$, where $y_d$ represents the stock return class on a date (d). Let the vector $\phi$ hold the final amalgamation of the six vectors defined above. We apply a strategy for merging all features from the six data sources into a single vector, which can be defined as $\phi_d = \{\beta_i\}_d$, $i \in 1, \ldots, d$, where ($\beta_i$) is the combination of the six data sources observed on the day (d + 1). The prediction problem can then be modelled mathematically as a function $f(\phi) \rightarrow y_{t+d}$. Thus, the combined dataset can be expressed as $\phi_d \in \mathbb{R}^{M \times N}$, where N is the total number of features (N = 70 for this study), as shown in Table 6 (Appendix A), and M is the number of records. Table 2 shows the breakdown of the initial features of our integrated dataset. The final dataset was a matrix of size 193 × 70.
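The idea of fusing sources on stock ID and date can be sketched in plain Python as below. This is not the authors' Algorithm 2: it assumes that only (stock ID, date) keys present in every source are kept, and it prefixes each feature with its source name to avoid clashes. The source names follow the abbreviations used in this paper, while the ticker, dates and values are hypothetical, and only three of the six sources are shown for brevity.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str]                      # (stock ID, ISO date)
FeatureTable = Dict[Key, Dict[str, float]]


def fuse_sources(sources: Dict[str, FeatureTable]) -> Dict[Key, Dict[str, float]]:
    """Join all sources on (stock ID, date), keeping keys common to every source."""
    if not sources:
        return {}
    tables = list(sources.values())
    shared_keys = set(tables[0])
    for table in tables[1:]:
        shared_keys &= set(table)

    fused: Dict[Key, Dict[str, float]] = {}
    for key in sorted(shared_keys):
        row: Dict[str, float] = {}
        for name, table in sources.items():
            for feature, value in table[key].items():
                row[f"{name}_{feature}"] = value  # prefix avoids name clashes
        fused[key] = row
    return fused


# Hypothetical records for one ticker.
hsd = {("ABC", "2019-01-02"): {"close": 2.10}, ("ABC", "2019-01-03"): {"close": 2.05}}
gti = {("ABC", "2019-01-02"): {"index": 40.0}, ("ABC", "2019-01-03"): {"index": 55.0}}
sm = {("ABC", "2019-01-03"): {"polarity": 0.3, "retweets": 12.0}}

fused = fuse_sources({"HSD": hsd, "GTI": gti, "SM": sm})
matrix: List[List[float]] = [list(row.values()) for row in fused.values()]
```

Each row of `matrix` corresponds to one record of the M × N fused dataset, with one column per prefixed feature.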

Model design
Recently, deep learning techniques have gained unprecedented popularity, and several accomplishments can be found in the literature [12,13,54,57]. Therefore, in this study, we introduce a CNN configuration as a feature selection mechanism to select the most significant features to feed our LSTM classifier. The following section gives details of the proposed hybrid predictive model.

Feature engineering with CNN
Almost every machine-learning model is integrated with feature selection to eliminate redundant and irrelevant features in datasets, for higher performance in terms of prediction accuracy and computational time [29,32,51]. Lately, the CNN has become one of the deep learning techniques used in feature selection and extraction [11,58], and the literature has shown promising performance of stock market predictive models built on the CNN algorithm [59]. In this paper, a CNN with one Convolutional Layer (CL), a max-pooling layer and two dense layers was implemented to perform a random search feature selection. The network has 64 filters and kernels of size 2. We placed a pooling layer with a max-pooling function (MPF) and ReLU activation after the CL to extract unique features. The MPF layer addresses the essential features by pooling over each feature map, which bears a close similarity to the practice of feature selection in finding investment patterns. The ReLU activation function was adopted for its easy implementation and its advantage of faster convergence. Finally, two dense layers with ReLU and sigmoid activations, respectively, are placed after a flatten layer.
We adopted a simple and straightforward criterion proposed in [60] to decide which features are to be selected or removed. The process utilises the accuracy obtained by the network on the training dataset. Assume a trained CNN network (N) with input data (g) of (d)-dimensional features, $g = \{g_1, g_2, \ldots, g_n\}$. The accuracy of (N) is calculated with one less feature, using the cross-entropy error function (Eq. 6), while a penalty term measures the complexity of (N). Thus, the set $g - g_k$, for each $k = 1, 2, \ldots, n$, is the input feature set. We then calculate the accuracy (A) by simply setting the connection weights from the input feature $g_k$ of the trained (N) to zero (0).
Afterwards, we ranked the accuracies obtained by each (N) with $g - g_k$ features, and the set of features to be retained was searched based on the network with the maximum accuracy. The steps for the CNN feature selection are detailed in Algorithm 1.
In Eq. (6), k is the number of patterns; $t_p^i$ = 1 or 0 is the target value for pattern $x^i$ at output unit p, p = 1, 2, …, C; C is the number of output units; and $S_p^i$ is the output of (N) at unit p. The acceptable maximum drop in the accuracy rate ($\Delta M$) on the ($Ds_{CV}$) set was set to 2%.
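The weight-zeroing criterion described above can be illustrated with a small Python sketch. To stay self-contained, a single sigmoid unit with fixed weights stands in for the trained CNN, and the penalty term is omitted; both are simplifying assumptions, as are the function names, the toy weights and the toy data. The sketch zeroes the input weight of one feature at a time, ranks the resulting accuracies, and lists the features whose removal keeps the accuracy drop within 2%.

```python
import math
from typing import List, Sequence, Tuple


def predict_proba(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Sigmoid output of a single unit (a stand-in for the trained network)."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(targets: Sequence[int], probs: Sequence[float], eps: float = 1e-12) -> float:
    """Binary cross-entropy error summed over all patterns."""
    return -sum(
        t * math.log(max(p, eps)) + (1 - t) * math.log(max(1.0 - p, eps))
        for t, p in zip(targets, probs)
    )


def accuracy(weights, bias, X, y) -> float:
    hits = sum(int(predict_proba(weights, bias, x) >= 0.5) == t for x, t in zip(X, y))
    return hits / len(y)


def rank_feature_removals(weights, bias, X, y) -> List[Tuple[int, float]]:
    """Accuracy after zeroing each feature's weight, highest accuracy first."""
    results = []
    for k in range(len(weights)):
        masked = list(weights)
        masked[k] = 0.0  # simulate removing feature k from the trained model
        results.append((k, accuracy(masked, bias, X, y)))
    return sorted(results, key=lambda item: item[1], reverse=True)


def removable_features(weights, bias, X, y, max_drop: float = 0.02) -> List[int]:
    """Features whose removal costs at most `max_drop` in accuracy."""
    baseline = accuracy(weights, bias, X, y)
    return [k for k, acc in rank_feature_removals(weights, bias, X, y)
            if baseline - acc <= max_drop]


# Toy "trained" model and validation data (hypothetical values).
weights, bias = [2.0, -2.0, 0.05], -1.0
X = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [0, 1, 0]]
y = [1, 0, 1, 0]

ranking = rank_feature_removals(weights, bias, X, y)
candidates = removable_features(weights, bias, X, y)
baseline_loss = cross_entropy(y, [predict_proba(weights, bias, x) for x in X])
```

In this toy setting, zeroing the first feature halves the accuracy, so it is retained, whereas the other two can each be dropped without loss.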

LSTM classifier
In this stage, we introduce a special RNN, the LSTM, for predicting stock price movement. The LSTM was developed to address the vanishing-gradient problem of the simple RNN [17,18,52]. Figure 4 shows a detailed scheme of a single LSTM block architecture.
The key element of the LSTM is the cell state ($C_t$), which is regulated by three gates, namely the forget gate ($f_t$), the input gate ($i_t$) and the output gate ($o_t$). The main computations of the LSTM are defined in Eqs. (7) to (14) [12,17,18,22,52]. The forget gate decides whether to keep or discard a piece of information from the previous cell state (expressed in Eq. 7) using a sigmoid function (Eq. 8), $f_t \in [0, 1]$.
The input gate ($i_t$), expressed by Eq. (9), determines which values of the cell state are updated by an input signal, based on a sigmoid function and a hyperbolic tangent (tanh) layer (Eq. 10), and creates a vector of values $C_t$ (expressed in Eq. 11), with $i_t \in [0, 1]$. The output gate ($o_t$), expressed by Eq. (13), permits the cell state either to affect other neurons or not. This is achieved by passing the cell state through a tanh layer and multiplying it by the outcome of the output gate to get the final output ($h_t$) defined by Eq. (14).
In this study, we designed a stacked-LSTM network (Fig. 5) comprising two layers ($L_1$ and $L_2$) to predict stock price movement from the features optimised by the CNN model. We implemented $L_1$ and $L_2$ with different sizes, with $L_1 > L_2$, a practice used in [61] for detecting unique features in terms of specificity. In this way, $L_1$ is aimed at recognising general features while $L_2$ is aimed at specific features. Knowing that the complexity of the LSTM is influenced by the input data size and the number of time steps, we designed $L_1$ to accommodate 40 LSTM blocks, each block linked to a timestep in the dataset supplied to our predictive network, and $L_2$ to accommodate 20 blocks. The preprocessed data from the CNN model are transformed into a 3-dimensional matrix $x \in \mathbb{R}^{l \times m \times n}$, where l = batch size, m = sequence length and n = number of features, and fed into $L_1$. The output $h(L_1)$ of $L_1$ is forwarded to $L_2$, and the output $h(L_2)$ of $L_2$ is passed through a SoftMax layer (SL) (defined in Eqs. (15) and (16)) to transform the output into two class probabilities ($Y \in \{0, 1\}$).
We adopted Adam (Adaptive Moment Estimation) with an initial learning rate of 0.001 to train our network. Adam combines the strengths of two other optimisers, namely AdaGrad and RMSProp. The grid search technique was used for hyperparameter tuning: numerous combinations of hyperparameter values were tried, and the best combination was adopted.
Table 3 gives a summary of the optimum hyperparameters used in each NN layer in this study. Only ten epochs were used in our LSTM training as the dataset (no. of records) was very small.
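The gate computations of a single LSTM block and the final SoftMax step can be sketched in plain Python as follows. The sketch uses the standard LSTM formulation (forget, input and output gates acting on the concatenation of the previous hidden state and the current input); the exact notation and numbering of Eqs. (7) to (16) may differ. The parameter names (`W_f`, `b_f`, and so on) and the one-unit toy parameters are hypothetical, and no training is performed.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def affine(W: Matrix, x: Vector, b: Vector) -> Vector:
    """Compute W·x + b row by row."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t: Vector, h_prev: Vector, c_prev: Vector,
              params: Dict[str, list]) -> Tuple[Vector, Vector]:
    """One time step of a standard LSTM block; returns (h_t, c_t)."""
    concat = h_prev + x_t  # [h_{t-1}, x_t]
    f_t = [sigmoid(z) for z in affine(params["W_f"], concat, params["b_f"])]
    i_t = [sigmoid(z) for z in affine(params["W_i"], concat, params["b_i"])]
    c_hat = [math.tanh(z) for z in affine(params["W_c"], concat, params["b_c"])]
    # New cell state: keep part of the old state, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    o_t = [sigmoid(z) for z in affine(params["W_o"], concat, params["b_o"])]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def softmax(logits: Vector) -> Vector:
    """Numerically stable softmax turning scores into class probabilities."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy block with one hidden unit and one input feature; all parameters zero.
zero_params = {name: [[0.0, 0.0]] for name in ("W_f", "W_i", "W_c", "W_o")}
zero_params.update({name: [0.0] for name in ("b_f", "b_i", "b_c", "b_o")})

h_t, c_t = lstm_step(x_t=[0.7], h_prev=[0.0], c_prev=[1.0], params=zero_params)
class_probabilities = softmax([0.0, 0.0])
```

With all parameters at zero, every gate outputs 0.5, so the block simply halves the previous cell state, which makes the role of the forget gate easy to see.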

Evaluation metrics
In examining the performance of our proposed stock prediction framework, we adopted the Accuracy (Eq. 17), Specificity (Eq. 18), F-score (Eq. 19) and Sensitivity (Eq. 20) metrics, based on their suitability for measuring the performance of a classification model, as indicated in [2,62]. Accuracy measures the proportion of correctly classified samples out of the total number of samples. Specificity estimates the classifier's capability to correctly identify negative labels, while sensitivity (also known as recall) estimates its capability to correctly identify positive labels.
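The four metrics can be computed from the confusion-matrix counts, as in the following Python sketch. It uses the standard textbook definitions and assumes that the F-score is the balanced F1 measure (harmonic mean of precision and sensitivity), since the weighting is not stated here. The function names and the sample labels are hypothetical.

```python
from typing import Dict, Sequence


def safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when a class is absent."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, specificity, sensitivity and F1 for binary labels (1 = rise)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    sensitivity = safe_div(tp, tp + fn)   # recall on the positive class
    specificity = safe_div(tn, tn + fp)   # recall on the negative class
    precision = safe_div(tp, tp + fp)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "specificity": specificity,
        "sensitivity": sensitivity,
        "f_score": safe_div(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical true and predicted movement labels for eight trading days.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```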

Empirical implementation
A practical implementation of the proposed predictive framework (IKN-ConvLSTM) was carried out to assess its performance. The computer used was an HP Spectre x360 laptop with an 8th Generation Intel® Core™ i7 processor and 16.0 GB RAM. We implemented our model with the Keras library, which supports both the Graphics Processing Unit (GPU) and the Central Processing Unit (CPU). The framework was coded in a modular fashion using the Python programming language in a Jupyter notebook. We also made use of the numerous modules in Keras, such as cost functions and optimisers, for implementing deep learning algorithms. To obtain an optimal partitioning of our integrated dataset discussed in Sect. 3.3, we adopted the in-sample and out-of-sample test technique, and the optimal split was training (75%) and testing (25%). Based on the training and testing datasets, we trained and tested our proposed model using the optimum hyperparameters. Table 4 shows a summary of our CNN feature selection model.
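The 75%/25% partitioning can be sketched as below. The sketch assumes a chronological split, in which the earliest records form the in-sample (training) set and the most recent records form the out-of-sample (testing) set; the text does not state whether the records were shuffled, so this ordering is an assumption. The function name is hypothetical, and integers stand in for the 193 fused records.

```python
from typing import List, Sequence, Tuple, TypeVar

Record = TypeVar("Record")


def chronological_split(records: Sequence[Record],
                        train_fraction: float = 0.75) -> Tuple[List[Record], List[Record]]:
    """Split time-ordered records into in-sample and out-of-sample parts."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(records) * train_fraction)  # floor of the training share
    return list(records[:cut]), list(records[cut:])


# Placeholder indices standing in for the 193 date-ordered fused records.
all_records = list(range(193))
train_set, test_set = chronological_split(all_records)
```

Keeping the test records strictly later than the training records avoids evaluating the model on days that precede the data it was trained on.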

Empirical results and discussions
Feature engineering by CNN
Figure 6 shows the accuracy of twenty (20) iterations of different randomly selected features by our CNN model. We observed that 21 features gave an accuracy of 82.52%, while 52 features recorded an accuracy of 81.06%, as shown in Fig. 6. However, a combination of 62 features achieved an accuracy of 88.75%, which was the best combination found by the CNN model. In contrast, another combination of 60 features recorded an accuracy of 81.97%, a difference of 6.78 percentage points in accuracy between the 60- and 62-feature combinations. This outcome indicates that the performance of a machine learning model depends not on the quantity of input features but on their quality. It can therefore be inferred that combining the right stock price indicators, out of the numerous indicators from different stock-related data sources, is what leads to higher prediction accuracy; it is not merely the amalgamation of several features that increases prediction accuracy, but the amalgamation of the right ones. Furthermore, this outcome affirms the importance of feature engineering in a machine learning framework, as indicated in [29,32,51]. Based on these outcomes, it can be established that CNN networks are adequate and efficient for the automatic selection of features from heterogeneous stock data for effective stock price prediction. The optimised parameters from our base CNN were fed as input to our stacked LSTM model. Details of the best 20 feature combinations and their accuracies recorded by the CNN model are given in Table 7 (Appendix A).

Training and testing results based on the optimised features
The proposed predictive framework was trained and tested using the accuracy and loss metrics. Accuracy in this study is the proportion of data samples whose labels were correctly classified by our predictive model, as expressed in Eq. (17). Loss signifies an error measure, indicating how close the predicted values ŷ are to the actual labels y. Figure 7 shows how the proposed predictive framework performed during training and testing over ten epochs, based on the optimised fused features from the CNN base model. From Fig. 7, it can be observed that the training accuracy rises progressively and converges at around 98.526%, while the testing accuracy converges at around 98.307%.
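The two metrics can be made concrete with a small sketch. Accuracy is computed as the fraction of correctly classified samples, as described above. The loss function is not named in this section, so binary cross-entropy is assumed here purely for illustration; the 0.5 decision threshold and the toy labels and probabilities are likewise assumptions.

```python
import math
from typing import Sequence


def accuracy(y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded prediction equals the true label."""
    correct = sum(int(p >= threshold) == y for y, p in zip(y_true, y_prob))
    return correct / len(y_true)


def binary_cross_entropy(y_true: Sequence[int], y_prob: Sequence[float], eps: float = 1e-12) -> float:
    """Mean binary cross-entropy (assumed loss); lower means predictions are closer to labels."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


# Toy example: four samples with true labels and predicted probabilities of class 1.
y_true = [1, 0, 1, 1]
y_prob = [0.9, 0.2, 0.6, 0.4]
acc = accuracy(y_true, y_prob)
loss = binary_cross_entropy(y_true, y_prob)
print(acc, round(loss, 4))
```

In this toy example the last sample is misclassified (probability 0.4 for a positive label), which lowers the accuracy and dominates the loss.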
The progressive rise in the training accuracy of the proposed predictive framework shows that our stacked LSTM classifier acquires better optimised parameters with each epoch until convergence. Also, the high training accuracy (98.526%) achieved at convergence suggests that the first phase (LSTM1) of our proposed stacked LSTM network was capable of automatically detecting unique features within the 62 input features. Furthermore, the simultaneous progressive rise in both training and testing accuracies indicates that the trained predictive framework does not suffer from a variance (overfitting) problem. As an alternative view of the performance of the proposed framework, Fig. 8 shows the training and testing losses. The small loss values recorded during training and testing show the efficacy of the proposed model: since loss is a measure of error, the lower the loss value at convergence, the better the model. At convergence, the training and testing losses were 0.09264 and 0.04958, respectively. Figure 9 shows a plot of all textual datasets (SM + W + FD) put together as the Unstructured Dataset, all numerical datasets (MD + HSD + GTI) put together as the Structured Dataset, and a combination of both as All Combine. We aimed to explore in detail the idea that traditional technical analysis combined with the sentiments or opinions of investors and experts (fundamental analysis) will give better stock price prediction results.
Fig. 7 Training and testing accuracy of proposed framework
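The three experimental arms compared in Fig. 9 can be assembled by grouping feature columns by their source. The sketch below assumes, purely for illustration, that each column name is prefixed with its source tag (for example `HSD_close`); this naming scheme, the column names and the function `group_columns` are hypothetical and not part of the original dataset description.

```python
from typing import Dict, List, Sequence

STRUCTURED_SOURCES = ("MD", "HSD", "GTI")   # numerical sources
UNSTRUCTURED_SOURCES = ("SM", "W", "FD")    # textual sources


def group_columns(columns: Sequence[str]) -> Dict[str, List[str]]:
    """Group columns into structured, unstructured and combined feature sets."""
    groups: Dict[str, List[str]] = {
        "structured": [],
        "unstructured": [],
        "all_combined": [],
    }
    for col in columns:
        source = col.split("_", 1)[0]  # assumed "<SOURCE>_<name>" convention
        if source in STRUCTURED_SOURCES:
            groups["structured"].append(col)
        elif source in UNSTRUCTURED_SOURCES:
            groups["unstructured"].append(col)
        else:
            raise ValueError(f"unknown source tag in column: {col}")
        groups["all_combined"].append(col)
    return groups


# Hypothetical column names, one or two per source.
columns = ["HSD_close", "HSD_volume", "MD_rate", "GTI_score", "SM_polarity", "W_polarity", "FD_polarity"]
groups = group_columns(columns)
print({name: len(cols) for name, cols in groups.items()})
```

Each group can then be passed to the same training routine, so that the three curves in Fig. 9 differ only in their input features.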
As shown in Fig. 9, the unstructured dataset achieved a convergence accuracy of 74.69%, while the structured dataset achieved 95.78% and the combined dataset 98.526%. This outcome confirms two opinions in the literature. (1) The difference in accuracy (21.09 percentage points) between the structured dataset (95.78%) and the unstructured dataset (74.69%) affirms that unstructured stock data from social media and the Internet are best used for augmentation of historical or structured stock data to enhance prediction [2,5]. (2) The increase in accuracy of the combined dataset compared with the individual datasets (structured and unstructured) supports the view that a combination of stock-related information tends to improve stock prediction accuracy, as pointed out in [2,10,46,47]. Hence, it cannot be overlooked in designing stock prediction frameworks and models. We also observed that the accuracies of the structured and combined datasets were initially close to each other, but the gap widened as the epochs increased. Table 5 shows the experimental results for specificity, F-score and sensitivity (recall) of the proposed predictive framework. The results (Table 5) show the effectiveness of the proposed predictive model in correctly identifying positive and negative labels. However, from Table 5, it is evident that the neural network model handles negative labels a little better than positive labels.
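For reference, the metrics in Table 5 follow from the confusion matrix in the standard way: sensitivity is the share of positive labels recovered, specificity is the share of negative labels recovered, and the F-score is the harmonic mean of precision and sensitivity. The sketch below uses these textbook definitions with invented toy labels; the function name `binary_metrics` is hypothetical and the numbers are not the results of this study.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity (recall), specificity, precision and F-score for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0  # define empty ratios as 0.0

    sensitivity = ratio(tp, tp + fn)   # true positive rate
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)
    f_score = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_score": f_score,
    }


# Toy labels: four positive and six negative samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

A specificity higher than the sensitivity, as in this toy example, corresponds to a model that identifies negative labels somewhat better than positive ones.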
Refs. [1,7] combined two data sources to predict stock price movement and reported an accuracy of 52-63%. Also, in [9], three data sources were joined to predict future stock prices, achieving a prediction accuracy of 70.66-77.12%. In comparison, the current study achieved a prediction accuracy of 98.31% with a combination of six different data sources. The outcome suggests that the accuracy of stock market prediction can be improved further with data source fusion. The accuracy of IKN-ConvLSTM exceeded that of the MLP, SVM and DT models by 7, 24 and 13%, respectively. This indicates that classical classifiers such as DT and SVM cannot effectively extract hidden features from the input parameters. Besides, overfitting may occur in training the DT and SVM models owing to the insufficient amount of data used in this study. In contrast, the ability of deep learning models to share knowledge among nodes (neurons) can reduce this effect, as shown by the proposed deep learning framework.

Conclusions
Previous studies [1,7,8,25,43,44,45] have attempted to examine the joint impact of different stock-related information sources for predicting stock price movement. A high percentage (63%) of these studies employed two data sources, while 37% used three data sources (see Table 1). However, current studies [2,10,46,47] on stock price prediction acknowledge that combining different stock-related data sources has the potential to yield higher prediction performance. At the same time, the literature shows that as datasets become bigger, more complex and more diverse, integrating them into an analytical framework is a major challenge; if this is overlooked, it will create gaps and lead to incorrect communications and insights. Hence, in this study, a novel framework called IKN-ConvLSTM was proposed. The model is based on a hybrid deep neural network architecture combining a convolutional neural network and long short-term memory to predict stock price movements using a combination of six heterogeneous stock-related data sources. Using a novel combination of the random search technique and a CNN base model as a feature selector, we optimised our initial training parameters of 70 heterogeneous stock-related features from six different stock-related information sources. The final optimised parameters were fed into a stacked LSTM classifier to predict future stock prices. Our CNN model selected sixty-two (62) features with an accuracy of 88.75%, which shows that the combination of a CNN and the random search technique is useful for automatic feature selection from raw stock data, avoiding the need for manual feature selection in predicting stock price movement. Thus, random search was found to be a powerful tool for feature selection.
The stock price prediction accuracy (98.307%) achieved by our proposed stacked LSTM classifier with 62 different input features shows that the accuracy of a stock price predictive framework can be effectively enhanced with data fusion from different sources.
To the best of our knowledge, this study is the first to fuse six heterogeneous stock-related information sources to predict the stock market. Even though our proposed unified framework recorded satisfactory prediction performance, it still has some weaknesses. First, our framework has many parameters (62), which, owing to the nature of deep neural networks, increased training time and computational resource requirements compared with other methods. Secondly, although our dataset had a good number of parameters because of the data fusion introduced in this study, the volume of textual data on the stock market in a developing economy is scanty, which limited the prediction window of this study to only 30 days ahead. Also, the researchers spent much time integrating the six data sources into a single dataset, because they were in different formats and not in the same sequence. Again, removing comments made on news articles a day after the news was published was done manually, which took much time. Therefore, future work could automate this process and introduce data augmentation techniques, such as Generative Adversarial Networks (GANs) and autoencoders, to enhance the current framework.