An adaptive composite time series forecasting model for short‑term traffic flow


and so on. These methods use shallow temporal features and statistical characteristics of traffic flow to achieve forecasting. However, traffic flow always lies in a high-dimensional, nonlinear space, so traditional linear methods are no longer valid [6]. Therefore, some researchers have turned their attention to models that are better suited to nonlinear data. The Radial Basis Function (RBF) model [7], the Support Vector Regression (SVR) model [8], the K-Nearest Neighbor-SVR (KNN-SVR) model [9], and the Least-Square SVR (LSSVR) model [10] have been used to forecast short-term traffic flow, and these models are robust on large-scale data regression problems. However, when dealing with large-scale data, these models require significant time and memory. In addition, the forecasting accuracy can be significantly affected by the kernel parameters.
With the rapid development of deep learning and its great success in the big data area, a large number of deep learning based models have been proposed for short-term traffic flow forecasting [11]. Compared with traditional methods, deep learning based methods can extract more informative features from traffic flow; examples include the Stacked AutoEncoder (SAE) based traffic data forecasting method [12] and the Deep Belief Network (DBN) based traffic flow forecasting method [13]. These methods extract and transform the inherent features of traffic flow by using deep architectures built from multi-layer nonlinear processing units [14]. To further explore the temporal features of traffic flow, researchers proposed the Recurrent Neural Network (RNN) based method [15], the Long Short-Term Memory Network (LSTM) based method [16] and the Bi-directional Long Short-Term Memory Network (Bi-LSTM) based method [17]. In addition, Graph Neural Networks (GNNs) have been introduced to build new traffic flow forecasting models that characterize the spatial relationships in traffic flow [18]. A series of GNN-based methods have been proposed, such as the Spatial Temporal Graph Convolutional Network (STGCN) based method [19], the Dual-Channel Spatio-Temporal Graph Convolutional Network (DC-STGCN) based method [20] and the Graph WaveNet based method [21]. These GNN-based methods further improve the accuracy of traffic flow forecasting and demonstrate a remarkable advantage in spatio-temporal feature extraction [22].
Since traffic flow spans a long time period and covers a wide area, it always contains different distributions and complicated spatio-temporal relationships. For example, congested and uncongested traffic flows belong to different traffic states and exhibit different distributions. Figure 1 presents the traffic flow variation curve of an intersection in Qingdao within one day. The figure clearly shows that the traffic flow increases abruptly and remains high during the morning and evening peak hours, while during the non-peak hours the traffic flow is relatively small. This shows that the distribution and fluctuation of traffic flow differ between the congested and uncongested states. Hence, forecasting methods should be able to explore the different features of traffic flow. However, most existing methods treat all traffic flow equally and use a single network for data forecasting. This kind of approach may ignore the interactions between different traffic situations, which decreases the forecasting accuracy.
Therefore, we propose a new adaptive composite framework, named Long-Short-Combination (LSC), for short-term traffic flow forecasting. The LSC framework is composed of two parts: a data forecasting module and an attribute forecasting module. The data forecasting module aims to accurately forecast traffic flows with varying attributes. We first divide the traffic flow into two parts according to their attributes, and then the features of each part are extracted by a separate network. The attribute forecasting module forecasts the traffic attribute at each time point in the future time series and selects the corresponding model for traffic flow forecasting. By dividing the flow reasonably and combining the two modules, LSC improves the accuracy of traffic flow forecasting.
The main contributions of this paper can be summarized as follows:
• An adaptive composite framework is established for exploring the different features in different distributions of traffic flow. Through a selectable backpropagation method, the data can adaptively select a suitable network for training to improve the forecasting effect.
• An attribute forecasting module for traffic flow is designed to forecast the traffic attributes in the future time series, enabling the selection of different forecasting models for diverse traffic attributes.
• Experiments on two real-world datasets verify the practicability and effectiveness of the proposed method.
The rest of this paper is organized as follows. Related work reviews existing work on short-term traffic flow forecasting. Method describes in detail the architecture of the LSC model and the computational procedure using the divided data. Experiment and result analysis is devoted to the data description, the performance measures and the experiments. Conclusion summarizes the paper and discusses future work.

Related work
In this section, we introduce the related work on traffic flow forecasting. From the perspective of technical characteristics, traffic flow forecasting models can be divided into two categories: classical forecasting methods and deep learning models.

Classical forecasting methods
Early learning models usually used short-term time windows or statistical tendencies in the data to estimate future traffic flow. For example, He et al. [23] computed the Hurst exponent to estimate the period of change in traffic flow time series. Meanwhile, Pei et al. [24] investigated the fractal properties of traffic flow by calculating its fractal dimension.
On the other hand, Moretti et al. [25] established an urban traffic flow forecasting model by integrating hybrid modeling techniques, including statistics and neural networks. Lin et al. [26] used the Autoregressive Integrated Moving Average (ARIMA) model to forecast traffic flow by analyzing the fluctuation characteristics of the data. Yi et al. [27] used the k-nearest-neighbor method to identify the optimal smoothing factor and combined it with a generalized regression model to improve forecasting accuracy. In addition, Guo et al. [28] introduced the Kalman filter to forecast short-term traffic flow. However, traffic flow is characterized by randomness; when it shows irregular variations, the forecasting performance of these methods is unsatisfactory. To solve this problem and discover patterns in nonlinear, fluctuating historical data, Xiao et al. [29] constructed a dynamic traffic flow forecasting model using the radial basis function (RBF) to forecast the traffic flow of city roads and achieved better results. Liu et al. [9] proposed a hybrid forecasting model based on K-Nearest Neighbor (KNN) and Support Vector Regression (SVR): the KNN algorithm was used to reconstruct the traffic flow data and the SVR was used to forecast the short-term traffic flow. However, traditional machine learning performs unsatisfactorily on nonlinear data, with large limitations and inaccurate forecasting results. Moreover, the shallow architectures of these models can extract only superficial traffic flow patterns and cannot make effective use of massive traffic flow data.

Deep learning models
With the development of deep learning techniques, several methods have been successfully applied to the forecasting of time-series data. Guo et al. [30] designed a Back Propagation Neural Network (BPNN) for short-term traffic flow forecasting, which became the basis for many later algorithms. Zhao et al. [31] designed a one-dimensional Temporal Convolutional Network (TCN) model to obtain spatiotemporal information about traffic flow. Similarly, Yang et al. [32] improved the SAE by incorporating Empirical Mode Decomposition (EMD) to decompose a complex time series into a collection of simpler ones for network forecasting, further improving the forecasting accuracy. A Recurrent Neural Network (RNN) is an artificial neural network with hidden-layer node connections and closed loops; it explores time series inputs of any length by using internal memory cells. To better capture the temporal variation of traffic flow, Ma et al. [33] introduced the RNN into traffic flow forecasting and achieved good results. To overcome the vanishing and exploding gradients of RNNs during backpropagation, Zhao et al. [34] used multilayer LSTM networks for traffic flow forecasting, and the experimental results showed that LSTMs are more suitable than other deep neural networks. To further extract the temporal and spatial features of traffic flow, Du et al. [35] proposed a hybrid deep learning framework combining a Convolutional Neural Network (CNN) and an LSTM for traffic flow forecasting. In addition, Wu et al. [36] proposed an Attention-Based LSTM (Att-LSTM) model that incorporates an attention mechanism to further improve the forecasting accuracy. Ma et al. [37] further improved the LSTM network by combining the traditional LSTM network with a Bi-LSTM, and demonstrated through experiments the superiority of the model on short-term traffic flow forecasting.
Due to the complex spatiotemporal dependency of traffic data, further extraction of spatiotemporal features is necessary to explore the correlation between traffic data in time and space. Yu et al. [19] proposed the Spatio-Temporal Graph Convolutional Network (STGCN), which represents the traffic network as a graph and establishes a model with a complete convolutional structure. In addition, Pan et al. [20] designed the Dual-Channel Spatio-Temporal Graph Convolutional Network (DC-STGCN) to explore the correlation between the daily and weekly time components, further improving the accuracy of traffic flow forecasting. However, GCN-based traffic flow forecasting methods often struggle to capture the short-term and long-term temporal relationships in traffic flow data. To address these issues, Huo et al. [38] proposed a hierarchical traffic flow forecasting network to further improve forecasting performance.
Traffic data are complex time series, and the different underlying features of traffic data have different attributes. Some researchers have considered these differences. Sun et al. [39] proposed a framework built upon a multilayer perceptron (MLP), incorporating the idea of residual separation and wavelet decomposition techniques. This approach effectively captures various patterns and noise information within the traffic data. Similarly, Fang et al. [40] proposed a framework that separates complex traffic data into stable and fluctuating trends; a dual-channel spatio-temporal network then models these trends independently. Wang et al. [41] divided the city into grid cells and used the regional attributes to create a graph structure. By using Graph Convolutional Networks (GCN) to capture spatial correlations and Temporal Convolutional Networks (TCN) to capture temporal correlations of the roads, the model improves prediction. In addition, Cai et al. [42] proposed a novel node adjustment mechanism, which increases the number of Radial Basis Function (RBF) nodes in complex scenarios and decreases it under normal conditions. This adaptability allows the model to respond effectively to time-varying traffic states, such as peak and non-peak periods. Wang et al. [43] designed a model called varying spatiotemporal graph-based convolution (VSTGC) to express detailed features, such as vehicle type, braking state and external variables.
Up to now, different deep learning structures have been adopted to improve the accuracy of traffic flow forecasting. Unfortunately, little research has considered the different underlying features of traffic data with different attributes, especially the attributes of congestion and non-congestion, and the negative impact of training uniformly on congested and non-congested traffic data has thus been ignored. In comparison, our proposed LSC framework can explore traffic data with different attributes and significantly improve the accuracy of traffic data forecasting.

Method
To address the interaction problem between different traffic data distributions, we designed the LSC framework, as shown in Fig. 2. The internal modules, combined with the designed selective backpropagation, enable the LSC framework to adaptively select the appropriate model for prediction based on different traffic attributes.
Let $a(\cdot)$ denote the LSC framework transform function. The forecast process can then be represented as:

$$\hat{x}_{T+1} = a\left([x_{T-n}, x_{T-n+1}, \ldots, x_T]^{\mathsf{T}}\right)$$

where $[x_{T-n}, x_{T-n+1}, \ldots, x_T]^{\mathsf{T}}$ is the sequence of traffic flow observations over n historical time steps used as the input of the model, and $\hat{x}_{T+1}$ is the forecast value at the next state.
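As a minimal sketch of how such input-output pairs could be built from a raw series, the following Python snippet slides a fixed-length window over the observations and pairs each window with the next value. The function name `make_windows`, the parameter `n_hist` and the sample values are illustrative assumptions, not part of the LSC framework itself.

```python
from typing import List, Tuple


def make_windows(series: List[float], n_hist: int) -> List[Tuple[List[float], float]]:
    """Pair each run of n_hist consecutive observations with the value that follows it."""
    if n_hist <= 0:
        raise ValueError("n_hist must be positive")
    samples = []
    for end in range(n_hist, len(series)):
        history = series[end - n_hist:end]  # the n_hist most recent observations
        target = series[end]                # the next state to be forecast
        samples.append((history, target))
    return samples


# Hypothetical 5-minute flow counts, used only to show the shape of the output.
flows = [10.0, 12.0, 15.0, 14.0, 20.0]
samples = make_windows(flows, n_hist=3)
```

Each element of `samples` holds one model input and its one-step-ahead target; a multi-step variant would return several future values per window instead of one.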

LSC framework
In this paper, the LSC framework is designed to extract and categorize the data features of large and small traffic flows. The LSC framework consists of a data forecasting module and an attribute forecasting module (C model). The data forecasting module consists of a large traffic flow forecasting model (L model) and a small traffic flow forecasting model (S model). The structure of the LSC framework is illustrated in Fig. 2, and the training and testing processes of the entire LSC framework are detailed in Algorithm 1.
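The way the three models cooperate at inference time can be sketched as follows: for each future time point, the C model's output decides whether the L model's or the S model's forecast is kept. This is a hedged illustration only; the function name `combine_forecasts`, the 0.5 decision threshold and the sample numbers are assumptions, and Algorithm 1 remains the authoritative description.

```python
from typing import List


def combine_forecasts(l_pred: List[float], s_pred: List[float],
                      c_prob: List[float], threshold: float = 0.5) -> List[float]:
    """Select, per time point, the L forecast if the C model predicts a large flow, else the S forecast."""
    if not (len(l_pred) == len(s_pred) == len(c_prob)):
        raise ValueError("all inputs must cover the same time points")
    # c_prob[i] is assumed to be the probability that time point i is a large flow (label 1).
    return [l if p >= threshold else s for l, s, p in zip(l_pred, s_pred, c_prob)]


# Hypothetical outputs of the L, S and C models for three future time points.
combined = combine_forecasts([100.0, 110.0, 120.0], [30.0, 35.0, 40.0], [0.9, 0.2, 0.5])
```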
Specifically, for the task of traffic flow forecasting, we predict future traffic flow data at a 5-minute time granularity. Prior to training, we classify the data through traffic attribute division, labeling large traffic flows as 1 and small traffic flows as 0. Based on the data characteristics, the data are differenced and normalized before being input into the L, S, and C models, which are constructed based on L-B-LSTM. A selectable backpropagation scheme is used when training the L and S models.

Time series are an important research subject in econometrics. However, using time series data as a sample violates the assumption of random sampling [44], because each observation can be observed only once, at its own point in time. If the time series is stationary, the autocorrelation coefficient rapidly converges to zero as the time interval increases [45]. In this case, the sample drawn can represent the population, and the stationarity of the time series can replace random sampling. This effectively reduces spurious regression and enables the model to perform correct parameter estimation or statistical inference. Therefore, during the data processing stage, the stationarity of the time series must be tested and any non-stationary time series must be made stationary.
Taking the one-day traffic flow data of an intersection in Qingdao as an example, the autocorrelation graph is plotted in the left chart of Fig. 3. The slowly decreasing autocorrelation coefficient indicates that the series is non-stationary. Therefore, we use the difference operation to make the time series stationary [46], which is defined as follows:

$$\nabla^{k} X_t = \nabla^{k-1} X_t - \nabla^{k-1} X_{t-1}, \qquad \nabla^{0} X_t = X_t$$

where X is the time series, t is the index of the term in the series, and k is the order of the difference. For the experimental data we use the first-order difference:

$$\nabla X_t = X_t - X_{t-1}$$

In the right chart of Fig. 3, the autocorrelation coefficient rapidly converges to 0, indicating that a stationary time series is obtained after differencing. The time series is then normalized by scaling the data and converting the time series to dimensionless values. This improves both the speed with which the model finds the optimal solution and the accuracy of the results. The definition of standardization is as follows:
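A minimal Python sketch of this preprocessing step is given below. It applies k-th order differencing by repeating the first-order difference, and then standardizes the result. The z-score form (subtract the mean, divide by the standard deviation) is an assumption made for illustration, as are the function names and the sample values.

```python
import statistics
from typing import List


def difference(series: List[float], k: int = 1) -> List[float]:
    """Apply the first-order difference X_t - X_{t-1} k times."""
    out = list(series)
    for _ in range(k):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


def standardize(series: List[float]) -> List[float]:
    """Z-score scaling (assumed form): zero mean and unit population standard deviation."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    if std == 0:
        raise ValueError("cannot standardize a constant series")
    return [(x - mean) / std for x in series]


# Hypothetical flow counts, used only to illustrate the two steps.
raw = [3.0, 5.0, 9.0, 8.0]
diffed = difference(raw)           # first-order difference
scaled = standardize(diffed)       # dimensionless values
```

Each differencing pass shortens the series by one element, so forecasts made on the differenced scale have to be accumulated back onto the last observed value to recover flow counts.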

Attribute forecasting module and data forecasting module
The attribute forecasting module is implemented based on the L-B-LSTM forecasting model, as shown in Fig. 4. An LSTM serves as the first layer of the model to extract the hidden features of the time series. A Bi-LSTM is applied as the second layer, and its input dimension equals the output dimension of the first LSTM layer. An LSTM processes the time series in chronological order, so only historical information is considered while future information is ignored. As the second layer, the Bi-LSTM can fully learn the effective information extracted by the LSTM and avoids the loss of forecasting accuracy caused by noise in the original time series data. In addition, the Bi-LSTM can effectively extract the forward and backward dependencies in the traffic flow data [47][48][49].
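The Bi-LSTM model contains two independent LSTMs as hidden layers; the data propagation details are given below. The input sequence $[l_{t-n}, \ldots, l_{t-1}, l_t]$ is fed into the two independent LSTMs in forward and backward order for feature extraction. The forward output sequence $\overrightarrow{h}_t$ and the backward output sequence $\overleftarrow{h}_t$ are computed and combined as:

$$\overrightarrow{h}_t = \sigma_{\overrightarrow{h}}\left(W_{l\overrightarrow{h}}\, l_t + W_{\overrightarrow{h}\overrightarrow{h}}\, \overrightarrow{h}_{t-1} + b_{\overrightarrow{h}}\right)$$

$$\overleftarrow{h}_t = \sigma_{\overleftarrow{h}}\left(W_{l\overleftarrow{h}}\, l_t + W_{\overleftarrow{h}\overleftarrow{h}}\, \overleftarrow{h}_{t+1} + b_{\overleftarrow{h}}\right)$$

$$h_t = \delta\left(\overrightarrow{h}_t, \overleftarrow{h}_t\right)$$

where W is the weight matrix, b is the bias vector, $\sigma_{\overrightarrow{h}}$ is the forward activation function, $\sigma_{\overleftarrow{h}}$ is the backward activation function, and $\delta$ is the connection function.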
The LSTM has excellent inference and forecasting ability to obtain contextual information from long-range time series [50][51][52]. In L-B-LSTM, an LSTM is introduced as the forecasting layer of the model, as shown in Fig. 4. The input of this layer is the output of the Bi-LSTM.
Within each LSTM unit, the current hidden state $n_t$ is obtained from the contextual information $[h_{t-n}, \ldots, h_{t-1}, h_t]$ and the previous unit's hidden state $n_{t-1}$, and is used to forecast the next traffic flow state $x_{t+1}$. By iteratively accumulating useful contextual information, the final forecasting result is obtained. The computational process is expressed as shown in the following equations:

To constrain and guide the optimization of the model, the cost function $L(x_{t+1}, \hat{x}_{t+1})$ is set as the root mean square error function in the training phase, as shown in the following equation:

$$L(x_{t+1}, \hat{x}_{t+1}) = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_{t+1}^{(i)} - \hat{x}_{t+1}^{(i)}\right)^{2}}$$

where $\hat{x}_{t+1}$ is the forecast value of the model, $x_{t+1}$ is the measured value of traffic flow, and n is the number of training samples.

Since the traffic attributes are divided into large and small traffic flows, attribute forecasting is a binary classification task. In this paper, the embedding dimension of the L-B-LSTM hidden layer in the attribute forecasting module is set to 128. Afterward, Fully Connected (FC) layers with Sigmoid activation functions are connected to classify the traffic attributes. These layers enable the network to map the input features extracted by the L-B-LSTM onto the desired output classes. The Binary Cross-Entropy (BCE) loss function is selected, which is often the most appropriate loss function for binary classification tasks. It can be written as follows:

$$L_{BCE} = -\left[t \log p + (1 - t)\log(1 - p)\right]$$

where t represents the measured value and p represents the forecast value.
To forecast the traffic flow, the embedding dimension of the L-B-LSTM hidden layer in the traffic forecasting module is likewise set to 128, and the traffic features extracted by the L-B-LSTM are fed into three fully connected (FC) layers. The number of nodes in the final FC layer is reduced to 12 to obtain the future traffic forecasting results. Additionally, we incorporate dropout in our model to mitigate the risk of overfitting, thus enhancing the model's generalization ability and stability.

The selectable backpropagation
In order to ensure the correct training of the L and S models and to avoid the interaction between large and small traffic flow data during training, we use selectable backpropagation, depicted in Fig. 5, where the blue and orange inputs represent large and small traffic flows, respectively. The direction of the arrows represents the direction of data transmission. Red arrows indicate that the data classification label matches the assigned model, allowing loss calculation and backpropagation, while black arrows indicate the opposite. After the data are divided and labeled, each forecasting sample contains traffic flow data for 12 time points, and both large and small traffic flows are distributed in each sample. If backpropagation is performed directly, the loss of small traffic flow data will inevitably affect the L model, and vice versa.
Therefore, selectable backpropagation ignores the forecasts for small traffic flow data in the L model and for large traffic flow data in the S model, forcing each model to focus only on the data class assigned to it. Specifically, when training the L model, only the large traffic flow data contribute to the loss; similarly, for the S model, only the data classified as small traffic flow contribute to the loss. This ensures that the hidden parameters in the network are updated properly for both large and small traffic flow data.

Fig. 5 The selectable backpropagation for the L and S models. Blue and orange inputs represent large and small traffic flow values, respectively
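The masking idea behind selectable backpropagation can be sketched in plain Python as a loss that only counts the time points whose label matches the model being trained. This is an illustrative assumption about how the selection might be realized, not the authors' implementation; the function name `masked_rmse` and the sample numbers are hypothetical.

```python
import math
from typing import List


def masked_rmse(preds: List[float], targets: List[float],
                labels: List[int], model_label: int) -> float:
    """RMSE over the time points whose attribute label equals model_label (1 = large, 0 = small)."""
    squared = [(p - t) ** 2
               for p, t, lab in zip(preds, targets, labels)
               if lab == model_label]
    if not squared:
        return 0.0  # no matching time points, so this sample adds no loss
    return math.sqrt(sum(squared) / len(squared))


# One hypothetical sample with four time points: two small flows and two large flows.
preds = [10.0, 50.0, 12.0, 60.0]
targets = [12.0, 40.0, 12.0, 70.0]
labels = [0, 1, 0, 1]

l_loss = masked_rmse(preds, targets, labels, model_label=1)  # used to update the L model
s_loss = masked_rmse(preds, targets, labels, model_label=0)  # used to update the S model
```

In an automatic differentiation framework, excluding a time point from the loss in this way means it produces no gradient, so the mismatched data leave the model's parameters untouched.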

Experiment and result analysis
In this section, we test the effectiveness of traffic flow classification and of the LSC framework, using forecasting accuracy as the measure of performance. Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Root Mean Square Error (RMSE) are chosen as the evaluation indexes, and the test data come from two datasets collected by traffic flow detectors at multiple spots. The training procedure is implemented with PyTorch on a GPU (GeForce RTX 3090). Comparison experiments are carried out against other strong models, and the experimental results are analyzed. The best experimental results in the tables are bolded, and the second-best ones are underlined. This framework addresses the problem of forecasting traffic data at an individual intersection and does not currently involve the spatial characteristics of traffic data.

Data description
The Qingdao dataset and the PeMS08 dataset are used to conduct the experiments. The Qingdao dataset contains one month (30 days) of traffic flow data collected by sensors at an intersection of the road network in Qingdao, China, with a sampling interval of 5 min and 288 samples collected per day. In the experiment, the traffic flow data of the first 18 days (5184 sampling intervals, 60% of the total data) are used as the training set, the data of the 19th to 24th days (1728 sampling intervals, 20% of the total data) are used as the validation set, and the remaining data (1728 sampling intervals, 20% of the total data) are used as the test set.
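The sample counts quoted above follow directly from the sampling interval and the 60/20/20 proportions. The short Python sketch below reproduces that arithmetic with a chronological split; the function name `chronological_split` and its percentage parameters are illustrative assumptions.

```python
from typing import Tuple

SAMPLES_PER_DAY = 24 * 60 // 5  # one sample every 5 minutes


def chronological_split(n_samples: int, train_pct: int = 60,
                        val_pct: int = 20) -> Tuple[Tuple[int, int], ...]:
    """Return (start, end) index ranges for the train, validation and test sets, in time order."""
    train_end = n_samples * train_pct // 100
    val_end = train_end + n_samples * val_pct // 100
    return (0, train_end), (train_end, val_end), (val_end, n_samples)


total = 30 * SAMPLES_PER_DAY  # one month of data
train, val, test = chronological_split(total)
sizes = [end - start for start, end in (train, val, test)]
```

Splitting by position rather than at random keeps every validation and test sample later in time than the training data, which avoids leaking future observations into training.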
The PeMS08 dataset was collected in California by the Caltrans Performance Measurement System (PeMS), which samples every 30 s. PeMS08 contains traffic data from 170 sensors in San Bernardino from July to August 2016. The raw data are aggregated into 5-minute intervals, so there are 288 records per day, with a total of 17857 valid records at each point. In the experiment, three spots are selected. 60% of the traffic flow data are used as the training set, 20% as the validation set, and the rest as the test set.
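As a small illustration of the aggregation step, the Python sketch below groups ten consecutive 30-second readings into one 5-minute record. Summing the vehicle counts within each window and dropping a trailing partial window are assumptions made here for illustration, as is the function name `aggregate_counts`.

```python
from typing import List


def aggregate_counts(counts_30s: List[int], factor: int = 10) -> List[int]:
    """Sum every `factor` consecutive 30-second counts into one 5-minute count."""
    usable = len(counts_30s) - len(counts_30s) % factor  # ignore an incomplete final window
    return [sum(counts_30s[i:i + factor]) for i in range(0, usable, factor)]


# A full day of hypothetical 30-second readings, one vehicle per reading.
readings_per_day = 24 * 60 * 60 // 30
five_minute = aggregate_counts([1] * readings_per_day)
```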

Evaluation indexes
To evaluate the performance of the LSC framework, Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Root Mean Square Error (RMSE) are selected as the evaluation metrics [53]; these are commonly used to assess traffic flow forecasting performance. They are calculated as follows:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|h_i - \hat{h}_i\right|$$

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{h_i - \hat{h}_i}{h_i}\right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(h_i - \hat{h}_i\right)^{2}}$$

In the above three equations, n is the sample size of the data, $\hat{h}_i$ represents the forecast traffic flow value, and $h_i$ represents the observed value. To evaluate the effectiveness of the C model in classifying traffic data into large and small flows, the F1 score is selected as an evaluation index, which is calculated as follows:

$$\mathrm{precision} = \frac{TP}{TP + FP}, \qquad \mathrm{recall} = \frac{TP}{TP + FN}, \qquad F1 = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$$

where TP, FP, and FN represent the numbers of true positives, false positives, and false negatives, respectively. Precision is the proportion of samples forecast as positive by the classifier that are actually positive. Recall is the proportion of actual positive samples that the classifier forecasts correctly. The F1 score is the harmonic mean of precision and recall, and a higher F1 means better model performance.
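The four evaluation indexes named above can be computed with a few lines of Python. The sketch below uses the standard definitions of MAE, MAPE, RMSE and the F1 score; the function names and the sample numbers are illustrative, and MAPE is reported in percent.

```python
import math
from typing import List


def mae(obs: List[float], pred: List[float]) -> float:
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def mape(obs: List[float], pred: List[float]) -> float:
    # MAPE divides by the observation, so it is undefined when an observation is zero.
    if any(o == 0 for o in obs):
        raise ValueError("MAPE is undefined for zero observations")
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs: List[float], pred: List[float]) -> float:
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def f1_score(true_labels: List[int], pred_labels: List[int]) -> float:
    pairs = list(zip(true_labels, pred_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical observations, forecasts and attribute labels.
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
f1 = f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```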

Framework parameters
The size of the input sequence and the number of hidden nodes in each layer are important to the forecasting accuracy of the LSC framework. We find that LSTM layer widths of 128 nodes are best suited to the L, S and C models. All models are trained using PyTorch 1.9.1. The L-B-LSTM layers are trained using an SGD optimizer with a learning rate of $10^{-3}$, while the fully connected layers are trained using an Adam optimizer with a learning rate of $5 \times 10^{-4}$. The best-performing model on the validation set is selected for testing and for evaluating our framework.

Effect of L-B-LSTM
As shown in Fig. 2, L-B-LSTM is the backbone structure of the LSC framework. To verify its effectiveness, we compare the LSC framework using L-B-LSTM as the backbone with LSC using LSTM, Bi-LSTM, and Att-LSTM as the backbone. Specifically, in the LSC framework, the features are explored through the L model and the S model respectively, and the C model is used as the attribute forecasting module. To show the effectiveness of L-B-LSTM, MAE, MAPE and RMSE are selected as the accuracy evaluation indicators for comparing the L and S models on the Qingdao dataset (A, B), and the F1 score is the evaluation indicator of the C model. The experimental results are shown in Table 1. L-B-LSTM performs well in the traffic forecasting module (L and S models), especially on the MAE indicator. Att-LSTM performs slightly better on certain evaluation indicators, but its overall performance falls slightly short of L-B-LSTM. Furthermore, the attribute forecasting module (C model) performs a simple binary classification task based on dividing at the median value, so the models perform similarly, with good classification results. Therefore, L-B-LSTM is selected as the main structure of the model in the construction of the overall LSC framework.
In the LSC framework, the embedding dimension of the L-B-LSTM hidden layer is set to 128. To verify that this number of nodes is reasonable, we train hidden layers with different numbers of nodes on Qingdao dataset B. The results are shown in Fig. 6. A width of 128 nodes is best suited to the framework, giving the lowest values on all evaluation indexes.
To demonstrate the effect of choosing the median as the division standard, we compare it with the mean, the mode, the variance and the standard deviation as the division standard for the L module, the S module and the LSC framework on Qingdao dataset B. The experimental results are shown in Fig. 7. Using the median as the division standard performs better than the others, both for the whole framework and for the individual modules, because the median ensures that the amount of training data in the L and S modules is balanced.
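The balancing property of the median can be seen in a short Python sketch of the labeling step. Marking values strictly above the median as large (label 1) is an assumption about how ties are handled, and the function name `label_by_median` and the sample counts are hypothetical.

```python
import statistics
from typing import List, Tuple


def label_by_median(flows: List[float]) -> Tuple[float, List[int]]:
    """Label each flow 1 (large) if it exceeds the median of the series, else 0 (small)."""
    threshold = statistics.median(flows)
    labels = [1 if x > threshold else 0 for x in flows]
    return threshold, labels


# Hypothetical 5-minute flow counts.
threshold, labels = label_by_median([12.0, 80.0, 35.0, 60.0, 20.0, 95.0])
n_large = sum(labels)
n_small = len(labels) - n_large
```

With distinct values, half of the samples fall on each side of the threshold, so the L and S models receive the same amount of training data.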

Effect of data division
In the model introduction section, it is hypothesized that dividing the traffic flow can effectively avoid the interaction between traffic flows with different attributes, and that LSC can effectively extract the hidden features of the time series from the divided traffic data. To verify this hypothesis, we use MAPE as the main evaluation index, and the results are shown in Fig. 8, where "self" denotes a model forecasting from the traffic flow time series without data division, while "LSC" denotes forecasting with the LSC framework after dividing the traffic flow data into large and small flows. The figure clearly shows that the forecasting performance improves significantly after dividing the traffic flow data. Among the models, the L-B-LSTM model combined with LSC has the best forecasting performance, and its average result is 1% better than that of the L-B-LSTM model alone. Furthermore, using L-B-LSTM as the backbone network significantly outperforms the other methods compared, with average improvements of 2.7%, 1.27% and 0.89% over LSTM, Bi-LSTM, and Att-LSTM, respectively. This also demonstrates the effectiveness of using L-B-LSTM as the backbone structure of the LSC framework. Therefore, it can be proved that the reasonable division of traffic flow can better

Effect of LSC
To demonstrate the performance of our proposed LSC framework and to verify the differences between deep learning and traditional machine learning algorithms, this section compares the LSC model with various methods. Since this framework only solves the short-term traffic prediction problem at a single intersection and does not use the road structure to further capture spatial correlations, it is not currently compared with graph-based networks. The traditional machine learning algorithms include: ARIMA (a combination of Autoregressive and Moving Average models, a traditional time series forecasting method), RBF (Radial Basis Function neural network, a single-layer neural network structure with an RBF kernel), and SVR (Support Vector Regression, which applies support vector machines to regression problems). The deep learning algorithms include: BPNN (Multilayer Back Propagation Neural Network, the most basic deep learning forecasting model), RNN (Recurrent Neural Network, with good time series information processing capabilities), LSTM (Long Short-Term Memory Network, a variation of the traditional RNN that mitigates vanishing and exploding gradients), Bi-LSTM (bi-directional LSTM, which captures both forward and backward information), L-B-LSTM (which performs better in processing short time series information), Att-LSTM (which retains the intermediate outputs of the LSTM encoder for the input sequence and uses an attention mechanism to learn the input selectively), LSTM-GRU [54] (a mixture of LSTM and Gated Recurrent Unit for better mining of time series features) and PSO-Bi-LSTM [55] (based on the Bi-LSTM model and optimized using the PSO technique to expand the global search capability of the model). The experimental comparison results on the Qingdao dataset are shown in Table 2, and those on the PeMS08 dataset are presented in Table 3. The average results on the Qingdao dataset and the PeMS08 dataset are shown in Table 4.
As shown in the tables, the deep learning algorithms outperform the traditional machine learning algorithms in exploring traffic flow information. SVR and RBF are better than the conventional ARIMA, which has the lowest accuracy. Although the forecasting results of the classical deep learning methods BPNN and LSTM are decent, they are still worse than those of LSTM-GRU and PSO-Bi-LSTM. The LSC framework performs better in traffic flow forecasting than the other deep learning methods. These results demonstrate the effectiveness and feasibility of the proposed framework.
Furthermore, to demonstrate the transferability of the framework, we trained the model on the PeMS08 dataset (E) and adjusted the data dividing value before testing it on the Qingdao dataset (A). As shown in Table 5, the comparison with other models indicates that the LSC model exhibits excellent transferability. It can still perform well even after changing the dataset.
In traffic data, there are significant differences in traffic flow characteristics between weekdays and weekends. It is typically manifested as increased traffic volume, increased congestion, and less distinct morning and evening peak hours. To make further

Conclusion
In this study, we propose an adaptive composite framework, LSC, for short-term traffic flow forecasting, which aims to avoid the influence of traffic flow with different attributes on model training and to better extract the hidden information in the time series. We divide the processed time series into large traffic flow data and small traffic flow data based on traffic attributes. The LSC model adaptively learns different regression models for forecasting large and small traffic volumes, and selects the corresponding model for each time node in the future time series through an attribute forecasting module. In addition, an L-B-LSTM model with two LSTM layers and one Bi-LSTM layer is selected for the internal structure of the framework, which is combined with an optional backpropagation method to further improve the performance of the composite framework.
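The composite idea summarized above (split by attribute, learn one regressor per attribute, pick a regressor per future time node) can be illustrated with a deliberately small sketch. It is not the paper's implementation: a one-lag least-squares fit stands in for the L-B-LSTM regressors, a fixed threshold stands in for the data dividing value, and the attribute forecaster is replaced by the assumption that the next step keeps the attribute of the last observation. All function names are hypothetical.

```python
def fit_linear(pairs):
    """Least-squares fit of y = a * x + b on (previous, next) pairs."""
    if not pairs:
        raise ValueError("no training pairs for this attribute")
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    if var_x == 0:
        return 0.0, mean_y  # constant input: fall back to the mean target
    a = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / var_x
    return a, mean_y - a * mean_x


def fit_composite(series, threshold):
    """Train one stand-in regressor per attribute (large / small flow).

    Assumption: a training pair belongs to the 'L' model when its target
    value is at or above `threshold`, and to the 'S' model otherwise.
    """
    pairs = list(zip(series[:-1], series[1:]))
    large = [(x, y) for x, y in pairs if y >= threshold]
    small = [(x, y) for x, y in pairs if y < threshold]
    return {"threshold": threshold, "L": fit_linear(large), "S": fit_linear(small)}


def predict_next(model, last_value):
    """Select a sub-model and forecast one step ahead.

    Stand-in for the attribute forecasting module: the next step is
    assumed to keep the attribute of the last observed value.
    """
    key = "L" if last_value >= model["threshold"] else "S"
    a, b = model[key]
    return key, a * last_value + b


if __name__ == "__main__":
    # Toy series for illustration only (assumption, not data from the paper).
    flow = [10, 12, 14, 50, 52, 54, 12, 10, 11, 48, 51]
    model = fit_composite(flow, threshold=30)
    print(predict_next(model, flow[-1]))
```

Keeping the selection step separate from the regressors means either part can be swapped out, for example by replacing the threshold rule with a learned classifier, without touching the other.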
Compared with existing methods, the LSC framework achieves superior forecasting performance on the traffic flow datasets. In the future, we will attempt to combine the present framework with the spatial structure of roads to explore network-wide features. We will also investigate more advanced traffic attribute classification methods, extending the current binary classification to multi-class classification. Additionally, we will attempt to capture hidden temporal features during traffic state transitions to further enhance the prediction performance.

Fig. 1 Traffic flow change curve of an intersection in Qingdao during one day. The x-axis represents time, and the y-axis represents the traffic flow. Red and blue represent congested and uncongested traffic flow, respectively

Fig. 2 The LSC framework, which uses element-wise addition and element-wise product operations. The L-model and S-model are used to predict time-series states with different attributes, while the C-model is responsible for distinguishing traffic flow attributes and combining the outputs of the two models

Fig. 3 Comparative autocorrelation analysis of traffic flow time series at an intersection in Qingdao. The left chart presents the autocorrelation plot of the original traffic flow time series, illustrating the raw data patterns. The right chart displays the autocorrelation plot of the traffic flow time series after smoothing

Fig. 6 Effect of the number of nodes in the LSC framework on Qingdao Dataset B. The y-axis represents the evaluation indexes, and the x-axis represents the number of nodes

Fig. 7 Effect of different division methods in the L module (a-c), S module (d-f) and LSC framework (g-i) on Qingdao Dataset B. The x-axis represents the data division standard, and the y-axis represents the evaluation indexes

Table 1 Effectiveness of L-B-LSTM in LSC in Qingdao Dataset (A, B). The best experimental results in the table are bolded

Table 2 Forecast performances of different models in Qingdao Dataset (A, B, C, D). The best experimental results in the table are bolded, and the suboptimal ones are underlined

Table 3 Forecast performances of different models in PeMS08 Dataset (E, F, G). The best experimental results in the table are bolded, and the suboptimal ones are underlined

Table 4 Average forecast performances of different models in Qingdao Dataset and PeMS08 Dataset. The best experimental results in the table are bolded, and the suboptimal ones are underlined

Table 5 Verification of model transferability to different datasets by training on the PeMS08 Dataset (E) and testing on the Qingdao Dataset (A). The best experimental results in the table are bolded, and the suboptimal ones are underlined

Table 6 Forecast performances of different models in Qingdao Dataset (A) on weekdays and weekends. The best experimental results in the table are bolded, and the suboptimal ones are underlined

Table 7 Forecast performances of different models in PeMS08 Dataset (E) on weekdays and weekends. The best experimental results in the table are bolded, and the suboptimal ones are underlined