Multi-criteria collaborative filtering recommender by fusing deep neural network and matrix factorization

Recommender systems have been an efficient strategy to deal with information overload by producing personalized predictions. Recommendation systems based on deep learning have achieved impressive results, but most of them are traditional recommender systems that use a single rating. In this work, we introduce a multi-criteria collaborative filtering recommender that combines a deep neural network and matrix factorization. Our model consists of two parts: the first part uses a fused model of a deep neural network and matrix factorization to predict the criteria ratings, and the second employs a deep neural network to predict the overall rating. Experimental results on two datasets, including a real-world dataset, show that the proposed model outperforms several state-of-the-art methods across different datasets and performance evaluation metrics.

only on interactions between users and items, such as ratings [3]. Matrix Factorization (MF) is the most popular CF technique; it maps users and items into a joint latent space, using a vector of latent features to represent each user or item [4]. The predicted interaction of a user with an item is then obtained from the inner product of their latent vectors. Although MF is effective, such a simple choice of interaction function is insufficient to model the complex relations between users and items.
Deep Learning (DL) has achieved remarkable results in many research fields, such as speech recognition, natural language processing, computer vision, and recently recommender systems, where Deep Neural Networks (DNNs) have proved their capability of modeling nonlinear user-item relationships and approximating any continuous function [5].
Therefore, it is natural to fuse a DNN with MF to formulate a more general model that exploits both the nonlinearity of the DNN and the linearity of MF to enhance recommendation accuracy.
The objective of this paper is to propose a multi-criteria collaborative filtering recommender that fuses DNN and MF, to improve collaborative filtering performance. Our model consists of two parts: in the first part, we take the user and item features and feed them as input to a fused model of a DNN and MF to predict the criteria ratings; in the second part, we use a deep neural network to predict the overall rating.
Experiments on two datasets, including a real-world dataset, show that our proposed model achieves significant improvements over other state-of-the-art methods.
The main contributions of this work are as follows: 1. We present a multi-criteria collaborative filtering recommender that fuses DNN and MF, combining the non-linearity of DNN and the linearity of MF in a multi-criteria RS for modeling user-item latent structures. 2. We conduct comprehensive experiments on two datasets, including a real-world dataset, to show the efficiency of our model and the importance of using multi-criteria ratings and deep learning in collaborative filtering.
The rest of this paper is organized as follows: In Sect. 2, we survey related work. Section 3 gives a detailed overview of the system. Section 4 presents the experimental evaluations and discussion, and Sect. 5 provides a brief conclusion and potential future work.

Related work
Multi-criteria recommendation techniques can be divided into two general classes: memory-based and model-based methods. In memory-based methods, the similarity can be computed in two ways: the first approach calculates the similarity on each criteria rating separately in the traditional way and then aggregates the values into a single similarity using an aggregation method such as the average [6], the worst-case [6], or a weighted sum of the individual similarities [7]. The second approach uses multidimensional distance metrics (such as the Euclidean, Manhattan, and Chebyshev distances) to calculate the distance between multi-criteria ratings [6]. Model-based methods use the user-item ratings to learn a predictive model and later use this model to predict unknown ratings; examples include Probabilistic Modeling [8,9], Multilinear Singular Value Decomposition (MSVD) [10], Support Vector Regression (SVR) [11], and aggregation-function-based methods [6]. The aggregation-function-based approach assumes that there is a relation between an item's overall rating and its multi-criteria ratings, i.e., the overall rating is not independent of the criteria ratings.
Many efforts have been made to improve MF: Koren [12] merged MF and neighborhood models; Wang et al. [13] combined MF with topic models of item content; Rendle [14] introduced Factorization Machines, which combine Support Vector Machines with factorization models; and He et al. [15] proposed the Neural Matrix Factorization (NeuMF) model, which changed the linear nature of MF by combining it with a Multi-Layer Perceptron (MLP).
There has been relatively little research on applying deep learning to collaborative filtering. Salakhutdinov et al. [16] proposed a Restricted Boltzmann Machines model to learn the correlation between item ratings; thereafter, Georgiev et al. [17] extended this work by adding both user-user and item-item correlations. Ouyang et al. [18] used an autoencoder to model users' ratings on items in autoencoder-based collaborative filtering. He et al. [15] introduced the neural collaborative filtering model, which uses an MLP to learn the interaction function. Nassar et al. [19] presented the deep multi-criteria collaborative filtering (DMCCF) model, which is, to our knowledge, the only previous attempt at applying deep learning and multi-criteria ratings to collaborative filtering. The model follows the aggregation-function-based approach: a deep neural network predicts the criteria ratings, and another DNN learns the relationship between the criteria ratings and the overall rating in order to predict the overall rating.

Method
The proposed model is based on the model that Nassar et al. [19] presented. It contains three steps: a. Predict the criteria ratings r_1, r_2, ..., r_k using a DNN. b. Learn the aggregation function f, which represents the relationship between the criteria ratings and the overall rating, using a DNN. c. Predict overall ratings using the predicted criteria ratings and the aggregation function f.
In our model, we use a fused model of DNN and MF to predict the criteria ratings r_1, r_2, ..., r_k in the first step, while in the second step we keep using a DNN to learn the aggregation function f, as illustrated in Fig. 1.

Criteria ratings model
This model is used to predict the criteria ratings of a user for an item. In this model, we fuse MF and a DNN as He et al. [15] proposed in their NeuMF framework: MF applies a linear kernel to model the latent feature interactions, while the DNN uses a nonlinear kernel to learn the interaction function from the data.

MF
We can use the item ID or user ID as a feature since it is unique, but the ID is a categorical feature. Therefore, we convert the IDs into embedding vectors, initialized with random values that are adjusted to minimize the loss function during model training.
In the input layer, the input vector $x$ is given by:

$$x = v_u \odot v_i$$

where $v_u$ and $v_i$ are the user and item embedding vectors and $\odot$ is the element-wise product of vectors. The formula of the output layer is:

$$\hat{r} = a_{out}(w^T x)$$

where $a_{out}$ and $w$ are the activation function and the weight vector of the output layer.
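As a minimal sketch of the MF branch (dimensions, the identity output activation, and the all-ones weight vector below are illustrative assumptions, not the paper's trained parameters):

```python
import random

def make_embedding_table(num_ids, dim, std=0.05, seed=42):
    # One latent vector per user/item ID, initialized from a normal
    # distribution (mean 0, std 0.05, as in the settings section); in
    # training these values would be adjusted to minimize the loss.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(num_ids)]

def mf_predict(v_u, v_i, w, a_out=lambda z: z):
    # Element-wise product of the embeddings, then a weighted sum
    # passed through the output activation a_out.
    x = [a * b for a, b in zip(v_u, v_i)]
    return a_out(sum(wj * xj for wj, xj in zip(w, x)))

user_emb = make_embedding_table(num_ids=100, dim=8)
item_emb = make_embedding_table(num_ids=50, dim=8, seed=7)
r_hat = mf_predict(user_emb[3], item_emb[12], w=[1.0] * 8)
```

With unit weights and an identity activation, `mf_predict` reduces to the plain inner product of the two embeddings, which is exactly the classical MF prediction.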

DNN
The input vector $x$ is the concatenation of the user and item embedding vectors $v_u$ and $v_i$:

$$x = \begin{bmatrix} v_u \\ v_i \end{bmatrix}$$

This is followed by a number of dense Rectified Linear Unit (ReLU) layers; we chose the ReLU activation function because it is the most efficient [20]. It is defined as:

$$\mathrm{ReLU}(z) = \max(0, z)$$

The output of a hidden layer $l$ is formulated as:

$$z_l = \mathrm{ReLU}(W_l z_{l-1} + b_l)$$

where $W_l$ and $b_l$ are the weights and biases of layer $l$. In the model output layer, we predict the user criteria ratings $r_1, r_2, \ldots, r_k$ from the output of the last hidden layer.
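The DNN branch can be sketched as a plain forward pass; the layer sizes and weights below are toy values for illustration, not the paper's configuration:

```python
def relu(z):
    # ReLU(z) = max(0, z), applied element-wise.
    return [max(0.0, v) for v in z]

def dense(x, W, b):
    # W is a list of rows: one row of input weights per output neuron.
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def dnn_forward(v_u, v_i, layers):
    # Input is the concatenation of the user and item embedding vectors,
    # followed by dense ReLU hidden layers: z_l = ReLU(W_l z_{l-1} + b_l).
    z = list(v_u) + list(v_i)
    for W, b in layers:
        z = relu(dense(z, W, b))
    return z

# Tiny example: 2-dim embeddings, one hidden layer with 2 units.
layers = [([[1.0, 0.0, -1.0, 0.0],
            [0.0, 1.0, 0.0, 1.0]], [0.0, 0.0])]
z = dnn_forward([1.0, 2.0], [3.0, 4.0], layers)
```

Note the contrast with the MF branch: here the embeddings are concatenated rather than multiplied element-wise, so the network must learn the interaction function itself.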

Overall rating deep neural network
The overall rating DNN is used to learn the aggregation function $f$, which represents the relationship between the overall rating $r_0$ and the criteria ratings $r_1, r_2, \ldots, r_k$, in order to predict the overall rating:

$$r_0 = f(r_1, r_2, \ldots, r_k)$$

In the input layer, the input vector is the criteria ratings $r_1, r_2, \ldots, r_k$ for user $u$ and item $i$, as shown in Fig. 2. We normalize the continuous features $r_1, r_2, \ldots, r_k$ because DNNs are sensitive to the scaling and distribution of their inputs [21]. The normalized value of a sample $r_i$ is calculated as:

$$r_i' = \frac{r_i - m_i}{s_i}$$

with $m_i$ the mean of the training samples for rating $r_i$ and $s_i$ the standard deviation of the training samples for rating $r_i$. The input vector thus becomes:

$$x = [r_1', r_2', \ldots, r_k']$$

This is followed by a number of hidden layers, where the output of a hidden layer is again $z_l = \mathrm{ReLU}(W_l z_{l-1} + b_l)$. In the output layer, a single neuron predicts the overall rating $r_0$. We used the Adam optimizer [22] and the MAE loss function in both parts.
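The standardization step above can be sketched as follows; the tiny training set is hypothetical:

```python
def fit_stats(train_rows):
    # Per-criterion mean m_i and standard deviation s_i,
    # estimated over the training samples only.
    k = len(train_rows[0])
    n = len(train_rows)
    means = [sum(row[i] for row in train_rows) / n for i in range(k)]
    stds = [(sum((row[i] - means[i]) ** 2 for row in train_rows) / n) ** 0.5
            for i in range(k)]
    return means, stds

def standardize(r, means, stds):
    # r_i' = (r_i - m_i) / s_i for each criterion i.
    return [(ri - mi) / si for ri, mi, si in zip(r, means, stds)]

means, stds = fit_stats([[1.0, 2.0], [3.0, 4.0]])
x = standardize([3.0, 4.0], means, stds)
```

Fitting the statistics on the training set and reusing them at prediction time keeps the test-time inputs on the same scale the network saw during training.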

Recommendation
After model training is finished (each part was trained individually, independently of the other), we can use the model to predict a user's overall rating on new items. The recommendation process happens as shown in Fig. 1; for each user u and item i pair we: a) Get the user ID and the item ID, and use them as input to the criteria ratings model to compute the criteria ratings r'_1, r'_2, ..., r'_k. b) Normalize the computed criteria ratings r'_1, r'_2, ..., r'_k and use them as input to the overall rating DNN to compute the overall rating r'_0. c) Recommend items to the user, as in traditional recommender systems, using the overall rating r'_0.
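Steps a)-c) can be sketched end-to-end with stub callables standing in for the two trained networks; the stub models, signatures, and statistics below are all hypothetical:

```python
def recommend(user_id, candidate_items, criteria_model, overall_model,
              means, stds, top_n=10):
    scored = []
    for item_id in candidate_items:
        r_crit = criteria_model(user_id, item_id)       # a) criteria ratings
        r_norm = [(r - m) / s                           # b) normalize ...
                  for r, m, s in zip(r_crit, means, stds)]
        r0 = overall_model(r_norm)                      # ... then aggregate
        scored.append((item_id, r0))
    scored.sort(key=lambda pair: pair[1], reverse=True)  # c) rank by r'_0
    return scored[:top_n]

# Stub models: the criteria model returns two toy ratings per item,
# and the overall model simply sums them.
criteria_stub = lambda u, i: [float(i), float(i) / 2.0]
overall_stub = lambda r: sum(r)
top = recommend(0, [1, 2, 3], criteria_stub, overall_stub,
                means=[0.0, 0.0], stds=[1.0, 1.0], top_n=2)
```

In the actual system, `criteria_model` and `overall_model` would be the two trained networks, and `means`/`stds` the statistics fitted on the training set.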

Results and discussion
Dataset

We evaluated our model on two multi-criteria rating datasets: a real-world TripAdvisor dataset and a Movies dataset.

TripAdvisor dataset
A multi-criteria rating dataset for hotels. It includes an overall rating and seven criteria ratings: Value, Rooms, Location, Cleanliness, Check in/front desk, Service, and Business service; ratings range between 1 and 5. Table 1 shows the statistics of the dataset and Table 2 shows the distribution of the different criteria ratings and the overall rating.

Movies dataset

A multi-criteria rating dataset for movies, available on GitHub. It contains four criteria ratings and an overall rating; ratings range between 1 and 13. Tables 3 and 4 show the dataset statistics and the distribution of the different criteria ratings and the overall rating.

Evaluation
To evaluate the performance of our model we used the same metrics used in [19].
a. Mean Absolute Error (MAE) [23]:

$$\mathrm{MAE} = \frac{1}{M} \sum_{(u,i)} |r_{ui} - \hat{r}_{ui}|$$

where $r_{ui}$ is the true rating of user $u$ for item $i$, $\hat{r}_{ui}$ the predicted rating, and $M$ the size of the test set.

b. F-measure ($F_1$ and $F_2$) [23]:

$$F_\beta = \frac{(1+\beta^2) \cdot P \cdot R}{\beta^2 \cdot P + R}$$

where $P$ is the precision and $R$ the recall.

c. Fraction of Concordant Pairs (FCP) [24]:

$$\mathrm{FCP} = \frac{\sum_u n_c^u}{\sum_u n_c^u + \sum_u n_d^u}$$

where $n_c^u = |\{(i,j) \mid \hat{r}_{ui} > \hat{r}_{uj} \text{ and } r_{ui} > r_{uj}\}|$ is the number of concordant pairs for user $u$, and the number of discordant pairs $n_d^u$ is calculated in a similar way.

d. Mean Reciprocal Rank (MRR) [25], the average over all users of the reciprocal rank of the first relevant item.
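These metrics can be computed directly; a minimal sketch in pure Python (pairwise concordance counted per user):

```python
def mae(true, pred):
    # Mean Absolute Error over the test set.
    return sum(abs(t - p) for t, p in zip(true, pred)) / len(true)

def f_measure(precision, recall, beta=1.0):
    # F1 with beta=1, F2 with beta=2.
    return ((1 + beta ** 2) * precision * recall
            / (beta ** 2 * precision + recall))

def concordant_discordant(true, pred):
    # A pair (i, j) with pred_i > pred_j is concordant when
    # true_i > true_j and discordant when true_i < true_j;
    # FCP is then sum(nc) / (sum(nc) + sum(nd)) over all users.
    nc = nd = 0
    for i in range(len(true)):
        for j in range(len(true)):
            if pred[i] > pred[j]:
                if true[i] > true[j]:
                    nc += 1
                elif true[i] < true[j]:
                    nd += 1
    return nc, nd
```

The quadratic pair loop is fine for per-user item lists; for very long lists a sorted-merge counting scheme would be preferable.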

Settings
We used Keras with TensorFlow as a backend to implement our model.

Criteria Ratings Model Settings
We conducted several experiments to find the optimal parameters for DNN and MF.
• For DNN, we randomly initialized the DNN parameters using a normal distribution with a mean of 0 and a standard deviation of 0.05. We used the Adam optimizer with a 0.001 learning rate and parameter values as provided in [22].
For the TripAdvisor dataset, we used a batch size of 512 and set epochs to 2. We set the user and item embedding vector sizes to 64 and selected [128 → 64 → 32 → 16 → 8] hidden layers. For the Movies dataset, we used a batch size of 64 and set epochs to 4. We set the user and item embedding vector sizes to 32 and selected [64 → 32 → 16 → 8] hidden layers. • For MF, we tried to find the optimal embedding vector size, as shown in Fig. 3.

Overall Rating DNN Settings

We randomly initialized the DNN parameters like the previous DNN, using a normal distribution with a mean of 0 and a standard deviation of 0.05. We also used the Adam optimizer with a 0.001 learning rate and the same parameter values. For the TripAdvisor dataset we set the epochs to 50, and for the Movies dataset we set the epochs to 100, while for both datasets we set the batch size to 512. We used [64 → 32 → 16 → 8] hidden layers. Finally, the output layer has 1 neuron, for the overall rating.

Results
We used the fivefold cross-validation method to split the data randomly into a 20% test set and an 80% training set (including a 1% validation set). We repeated this process 5 times and calculated the mean value of each metric. We compared the performance of our model to the DMCCF model [19], then to a single DNN that predicts the overall rating directly, where the best results for this DNN were obtained with the settings shown in Table 5. In addition, we compared our model to a number of well-known single-rating recommendation methods: SVD [26], SVD++ [12], Baseline Estimates [27], and SlopeOne [28].
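The splitting protocol can be sketched as follows; the exact fold construction is an assumption, since the paper only states random 80/20 splits repeated five times:

```python
import random

def five_fold_splits(n_samples, seed=0):
    # Shuffle indices once, then rotate a 20% test slice across 5 folds;
    # the remaining 80% of each round forms the training set.
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::5] for i in range(5)]
    for k in range(5):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test

splits = list(five_fold_splits(100))
```

Each sample appears in the test set of exactly one fold, so averaging a metric over the five folds uses every rating once for evaluation.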
This comparison was done on the overall rating. The results are illustrated in Table 6. We can see that our model achieves the best performance on both datasets, significantly outperforming the DMCCF model, the single DNN, and the other state-of-the-art methods on all evaluation metrics. This indicates the high expressiveness of our model, which fuses the non-linear DNN and the linear MF to capture the user-item interactions. According to the results, our model excelled the other compared methods in MAE; its F1 and F2 are better than those of the other methods; it surpasses the other models in FCP; it exceeds the compared methods at MAP; and in MRR, our model is the best.

Conclusion and future work
In this paper, we proposed a multi-criteria collaborative filtering recommender that fuses DNN and MF. The model consists of two parts: in the first part, we take the user and item features and feed them as input to a fused model of a DNN and MF to predict the criteria ratings; in the second part, we use a DNN to predict the overall rating. The experimental results demonstrated the efficiency of the proposed model, which significantly outperformed the other methods on both datasets for all evaluation metrics. This shows that applying deep learning and multi-criteria ratings to collaborative filtering is a successful approach, and it can be enhanced using different deep learning techniques or by building more complex models.
In future work, we will study the use of different deep learning techniques, such as Autoencoders, Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs), in recommendation systems and attempt to further improve the performance of our model. We will also try other feature representation methods, in particular to address the cold-start problem by using user and item content features.