A hybrid recommender system based-on link prediction for movie baskets analysis

Vahidi Farashah, Mohammadsadegh; Etebarian, Akbar; Azmi, Reza; Ebrahimzadeh Dastjerdi, Reza

doi:10.1186/s40537-021-00422-0

Research
Open access
Published: 15 February 2021

Mohammadsadegh Vahidi Farashah¹, Akbar Etebarian¹, Reza Azmi² & Reza Ebrahimzadeh Dastjerdi¹

Journal of Big Data volume 8, Article number: 32 (2021)

Abstract

Over the past decade, recommender systems have been one of the most actively studied topics among researchers. Analyzing the baskets of online systems’ customers and recommending attractive products (movies) to them is very important: offering customers movies they like increases sales and ultimately improves the system. Various methods have been proposed to analyze customer baskets and recommend movies, but each has shortcomings, such as low accuracy and high recommendation error. In this paper, a link prediction-based method is used to address the shortcomings of these methods. The proposed method consists of four phases: (1) Running the content-based recommender system (CBRS), in which all users are clustered using the density-based spatial clustering of applications with noise (DBSCAN) algorithm and new users are classified using a deep neural network (DNN). (2) Running the collaborative recommender system (CRS) based on a hybrid similarity criterion, through which similarities between the new user and the users in the selected category are calculated based on a threshold (lambda). Similarity criteria are determined based on age, gender, and occupation. The CRS extracts the users who are most similar to the new user; the highest-rated movies are then suggested to the new user based on the adjacency matrix. (3) Running the improved FriendLink algorithm on the dataset to calculate the similarity between users who are connected through links. (4) Combining the output of the CRS with that of the improved FriendLink algorithm. The results show that the mean squared error (MSE) of the proposed model decreased by 8.59%, 8.67%, 8.45% and 8.15%, respectively, compared with baseline models such as Naive Bayes, multi-attribute decision tree and randomized algorithm. In addition, the mean absolute error (MAE) of the proposed method decreased by 4.5% compared with SVD and by approximately 4.4% compared with ApproSVD, and the root mean squared error (RMSE) decreased by 6.05% compared with SVD and by approximately 6.02% compared with ApproSVD.

Introduction

Because recommender systems are important in social networks, everyday life, e-commerce, shopping cart analysis and similar settings, a great deal of research has been devoted to them in recent years [1,2,3], and they have attracted the attention of many researchers during the past decade. Recommender systems are used to filter huge amounts of information, such as users’ carts [4]. They are used in a variety of settings, such as shops, libraries, restaurants, tourism systems and shopping carts, to offer attractive items such as movie services [5], and they play an important role in e-commerce [6]. Given the huge amount of information available, providing the most appealing services with high accuracy and in reasonable time is an important issue. A service recommender system enables users to review products by features such as the product’s name, manufacturer, production date and brand type. For new users about whom the system does not have enough information (the cold start problem), the recommender system offers a list of products rated by other users [7]. User cold start is one of the most important challenges of recommender systems [8]; it occurs when the user has no activity or transactions in the system. A variety of recommender systems have been proposed to deal with it. In general, recommender systems are divided into two categories:

Traditional recommender systems
  A. Content-based filtering [9].
  B. Collaborative filtering [10].
  C. Hybrid recommender systems [11].
Modern recommender systems
  A. Demographic-based approach [12].
  B. Knowledge-based approach [13].

The methods most studied by researchers are collaborative and content-based filtering systems. Content-based systems classify users based on their demographic information. Collaborative filtering is one of the most widely used recommendation techniques; it offers users the items that have been rated or selected by other, similar users [14]. For example, if two users have similar interests and behaviors, the services (films) purchased by one are recommended to the other [15]. In this approach, unlike in content-based systems, similar users are identified and highly rated items are offered to them. It can also be used to present a list of products to a group of users by means of data mining (clustering) techniques [16]. Finding neighboring users or similar activities with similarity criteria is one of the main requirements for making recommendations in collaborative systems. Similarity criteria make it possible to identify similar users or services based on their activity, category and demographic information. In this study, similarity criteria are used in the collaborative recommender system, over several steps, to determine how similar the new user is to other users and to offer the new user the items those users have rated.
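The collaborative filtering idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the cosine similarity over rating vectors, the neighborhood size `k`, the `min_rating` cut-off and the toy `ratings` data are all assumptions chosen for illustration.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dictionaries."""
    if not a or not b:
        return 0.0
    dot = sum(a[i] * b[i] for i in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def recommend(ratings, target, k=2, min_rating=4.0):
    """Suggest unseen items that the k most similar users rated highly."""
    neighbours = sorted(
        (u for u in ratings if u != target),
        key=lambda u: cosine_similarity(ratings[target], ratings[u]),
        reverse=True,
    )[:k]
    scores = {}
    for u in neighbours:
        for item, rating in ratings[u].items():
            if item not in ratings[target] and rating >= min_rating:
                scores.setdefault(item, []).append(rating)
    # Rank by mean neighbour rating; break ties by title for determinism.
    return sorted(scores, key=lambda i: (-sum(scores[i]) / len(scores[i]), i))

# Hypothetical toy data: user -> {movie: rating on a 1-5 scale}.
ratings = {
    "alice": {"Alien": 5, "Heat": 4},
    "bob": {"Alien": 5, "Heat": 4, "Brazil": 5},
    "carol": {"Alien": 1, "Heat": 2, "Amelie": 5},
    "dave": {"Alien": 4, "Heat": 5, "Casino": 4},
}

print(recommend(ratings, "alice"))
```

In this toy data, carol's tastes differ from alice's, so her favorite is not suggested, while the unseen films that the two nearest users rated highly are.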

The most important problems of online systems are losing customers and failing to offer them attractive products. Various methods have been proposed to address these problems, each with its own drawbacks. In this paper, we therefore present a hybrid method that addresses the shortcomings of the other methods. The proposed method combines a DNN and the DBSCAN clustering algorithm in the CBRS core, and hybrid similarity criteria and the new Pro-FriendLink algorithm in the CRS.
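To make the clustering step concrete, the following is a minimal, unoptimized sketch of the standard DBSCAN procedure in plain Python. The feature encoding (scaled age, gender code), the `eps` and `min_pts` values and the sample `users` list are assumptions for illustration only; the paper's actual features and parameters are not given in this section, and the DNN classification step is not shown.

```python
from math import dist

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbours)
    return labels

# Hypothetical users encoded as (age / 10, gender code).
users = [(2.1, 0), (2.3, 0), (2.2, 0), (5.0, 1), (5.1, 1), (5.2, 1), (9.0, 0)]
labels = dbscan(users, eps=0.5, min_pts=3)
print(labels)
```

Here the two dense groups of users each form a cluster and the isolated last user is marked as noise, which is the behavior that distinguishes DBSCAN from centroid-based methods such as k-means.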

In general, this study presents a hybrid system based on a CBRS and a CRS for analyzing users’ carts in an online movie system. In the CBRS, the DBSCAN clustering algorithm is used to determine basic categories of users from demographic information, and the DNN is used to classify new users. One of the main reasons for using DBSCAN for the initial clustering is its speed and its ability to handle large amounts of information compared with other clustering algorithms. Likewise, the main reason for using a DNN to classify new users is its ability to handle huge amounts of information, and its use of hidden layers, compared with other classification methods. The DNN assigns new users to the target group with high accuracy. The CRS uses a combination of similarity criteria and the improved FriendLink algorithm to determine the similarity between new users and other users. With the hybrid similarity criteria, the similarity between each user and the new user is calculated with respect to a threshold. The improved FriendLink algorithm is used to provide friend recommendations based on the connections between users in the online movie system. In this paper, therefore, a combination of four phases is used to analyze customer baskets. Analyzing customer baskets with a combination of a DNN and DBSCAN clustering is an innovation in itself. In addition, a hybrid similarity criterion and a new improved link prediction algorithm called Pro-FriendLink are used in the core of the CRS; these have not been used in any paper so far, and this is therefore one of the most important innovations of the proposed design.
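The thresholded hybrid similarity can be sketched as follows. This section does not give the paper's formulas, so everything here is an assumption: age is treated as a numeric feature compared by normalized distance, gender and occupation as string features compared by exact match, the three parts are averaged with equal weights, and `lam` plays the role of the threshold (lambda).

```python
def numeric_similarity(a, b, value_range):
    """1.0 for equal values, falling linearly to 0.0 at value_range apart."""
    return max(0.0, 1.0 - abs(a - b) / value_range)

def string_similarity(a, b):
    """Exact, case-insensitive match (an assumed stand-in for string features)."""
    return 1.0 if a.lower() == b.lower() else 0.0

def hybrid_similarity(u, v, age_range=60.0):
    """Equal-weight mean of the age, gender and occupation similarities."""
    parts = (
        numeric_similarity(u["age"], v["age"], age_range),
        string_similarity(u["gender"], v["gender"]),
        string_similarity(u["occupation"], v["occupation"]),
    )
    return sum(parts) / len(parts)

def similar_users(new_user, cluster, lam):
    """Names of cluster members whose similarity reaches lam, most similar first."""
    scored = [(name, hybrid_similarity(new_user, u)) for name, u in cluster.items()]
    kept = [(name, s) for name, s in scored if s >= lam]
    return [name for name, _ in sorted(kept, key=lambda t: -t[1])]

# Hypothetical new user and members of the category assigned to that user.
new_user = {"age": 30, "gender": "F", "occupation": "engineer"}
cluster = {
    "u1": {"age": 33, "gender": "F", "occupation": "engineer"},
    "u2": {"age": 30, "gender": "M", "occupation": "engineer"},
    "u3": {"age": 60, "gender": "M", "occupation": "artist"},
}

print(similar_users(new_user, cluster, lam=0.6))
```

Raising `lam` narrows the neighborhood: with `lam=0.7` only `u1` remains, so the threshold directly controls how many users contribute ratings to the new user's recommendations.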

The main contributions of this paper are:

The combination of a DNN and the DBSCAN clustering algorithm in the CBRS core.
The combination of hybrid similarity criteria and the new Pro-FriendLink algorithm in the CRS.
The proposed Pro-FriendLink algorithm as a new method for recommender systems (RSs).
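As background for the link prediction component, the sketch below shows a simplified path-counting similarity in the spirit of FriendLink: two users who are not yet connected score higher the more short paths join them, with longer paths attenuated by 1/(length - 1). It is not the Pro-FriendLink algorithm, whose details are not given in this section; the normalization term of the original FriendLink measure is omitted, and the function names and the toy `graph` are assumptions.

```python
def count_paths(graph, source, target, max_len):
    """Count simple paths from source to target by length (2..max_len edges)."""
    counts = {length: 0 for length in range(2, max_len + 1)}

    def walk(node, visited, length):
        if length == max_len:
            return
        for nxt in graph[node]:
            if nxt in visited:
                continue
            if nxt == target:
                if length + 1 >= 2:  # skip the direct edge (length 1)
                    counts[length + 1] += 1
                continue  # a simple path ends at the target
            walk(nxt, visited | {nxt}, length + 1)

    walk(source, {source}, 0)
    return counts

def path_similarity(graph, source, target, max_len=3):
    """Sum of path counts, each attenuated by 1 / (path length - 1)."""
    counts = count_paths(graph, source, target, max_len)
    return sum(n / (length - 1) for length, n in counts.items())

def recommend_friends(graph, user, max_len=3):
    """Rank users not yet linked to `user` by path similarity."""
    candidates = [v for v in graph if v != user and v not in graph[user]]
    return sorted(candidates, key=lambda v: (-path_similarity(graph, user, v, max_len), v))

# Hypothetical undirected friendship graph as adjacency sets.
graph = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan"},
    "cat": {"ann", "dan", "eve"},
    "dan": {"bob", "cat", "eve"},
    "eve": {"cat", "dan"},
}

print(recommend_friends(graph, "ann"))
```

In this graph, dan is reachable from ann by two paths of length 2 and one of length 3, so dan ranks above eve, who has only one path of length 2 and two of length 3.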

The remainder of this paper is organized as follows: the “Related works” section reviews the literature, “The proposed method” section describes the proposed approach and architecture, and the “Results” and “Discussion” sections present the results and discuss the conclusions.

Related works

Several researchers have tried to solve the cold start problem when making recommendations in movie systems. Kim et al. [17] addressed the cold start problem for both movies and users. They introduced a traditional collaborative filtering system in which two similarity matrices were used, one capturing the similarity between users and movies and the other the similarity among users; recommendations were then made to users through the prediction mechanism. A weakness of this study was its high memory usage with respect to users and movies, caused by constructing several similarity matrices [17]. Bobadilla et al. [18] used a neural network in a collaborative filtering RS to reduce cold start issues for new users. They evaluated it on the MovieLens and Netflix datasets and, because the data were non-numeric, used the Jaccard similarity index [18]. Byström [19] recommended movies to users by clustering movies with the k-means algorithm, based on users’ comments about the movies. He used the well-known MovieLens dataset and ran the implementation on a data collection of 10,109 movies rated by 2113 users [19]. Lika et al. [20] introduced a model in which classification algorithms such as Naïve Bayes, decision tree and random classification were used as similarity metrics in order to recommend movies to users, and evaluated it on the MovieLens dataset [20]. To enhance system performance and solve the cold start problem, Pereira et al. [21] proposed a hybrid method that uses both collaborative filtering and demographic information. They used a hybrid co-clustering algorithm and machine learning to address the cold start problem and evaluated the method on the MovieLens, Jester and Netflix datasets [21].

Sperlì et al. [22] provided a recommender system that improves on the social networking approach. Their RS, designed for big data applications, provides useful recommendations in online social networks. The proposed technique is a collaborative, user-centric approach that exploits the interactions among users and the multimedia content they create on one or more social networks in a new and effective way. Experiments on data collected from several online social networks showed the feasibility of the approach for the problem of social media recommendation. Kutty et al. [23] reviewed recommender systems for large social networks, covering challenges and solutions. They state that social networks are crucial for networking, communication and content sharing. Social networking applications generate a great deal of information every day, and social networks are the subject of extensive research because of the heterogeneity of their data and structures, their size and their dynamics. When such a large amount of data is used by recommender systems, the resulting connections can help to solve social business issues and to improve friend recommendations. The paper is a review that compares several trends.

Lin et al. [24] developed a neural network-based recommender system for recommending movies to users. Challenges such as scalability, dispersion and user confidence, which have received less attention to date than the cold start problem, were also resolved through preprocessing, clustering and classification. Walek et al. [25] proposed a hybrid recommender system predictor for recommending suitable movies. The system contains a recommender module that combines a collaborative filtering system, a content-based system and a fuzzy expert system.
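The Jaccard similarity index mentioned in connection with Bobadilla et al. [18] compares two sets by the ratio of their intersection to their union, which is why it suits non-numeric data. A minimal sketch follows; the sets of movie titles are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B|; defined here as 0.0 for two empty sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical sets of movies watched by two users.
user_a = {"Alien", "Heat", "Brazil"}
user_b = {"Alien", "Heat", "Casino"}

print(jaccard(user_a, user_b))
```

The two users share two of the four distinct titles, so the index is 0.5; identical sets give 1.0 and disjoint sets give 0.0.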

Table 1 summarizes previous approaches to movie recommendation and the cold start challenge. This table outlines the advantages and disadvantages of each method.

Table 1 Summary of previous approaches to movie recommendation and the challenge of cold start users
