
Sentiment-based predictive models for online purchases in the era of marketing 5.0: a systematic review

Abstract

The convergence of artificial intelligence (AI), big data (BD), and the Internet of Things (IoT) in Society 5.0 has given rise to Marketing 5.0, revolutionizing personalized customer experiences. In this study, a systematic literature review was conducted to examine the integration of predictive modelling and sentiment analysis within the Marketing 5.0 domain. Unlike previous research, this study addresses both aspects within a single context, emphasizing the need for a sentiment-based predictive approach to the buyers’ journey. This review explores how predictive and sentiment models enhance customer experience, inform business decisions, and optimize marketing processes. This study contributes to the literature by identifying areas of improvement in predictive modelling and emphasizing the role of a sentiment-based approach in Marketing 5.0. The sentiment-based model assists businesses in understanding customer preferences, offering personalized products, and enabling customers to receive relevant advertisements during their purchase journey. The paper’s structure covers the evolution of traditional marketing to digital marketing, AI’s role in digital marketing, predictive modelling in marketing, and the significance of analyzing customer sentiments in their reviews. The PRISMA-P methodology, research questions, and suggestions for future work and limitations provide a comprehensive overview of the scope and contributions of this review.

Introduction

The widespread use of artificial intelligence (AI), big data, and the Internet of Things (IoT), among other technologies emerging from Industry 4.0, has given rise to Society 5.0, a societal evolution in which lines between the virtual and physical space are often blurred as technology is increasingly being used to resolve economic and social problems [150]. The changing needs of Society 5.0, in terms of product purchase, gave rise to Marketing 5.0, which has revolutionized the way that products and services are advertised, offering personalized customer experiences [19]. Society 5.0 consumers are empowered by digital technologies that enable them to access a great deal of information about products and services through online reviews. This makes customers more knowledgeable and leads them to expect more from sellers than they did under traditional advertising approaches. According to Zhang et al. ([175], 2), the cost of acquiring new users is 5 to 10 times higher than that of retaining existing users, and a 5% increase in customer retention can increase profits by 25% to as much as 95%.

To retain their customers, businesses are therefore increasingly required to embrace Marketing 5.0 principles by collecting and analyzing customer data prior to sending personalized advertisements based on preferences or purchasing history [91]. Driven by AI and a data-centric approach, Marketing 5.0 practices involve the analysis of previous buying patterns (predictive analytics) and customer feelings (sentiment) at the time of purchase to monitor customer purchasing intentions for more targeted and successful promotion of products and services [156]. In this context, Marketing 5.0 is defined as a tech-enabled and customer-centric approach, whereby advanced technologies are used to gather insights and create personalized experiences for customers [91].

Predictive modelling (PM) makes use of different algorithms (such as long short-term memory (LSTM), Bidirectional Encoder Representations from Transformers (BERT), Support Vector Machine, or Naïve Bayes) to determine patterns within a dataset and forecast probabilities of events taking place in the future [61]. As claimed by Taherkhani et al. [147], a prediction model is more sophisticated than current sales approaches, as it offers a better visualization of best-selling products on a dashboard. This results in better positioning within the competitive market by providing customers with products of choice that also meet their needs. Marketing 5.0 has been further enhanced with the integration of sentiment analysis (SA), which is used to determine feelings expressed through textual content or facial expressions [124]. Wang et al. [159] defined sentiment analysis as the identification and categorization of sentiments expressed in a text source, such as comments, product reviews, or news feeds on social media platforms such as Facebook, LinkedIn, and Instagram.

Regarding literature reviews within the digital marketing field, Al-Sai et al. [13] conducted a systematic review of the structure of big data and its uses in different types of data analysis (e.g., sentiment and predictive). Wang et al. [159] analyzed how natural language processing (NLP) can be used to improve the accuracy of sentiment-based models in different contexts. Moher et al. [105] analyzed the various steps involved in the Preferred Reporting Items for Systematic Reviews and Meta-Analysis Protocols (PRISMA-P) in a systematic review. Busalim and Hussin [28] reviewed different deep learning approaches that can be used to improve social commerce today. Guha, Dutta and Paul [53] analyzed different recommender systems available to guide online buyers throughout their purchase journey. For customer-oriented data, textual content, such as customer reviews, is examined mostly to learn about customers’ beliefs, attitudes, and sentiments [174, 176, 178]. These data, including sentiment and historical information, can be used for the analysis of product feedback, enabling the business to better manage its reputation by gathering online feedback through regular web extractions [121]. Shah et al. [134] claimed that reviews help businesses constantly improve their products/services, thereby strengthening customer loyalty. Moreover, PM can be used for better decision-making processes by forecasting the quantity of different categories of products that a business should have on hand [174, 176, 178]. In addition, customer data can be used for targeted marketing to send specific advertisements to a smaller but more specific group of potential customers who might be more interested in a specific product [101]. In their market analysis study, Mehmood et al. [99] claimed that over 54% of small businesses in the United States (US) used social media to acquire new customers, while 80% of the respondents claimed that they experienced an increase in their website traffic/visits when advertisements were posted on social media. Therefore, to understand the different applications of predictive and sentiment analysis within the field of marketing, this study conducted a systematic literature review to analyze how previous studies have used predictive and sentiment models to improve customer experience, business decisions, and any other marketing process that constitutes the buyers’ journey.

Contribution of study

This paper highlights areas of improvement in the predictive modelling field and provides a better understanding of the integration of predictive and sentiment analysis within the context of Marketing 5.0. Furthermore, the sentiment-based model will shed greater light on customer preferences, thereby giving businesses insights on customers’ wants and needs by means of customer segmentation [103] and appropriate targeted marketing campaigns [129]. Moreover, such models can help businesses to strengthen their competitiveness by analyzing their competitors’ reviews to identify weaknesses that can be addressed for better business performance [59]. Also, customers could receive advertisements that may facilitate their decision making and purchasing journey [170].

Paper structure

This paper is structured as follows: Sect. "Digital marketing" explains the evolution of marketing from traditional to digital. Sect. "Artificial intelligence (AI) for improved digital marketing" explains the application of AI to enhance digital marketing, while Sect. "Data preprocessing" covers the preparation of data before modelling. Sect. "Predictive modelling" covers the application of predictive modelling within the marketing field, and Sect. "Sentiment analysis (SA)" explains the use of customer reviews as a means of understanding customers’ feelings (sentiments) during online purchases. The PRISMA-P methodology is explained in Sect. "Research methodology" and the various steps involved are covered in the sub-sections. The research questions are discussed in Sect. "Results and findings". Sect. "Conclusion" concludes the paper with an acknowledgement of this study’s limitations and suggestions for future research directions.

Digital marketing

This section explains the digitalization of marketing via the Internet and its impact on buyers’ purchase journeys by encouraging more online sales. The sub-sections explain the contribution of Industry 4.0 technologies (predictive and sentiment analysis) to modernizing marketing processes and to better understanding customer perception during a purchase.

Since its creation in 1983, the Internet has been extensively used worldwide. From 16 million in 1995, the number of Internet users reached 4536 million in 2021 ([45], 3). This was further boosted by the COVID-19 pandemic, during which online platforms flourished because they were an effective way to reach customers during lockdown periods [108]. Many businesses had to shift to online sales using websites or social media to promote and sell their products, leading to estimated worldwide retail e-commerce sales of more than 5.7 trillion U.S. dollars by the end of 2023 (Statista, “E-Commerce worldwide - statistics and facts”). As customers’ day-to-day interconnectivity increases, it is vital for businesses to understand online customer behavior in an increasingly competitive market [55]. Despite numerous platforms being available for online buying, Kakalejcik, Bucko, and Vejacka claim that “96% of website visitors do not purchase from an online platform during their first visit” (2019, 47–58) because of a lack of online guidance. The buyer’s journey, which comprises three main stages, is illustrated in Fig. 1.

Fig. 1

Buyer’s journey (adopted from [149])

In the awareness stage, customers decide to search for products or services online based on their needs or wants. This is followed by the decision-making phase (consideration phase), in which customers look for product and/or seller reviews prior to deciding whether to proceed with their choice or to find another solution. Finally, in the purchasing phase, customers confirm their final choice and finalize their purchases [75]. Not all businesses can track and guide customers at each stage of a buyer’s journey to ensure an enjoyable purchase experience and a high level of customer satisfaction [47]. For instance, businesses often lose customers throughout the journey because of insufficient online customer assistance and interaction channels at the decision-making stage, where customers need to be convinced before proceeding to the purchase of the product [71]. Therefore, the availability of automated responses through online services using artificial intelligence (AI) improves digital marketing [34]. For example, AI chatbots can be used to assist customers and provide them with positive reviews posted by previous customers. Furthermore, Rivas and Zhao [127] used ChatGPT-powered models to create advertisement content and AI-generated posts, claiming that this helped save time so that the business could concentrate on other tasks.

Digital marketing is less time-consuming, as it uses online tools to advertise and persuade online buyers to purchase a specific product or service through business websites and social platforms such as Facebook, Instagram, and TikTok [74]. Mika and Winczewski [102] suggested that redirecting useful content to consumers based on their searches, social media views, or browser activity, and displaying related ads on their screens enables faster and more effective communication with customers through online channels. Furthermore, using AI techniques to extract and analyze social data can provide businesses with better insights.

Artificial intelligence (AI) for improved digital marketing

The emergence of AI in the era of Industry 4.0, and its application to Marketing 5.0, drove the transition to digital marketing. AI is a game changer for digital marketers who integrate cutting-edge technologies into their marketing plans to boost product visibility online [16]. AI-driven tools provide marketers with knowledge obtained from customer data, such as purchasing history and product reviews, while providing customers with purchasing flexibility. However, this convenience can pose difficulties for buyers who do not have the opportunity to physically examine products before purchasing [68]. Therefore, buyers look for online product reviews and read comments from previous customers before deciding on a purchase. Thus, online customer reviews comprise critical data that are also actively sought and closely monitored by businesses [158].

These data are collated and analyzed to extract key information, such as best-selling products and common purchasing patterns, thereby assisting businesses in stock management [22]. As a result, buyers’ experiences are improved by providing them with the products of their choice based on historical data [141]. With evolving consumer behavior, enterprises are progressively embracing a ‘pull marketing strategy’. This approach encourages customers to proactively seek online experiences through channels such as social media and influencer marketing [174, 176, 178].

For instance, businesses use customer reviews to determine buyers’ emotions during online purchases. Sentiment polarities are represented by positive, negative, or neutral feelings towards a product or service, helping to better understand customer satisfaction [67]. AI also provides voice assistants to simplify the purchasing process [96]. However, customer reviews may also include sarcastic or hateful comments posted by users, which can negatively influence buyers [130]. Therefore, it is important to clean buyers’ data before monitoring customers’ purchasing intentions and forecasting their tentative next purchase using machine learning (ML) algorithms through predictive modelling [42]. The section that follows explains the preprocessing stage before discussing the AI techniques used to improve digital marketing, namely predictive modelling and sentiment analysis.

Data preprocessing

Before conducting PM on the dataset, a data-cleaning process is required to prepare and transform the data. For instance, outliers, missing rows of data, and inconsistencies must be identified and handled to avoid affecting the model [24]. Missing values can be replaced with estimates through the imputation process, whereby values can be derived or calculated from existing variables. For instance, Hassler et al. [58] used weight and height values and the BMI formula to replace missing BMI values in their dataset before analysis. Reducing the number of missing values or deriving new variables improves the model’s performance [148].
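The imputation idea above can be illustrated with a minimal sketch. It is not the code used by Hassler et al. [58]; the field names (`weight_kg`, `height_m`, `bmi`), the units, and the sample records are assumptions made for illustration. The sketch fills a missing BMI from the standard formula BMI = weight / height², and leaves the value missing when the inputs are also unavailable.

```python
def impute_bmi(records):
    """Fill missing BMI values from weight (kg) and height (m).

    Field names and units are hypothetical; BMI = weight / height**2.
    """
    completed = []
    for row in records:
        row = dict(row)  # copy so the input records are not mutated
        if row.get("bmi") is None and row.get("weight_kg") and row.get("height_m"):
            row["bmi"] = round(row["weight_kg"] / row["height_m"] ** 2, 1)
        completed.append(row)
    return completed


# Hypothetical sample records
patients = [
    {"weight_kg": 70.0, "height_m": 1.75, "bmi": 22.9},   # already complete
    {"weight_kg": 80.0, "height_m": 2.0, "bmi": None},    # BMI can be derived
    {"weight_kg": None, "height_m": 1.60, "bmi": None},   # cannot be derived
]
filled = impute_bmi(patients)
```

Rows that cannot be derived stay missing, so they can still be handled by another strategy (for example, removal) rather than being silently filled with a guess.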

Furthermore, normalization can also be used to ensure all the feature variables (columns of data) have a specific scale between 0 and 1 (min–max scaling), thus ensuring that all features contribute to the training process of a model. This process can be represented in general using the following formula:

$$X_{\text{normalized}}=\frac{X-\min(X)}{\max(X)-\min(X)}$$

X is the original feature vector, X_normalized is the final normalized value of the original vector, min (X) is the minimum value, and max (X) is the maximum value of the feature vector [142].
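As a small illustration of min–max scaling, the sketch below applies the formula to one feature column. The function name, the sample values, and the choice to map a constant column to zeros (to avoid division by zero) are assumptions, not part of the cited work.

```python
def min_max_normalize(values):
    """Rescale a feature column to the range [0, 1] using min-max scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical product prices
prices = [10.0, 20.0, 40.0]
scaled = min_max_normalize(prices)
```

The smallest value maps to 0 and the largest to 1, with all other values placed proportionally in between.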

In some situations, data must be transformed or converted to other data types that the model understands, or into more specific variables. For instance, “time since last purchase” can be obtained if the date of a customer’s purchase and a reference date are available. Furthermore, if the data are skewed, logarithmic (log (x)) and square root (sqrt (x)) transformations can be used because they help compress larger values and expand smaller ones [44]. Once the data preparation phase is complete, specific variables (columns of data) must be selected through the feature-engineering process. Variables can be tested individually or against each other and visualized through Python libraries such as matplotlib and pandas [85]. The data preprocessing and feature engineering phases ensure good data quality [58]. The next two sections will discuss two Marketing 5.0 AI strategies: predictive modelling and sentiment analysis.
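The two transformations described above can be sketched with the Python standard library. The function names, dates, and spend values are hypothetical. Note that log (x) is only defined for positive values, so the sketch assumes strictly positive inputs.

```python
import math
from datetime import date


def days_since_last_purchase(last_purchase, reference):
    """Derive a 'time since last purchase' feature, in days, from two dates."""
    return (reference - last_purchase).days


def log_transform(values):
    """Natural-log transform; assumes every value is strictly positive."""
    return [math.log(v) for v in values]


def sqrt_transform(values):
    """Square-root transform; assumes every value is non-negative."""
    return [math.sqrt(v) for v in values]


# Hypothetical purchase date and reference date
recency = days_since_last_purchase(date(2024, 1, 1), date(2024, 1, 31))

# Hypothetical, heavily skewed spend values
spend = [1.0, 100.0, 10000.0]
log_spend = log_transform(spend)
sqrt_spend = sqrt_transform(spend)
```

In this example the largest raw value is 10,000 times the smallest, whereas after the square-root transformation it is only 100 times larger, which shows how the large values are compressed.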

Predictive modelling

The process of creating mathematical or statistical models that indicate future outcomes based on past and present data is known as predictive modelling [72]. This area of data science involves pattern recognition, utilizing data to identify connections and predict future trends [57], assisting companies in taking appropriate actions. Several strategies have been adopted to influence customers’ next purchases. These include cross-selling (CS), which is used to increase the sales volume per customer while maintaining a good customer relationship. For instance, if customers purchase cereals in bulk, the business will advertise products that go well with cereals, such as milk or coffee, to encourage their purchase and reduce the tangible and intangible costs of a customer switching to a different seller [73]. Another common method used to predict customer purchases is the analysis of customer data, such as demographics and purchasing history. The demographic variables age, education, and marital status have an impact on customers’ choices of products and can be used to forecast future purchases [94]. Businesses have also started identifying loyal customers by tracking click-stream data, including clicks and impressions from different platforms [54]. Redirecting useful materials to consumers based on their searches, social media views, or browser activity and displaying ads on their screens enables faster and more effective communication with customers through online channels [111]. The next section explains the different stages involved when conducting predictive modelling.

Stages of predictive modelling

Predictive modelling (PM) consists of a three-staged process: data acquisition, model selection and model testing.

  • Data acquisition: Firstly, an appropriate experimental design is chosen to generate the experimental data. This stage ensures that the correct data are acquired for the study so as to control bias and eliminate inaccurate data [131]. Furthermore, this step outlines the data collection strategies and establishes whether data for analysis will be collected from online social platforms or Kaggle.

  • Model selection: A model is selected to represent the experimental data by comparing different models and determining which one is best suited to the dataset to be tested. This is important because different models have different strengths and weaknesses, leading to inconsistencies if the chosen model is not fully justified [3]. For instance, choosing machine learning models compared to deep learning models could require less training time because of the less complex structures [26], but the accuracy might not be as expected.

  • Model testing: The model is tested on a dataset [72]. After the model has been chosen, the data are fed to it to evaluate its performance using different metrics such as accuracy, precision, and F1 score. This helps to identify potential problems of overfitting and underfitting so that the model can be further optimized [135].
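The evaluation metrics named in the model-testing stage can be computed as in the following minimal sketch for a binary classifier. The labels are hypothetical (1 = purchase, 0 = no purchase), and the function is written from the standard definitions of the metrics rather than taken from any of the reviewed studies.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 score for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical held-out labels and model predictions
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(actual, predicted)
```

Computing the same metrics on both the training data and held-out data is one common way to spot overfitting: a model that scores much higher on the training data than on unseen data has likely memorized rather than generalized.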

The chosen model can be represented by f_{w,b}(x) = wx + b, where w and b represent the parameters used in the prediction [51]. The cost function (mean squared error, MSE), which measures how far the prediction is from the actual target, is defined as follows:

$$J\left(w, b\right)=\frac{1}{2n}{\sum }_{i=1}^{n}{\left({f}_{w,b} \left({x}^{\left(i\right)}\right)- {y}^{(i)}\right)}^{2}$$

Here, w represents the coefficient of the model, b is the bias term, f_{w,b}(x^{(i)}) is the predicted value for input x^{(i)}, y^{(i)} is the actual value corresponding to input x^{(i)}, and n is the number of data points in the dataset [51, 52]. The value of J(w,b) must be minimized to ensure an accurate analysis.
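To make the cost function concrete, the sketch below evaluates J(w, b) for the linear model f(x) = wx + b and reduces it with batch gradient descent. Gradient descent is an assumption here (the text only states that J must be decreased), and the data, learning rate, and iteration count are illustrative values.

```python
def predict(x, w, b):
    """Linear model f_{w,b}(x) = w * x + b."""
    return w * x + b


def cost(xs, ys, w, b):
    """J(w, b) = (1 / 2n) * sum of squared prediction errors."""
    n = len(xs)
    return sum((predict(x, w, b) - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)


def gradient_step(xs, ys, w, b, lr):
    """One batch gradient-descent update of w and b."""
    n = len(xs)
    errors = [predict(x, w, b) - y for x, y in zip(xs, ys)]
    dw = sum(e * x for e, x in zip(errors, xs)) / n  # dJ/dw
    db = sum(errors) / n                             # dJ/db
    return w - lr * dw, b - lr * db


# Hypothetical data generated from y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b = 0.0, 0.0
initial_cost = cost(xs, ys, w, b)
for _ in range(5000):
    w, b = gradient_step(xs, ys, w, b, lr=0.05)
final_cost = cost(xs, ys, w, b)
```

With w = b = 0 the cost is 20.5; after training, w and b approach 2 and 1 and the cost approaches zero, which is the behaviour the minimization of J(w, b) is meant to achieve.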

Various methods can be used to conduct predictive modelling. Nonetheless, the data preprocessing described above, combined with the three-stage predictive modelling process, proved to be more systematic when employing pretrained models that required minimal training time [65, 72].

For instance, several researchers [39, 72, 85, 93, 112, 165] have investigated predictive modelling algorithms and techniques. Wen et al. [164] used customer feedback and clickstream datasets from a shopping platform in China to develop a prediction model for customers’ purchasing intentions. They recorded a good predictive F1-score of 0.9031 with Random Forest for an optimal time window of 2 days. Their study helps businesses understand that purchase intention is influenced by fashion products and public reputation established through social media. Khanna and Maheshwari [80] developed a predictive model based on regression and statistical methods to forecast weld bead dimensions using welding data. Furthermore, their mathematical models were able to make a prediction based on different variables by undergoing some data transformation through mathematically derived formulas. This process had a 2% error margin because of the coefficients used during the conformity test. This could have been improved by the use of deep learning algorithms ([80], 4481–4483). Rahmani et al. [122] utilized Random Forest and AdaBoost modelling techniques to predict steer prices. Their test data had a confidence interval of 95% ([122], 15–16), indicating the effectiveness of the multivariate approach and the effective use of probabilistic modelling for price variability. This has helped businesses minimize the wastage of resources, thus making them more sustainable. However, to improve the reliability of the results, other external factors that affect pricing could have been added to the training dataset to determine new coefficients and relationships between the different variables. Thangeda et al. [152] collected data from Andhra Pradesh Telecom users and used a nonlinear adaptive approach to predict customer churn in the telecom industry. Different trigonometric and linear combinations were used to train the model on the dataset, and they exhibited promising convergence performance. However, the dataset comprised information that was too generic and lacked a structured feature-engineering approach, thus leading to biased results. Therefore, sentiment data (polarity, sentiment scores, feelings, word aspects) could be integrated with purchasing data to provide better insights through a sentiment-based predictive model. The next section explains sentiment analysis and its applicability in the current Marketing 5.0 era.

Sentiment analysis (SA)

Sentiment analysis (SA) is a discipline that studies consumer responses towards goods and services to assess how customers’ feelings are reflected in their purchasing attitudes and product evaluations [20]. Singh and Singh [139] described SA as a study of feelings based on textual information published as online evaluations on various social media platforms. Chan et al. [32] claimed that these evaluations could be used to analyze people’s attitudes, sentiments, and ideas regarding a certain event to understand unstructured data (raw data collected from social media). This process is referred to as the traditional method of conducting sentiment analysis through polarity extraction and contextualization of sentence structure [87]. SA helps to determine whether textual content is positive, negative, or neutral by analyzing feelings and opinions using deep learning (DL) or natural language processing (NLP) techniques [137]. DL is a subset of machine learning methods that can be used to identify patterns or perform complex tasks using data [173]. On the other hand, NLP refers to the ability of computer programs to understand text or spoken language in a similar way to humans through the constant training of AI models [85].

The basic SA model consists of feature extraction (transforming raw data into statistics), a training set (dataset that will be used to train the model), and a classifier (model to be used to analyze data), which indicates the polarity of the textual content [137]. Knowledge-based sentiment analytics (KBSA), an improved version of SA, is used to extract features such as emotions, linguistics, sarcasm, and lexicons from customer feedback. To manage the different feelings indicated in review comments and forecast consumer opinions of items, multi-sentiment analysis, which uses multiple models, is applied to customer review data collected from different sources such as blogs, Facebook, Twitter, or e-commerce websites. KBSA also includes tokenization, where long sentences are broken down into smaller phrases, to which scores are allocated to determine their polarity [97]. Breaking down sentences into single words provides a sophisticated view of the overall sentiment at a granular level to facilitate the evaluation of emotional tone [60]. For example, in emotion analysis a sentence such as “The movie was not only captivating but also brilliantly directed” is tokenized by breaking it up into individual words: “The”, “movie”, “was”, “not”, “only”, “captivating”, “but”, “also”, “brilliantly”, and “directed”. The emotion associated with each token is then evaluated, enabling the analysis to identify both positive and negative feelings found in the text.
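The tokenization and scoring steps described above can be sketched as follows. The lexicon and its scores are hypothetical, and a simple lexicon lookup is only one possible way of allocating scores to tokens; it is not the method of any specific study cited here.

```python
import re

# Hypothetical sentiment lexicon: word -> score (positive > 0, negative < 0)
LEXICON = {
    "captivating": 1.0,
    "brilliantly": 1.0,
    "boring": -1.0,
    "poorly": -1.0,
}


def tokenize(sentence):
    """Split a sentence into word tokens (letters and apostrophes only)."""
    return re.findall(r"[A-Za-z']+", sentence)


def score_tokens(tokens, lexicon=LEXICON):
    """Pair each token with its lexicon score; unknown words score 0.0."""
    return [(tok, lexicon.get(tok.lower(), 0.0)) for tok in tokens]


def polarity(tokens, lexicon=LEXICON):
    """Classify the overall polarity from the sum of the token scores."""
    total = sum(score for _, score in score_tokens(tokens, lexicon))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


sentence = "The movie was not only captivating but also brilliantly directed"
tokens = tokenize(sentence)
label = polarity(tokens)
```

In this sketch the word “not” carries no score, so the sentence is classified as positive. A naive rule that flips polarity whenever “not” appears would misread “not only ... but also”, which is one reason word-level scoring is usually combined with context-aware models.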

Contrary to the conventional document and sentence sentiment analysis, ABSA (Aspect-Based Sentiment Analysis) looks at the viewpoint directly, which is the root of the sentence, thus making it more relevant to the context. ABSA can also include topic modelling (TM) abilities used to explore and find patterns automatically from sentences. TM conducts clustering (grouping of words), finds patterns, and determines the probabilistic distribution of topics. Figure 2 summarizes the types of sentiment analysis, from traditional to knowledge-based and aspect-based sentiment analysis.

Fig. 2

Types of Sentiment Analysis (adopted by the authors from [7, 120, 124])

Therefore, it is evident that sentiment analysis has shifted from a traditional approach to a more aspect-based approach in order to provide information beyond sentiment polarity and sentiment scores [115, 168]. For instance, customer trust and loyalty can be determined through aspect-based sentiment analysis, and these factors could help a business understand whether a customer has been successfully retained. Furthermore, numerous studies have focussed on ways to improve digital marketing strategies to increase sales, taking into consideration various purchasing determinants including competitiveness, pricing strategies, discounts and product ratings [8, 40, 53, 154]. However, they have not considered merging these purchasing factors with sentiment data to obtain a better forecast of customers’ future purchasing behaviors. Significant attention has been directed towards the utilization of artificial intelligence, deep learning and natural language processing techniques for the extraction and analysis of sentiment expressed in customer reviews [109, 117]. However, some studies [43, 89, 163] have constructed predictive models based solely on sentiment polarities, overlooking crucial factors obtained from reviews such as customer trust, loyalty and customer retention. Conversely, others [79, 81] have primarily focussed on trust, neglecting the exploration of other purchasing factors that could be derived from customer reviews. In the light of these findings, this study proposes a sentiment-based predictive model that merges sentiment factors (polarity, sentiment score, trust, loyalty and retention) with purchasing history to forecast the subsequent purchase intentions of customers. The next section explains the research methodology that has been used for the study.
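The merging step proposed above can be pictured with a short sketch that joins per-customer sentiment factors with purchasing history into one feature row per customer. This is only an illustration of the data-merging idea, not the model proposed in this study; the customer identifiers, field names, and values are all hypothetical.

```python
def build_feature_rows(purchases, sentiments):
    """Join purchase history with sentiment factors, one row per customer.

    purchases:  customer_id -> list of past order amounts (hypothetical)
    sentiments: customer_id -> dict of sentiment factors (hypothetical)
    Customers without review data are skipped.
    """
    rows = []
    for customer_id, history in purchases.items():
        factors = sentiments.get(customer_id)
        if factors is None:
            continue  # no sentiment data available for this customer
        row = {
            "customer_id": customer_id,
            "n_purchases": len(history),
            "total_spend": sum(history),
        }
        row.update(factors)  # polarity, sentiment score, trust, loyalty, ...
        rows.append(row)
    return rows


# Hypothetical inputs
purchases = {"c1": [20.0, 35.0], "c2": [15.0], "c3": [50.0]}
sentiments = {
    "c1": {"polarity": "positive", "sentiment_score": 0.8, "trust": 0.7},
    "c2": {"polarity": "negative", "sentiment_score": -0.4, "trust": 0.2},
}
rows = build_feature_rows(purchases, sentiments)
```

Each resulting row could then be passed to a predictive model as input features, so that the forecast draws on both what a customer bought and how the customer felt about it.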

Research methodology

A systematic literature review (SLR) approach was adopted to address the research questions of this study. SLRs provide clear identification, analysis, and display of data collated from previous research conducted in the chosen area of study [116]. This strategic evidence is useful to researchers in different fields, such as artificial intelligence (AI). The use of SLR sometimes lacks flexibility if the guidelines are not followed properly, because it requires more in-depth research and analysis of sentiment-based predictive modelling [76]. However, SLR is appropriate for this study, as it provides a more comprehensive and evidence-based structure because of the various previous studies that are well tabulated with proper identification of trends, patterns, and inconsistencies [25]. Furthermore, the factors that were considered during their proposed framework/model were also analyzed, enabling a better identification of research gaps.

Review protocol: PRISMA-P

There are many review protocols available that can be used to conduct a systematic literature review. Two of them are meta-analysis reporting standards (MARS) and preferred reporting items for systematic reviews and meta-analysis protocols (PRISMA-P). MARS could have been used for this study. However, it is more suitable for studies involving statistical methods and quantitative results in their reviews of research topics [76, 128]. On the other hand, PRISMA-P is a technique used to review academic journals and articles prior to formulating the research questions. With this technique, the inclusion and exclusion criteria are stipulated, and the data extraction and search approach is used based on the findings of various authors [49]. It helps minimize research bias by carefully considering the research background, research questions, search strategy, selection of studies, quality assessment, and data extraction and synthesis [28]. Therefore, PRISMA-P has been chosen for this study, comprising identification, screening, eligibility, and inclusion phases.

In the identification phase, authors search for journals/articles in specific databases, such as ProQuest/Scopus, based on keywords or paper titles. Screening refers to a review of the papers gathered in the identification phase. Predefined quality assessment (QA) questions are used to filter journal papers and reduce the number to those that are more relevant to the study according to their titles and abstracts. In the eligibility phase, papers are further divided into different categories based on the database filter option to select papers from the fields of consumer behavior, purchasing factors, sentiment analysis, and predictive modelling. For the inclusion phase, a quality assessment plan is used to give scores to the selected papers that are used to answer the research questions.

Following the steps in the PRISMA diagram (see Fig. 3), 150 journal papers were considered in this study, comprising 30 journal papers for the analysis section and 120 journal papers for all other sections. The 30 selected papers were suitable for this study because they covered sentiment and predictive modelling in a marketing context. Furthermore, they were specific to buyers’ journeys, which helped to further improve the findings of this study with relevant comparison data for the different purchasing factors and behaviors.

Fig. 3 PRISMA-P for SLR (compiled by authors)

All papers used to answer the research questions were published between 2021 and 2024, thus covering the latest technological developments of the various models available. Additionally, many of these 30 papers provided sufficient information to address the research questions of this study. To filter the different studies, the following formula was used, and the resulting number of papers at each phase is shown in Fig. 3:

X = papers in green boxes, indicating the initial number of papers considered.

Y = papers in red boxes, indicating those that were eliminated.

Z = papers still in the green box to be considered in the next phase.

Z = X – Y

The next section explains how these journal papers were sourced (Fig. 3).

Search strategy

In this phase of the systematic review, the ProQuest database was used to search for relevant papers. These were not limited to specific journals, such as the International Journal of Data Science and Analytics, Journal of Machine Learning Research, Journal of Sentiment Analysis and Opinion Mining. Instead, journals were explored based on the research questions and the main keywords such as “sentiment” and “predictive modelling” and search phrases such as “customer purchase prediction model” and “use of sentiment analysis for buyers’ reviews”.

However, identifying publications based on keywords and search phrases alone is not very efficient, as it depends only on word similarity using weights of terms and on shared references (the existence of the same citations across multiple papers) [28]. Citation searching is a good approach for enhancing the search for papers. Therefore, backward citation searching, which involves looking at the references cited by a particular work, was also used [48]. All publications were then stored in EndNote to allow direct citations from search databases through direct export [30], facilitate the removal of duplicates, and organize papers systematically under clearly labelled folders.

Inclusion and exclusion criteria for study selection (filtering process)

The inclusion and exclusion criteria were established to ensure that the selected studies were current and relevant. Abstracts were analyzed using EndNote, and a new subgroup (SG1) was created in the reference manager to retain only those papers that were relevant to the research questions [34]. All journal papers that addressed customer prediction models and used customer data to predict purchases or analyzed trends to forecast purchasing intentions were saved. Regarding sentiment analysis, studies providing the steps used to analyze customers’ textual reviews using different algorithms were considered. Journal papers exploring purchasing history data were also included in SG1 and systematically analyzed. Papers published before 2021 were excluded from this review because the field of predictive analytics has evolved rapidly, and recency is considered essential for an SLR.

Quality assessment

This study aimed to review journals covering sentiment and predictive analytics within the field of marketing, specifically buyers’ journeys. The limitations of the current models were determined and presented in a tabular format, along with previous studies that have addressed various gaps. Furthermore, customer purchasing behaviors obtained from different studies were analyzed to understand the areas that require more focus. A quality assessment (QA) stage was therefore required to ensure that only relevant, good-quality papers were selected for this study. As advised by Jadhav, Gaikawad and Bapat [66], three levels of quality schema, categorized as high, medium, and low, were used.

This study focused on addressing the following research questions:

  1. What are the predictive and sentiment analysis models used to improve Marketing 5.0?

    a. What predictive analytics models have been used to improve Marketing 5.0?

    b. What sentiment models have been used by previous studies for analyzing customer reviews?

  2. What are the factors considered in online purchase predictive models?

  3. What are the challenges and limitations of existing sentiment-based predictive models?

    a. What was missing from previous sentiment-predictive models?

    b. What are the challenges in terms of dataset transformation and model development?

Several journal papers were used to answer the RQs, and quality scores (QS) were applied to filter papers by assigning scores to them based on several quality assessment questions.

Quality scores were assigned to each SG1 paper based on the following criteria:

  • QA1: Does the paper address customer purchase predictive analysis?

  • QA2: Was sentiment analysis used to explore customer reviews?

  • QA3: Were algorithms or technologies used to predict customer purchase intention clearly explained?

  • QA4: Have the factors considered for customer purchase predictions been adequately discussed?

The scores for each paper were calculated based on the quality assessment questions using a scale of 0 to 2, where 0 meant that the paper did not fulfil the QA being evaluated (N), 1 meant that it partially addressed it (P), and 2 meant that it fully covered the QA being assessed (Y). To avoid assessing papers based only on linguistic terms, all Y, N, and P ratings were converted into numerical scores for better interpretability [12].

The total for all four QAs was calculated using the following formula:

\(F.S(P_n)=\sum_{i=1}^{4} QA_i\), where F.S stands for the final score, \(P_n\) stands for the paper number, and \(QA_i\) is the score given to the paper for the i-th quality assessment question.

Therefore, when \(F.S(P_n) \ge 6\), the paper has a high QA; when \(4 \le F.S(P_n) \le 5\), medium QA is inferred; and when \(F.S(P_n) < 4\), this implies low QA. Table 1 presents these explanations, which help eliminate search bias and increase the validity of the literature review [13].
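The scoring scheme can be illustrated with a minimal Python sketch. It assumes the Y/P/N to 2/1/0 mapping described above and reads the medium band as final scores of 4 or 5; the function names and the example paper are hypothetical and are not taken from the reviewed studies.

```python
# Map the linguistic ratings to numeric scores (Y = 2, P = 1, N = 0).
RATING_TO_SCORE = {"Y": 2, "P": 1, "N": 0}


def final_score(ratings):
    """Sum the numeric scores of the four QA ratings (QA1..QA4) for one paper."""
    if len(ratings) != 4:
        raise ValueError("expected one rating per QA question (QA1..QA4)")
    return sum(RATING_TO_SCORE[r] for r in ratings)


def quality_level(score):
    """Classify a final score: >= 6 high, 4 or 5 medium, below 4 low."""
    if score >= 6:
        return "high"
    if score >= 4:
        return "medium"
    return "low"


# Hypothetical paper: fully meets QA1 and QA2, partially meets QA3, misses QA4.
example_ratings = ["Y", "Y", "P", "N"]
example_score = final_score(example_ratings)
print(example_score, quality_level(example_score))  # prints: 5 medium
```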

Table 1 Quality Assessment (QA) Metrics

The QA questions were used to determine which papers were eligible for the study, and the papers were filtered accordingly through the eligibility phase shown in Fig. 3. After this process, 30 journal papers were selected; their individual scores and QA assessments are presented in Table 2. The high and medium QA papers were used to answer the research questions related to SA and PM, while the low QA papers were used to answer the research questions related to purchasing factors.

Table 2 Quality assessment of academic papers

Reporting review

After conducting the QA for all shortlisted papers, the results of the systematic review were reported [84], extending research on customer predictive analytics and answering the research questions of this study. This is discussed next.

Results and findings

This section uses the findings from previous studies to answer the research questions presented in the methodology section. This was achieved by examining research gaps within previous models, constraints in their study analysis, performance metrics, and the application contexts (such as marketing, sales, stock pricing, etc.). To broaden the understanding of purchasing patterns, this study focussed on different product types and services to analyze gaps rather than on one specific product. Although product type is used as a predictive variable to track customer purchases, it is not the only element that can be used. There are multiple other factors to consider, such as customer feeling, the need to purchase, budget, post-purchase emotion, etc. Therefore, this study proposes a sentiment-based predictive model to address the existing gaps in the context of customer purchase behaviors. The reviews are presented in a tabular format to better understand how different models have been used in different scenarios to assess their impact on marketing outcomes. Studies have demonstrated that predictive analytics can accurately forecast outcomes based on various factors that contribute to customer purchasing decisions (satisfaction, rating, reviews, and loyalty). The models were able to identify the key factors that influence customer behavior and used these factors to make predictions. The results vary under different circumstances and are explained in the following subsections.

Predictive analytics approaches (RQ1-a)

As discussed in Sect. "Data preprocessing", predictive modelling (PM) is an AI sub-component that enhances Marketing 5.0 through its ability to forecast customer behaviors and preferences [19]. PM can also be used to analyze customers’ purchase intentions based on historical data. Multiple approaches have been used to develop predictive models in existing research. Figure 4 depicts the four approaches along with their associated models, which are explained below.

Fig. 4 Mind map for PM approaches and models (prepared by the authors)

Classical machine learning

Classical machine learning (CML) is a traditional approach that uses algorithms such as support vector machines (SVM), logistic regression (LR), factorization machines (FM), decision trees (DT), and random forests (RF). These methods use manual feature engineering, where relevant features are filtered from the input data to be used in the model for proper analysis.

Ensemble learning

Ensemble learning (EL) combines different models to develop a more robust model that will, in most cases, outperform the individual models when evaluated. The main concepts in EL are bagging and stacking [104]. Bagging refers to training multiple instances of the same algorithm on different subsets of the training data and aggregating their predictions. Stacking combines the predictions of multiple base models, typically by training a further model on those predictions [180]. Examples include the boosting-based algorithms extreme gradient boosting (XGBoost), gradient boosted decision trees (GBDT), and light GBM (LGBM).
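As an illustration of bagging, the following Python sketch trains several one-feature threshold classifiers (decision stumps) on bootstrap samples and combines them by majority vote. The toy data (a single numeric feature with a binary repurchase label), the function names, and the choice of stumps as base learners are assumptions made for this example only.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a threshold t for the rule 'predict 1 if x > t else 0'.

    samples: list of (x, label) pairs with label in {0, 1}.
    """
    xs = sorted({x for x, _ in samples})
    # Also try a threshold below the smallest value ('always predict 1').
    candidates = [xs[0] - 1.0] + xs
    best_t, best_err = candidates[0], len(samples) + 1
    for t in candidates:
        err = sum((1 if x > t else 0) != y for x, y in samples)
        if err < best_err:
            best_t, best_err = t, err
    return best_t


def bagging_fit(samples, n_models=15, seed=0):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_models):
        bootstrap = [rng.choice(samples) for _ in samples]
        stumps.append(train_stump(bootstrap))
    return stumps


def bagging_predict(stumps, x):
    """Aggregate the stump predictions by majority vote."""
    votes = Counter(1 if x > t else 0 for t in stumps)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: feature value and whether the customer repurchased.
toy_data = [(x, 1 if x > 5 else 0) for x in range(1, 11)]
stumps = bagging_fit(toy_data)
print(bagging_predict(stumps, 3), bagging_predict(stumps, 8))
```

An odd number of models is used so that a two-class vote cannot tie.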

Deep learning

Deep learning (DL) is a sub-component of machine learning (ML) that involves neural networks with multiple layers capable of learning patterns from large datasets [130]. It is suitable for natural language processing tasks, such as speech recognition and textual review analysis [24]. Examples are long short-term memory (LSTM), bidirectional encoder representations from transformers (BERT), and generative pre-trained transformer 3 (GPT-3). However, the problem of vanishing gradients can arise when DL algorithms are used. This occurs in networks with many layers: as the error gradient is propagated backward from each layer to its predecessor, it becomes progressively smaller. Consequently, the weights of the earlier layers are not updated properly, thereby producing less efficient results [92]. A technique used to minimize this effect is batch normalization, which scales and centers the inputs to each layer [163].
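The vanishing gradient effect can be made concrete with a small numerical sketch. It assumes a deliberately simplified network, a chain of scalar sigmoid layers with a shared weight, which is not a model from the reviewed studies; the function names are hypothetical. Because the derivative of the sigmoid never exceeds 0.25, the backpropagated gradient shrinks with every layer it passes through.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def backprop_gradients(n_layers, weight=1.0, x=0.5):
    """Gradient of the output with respect to each layer's input.

    The network is a chain a_{k+1} = sigmoid(weight * a_k).
    Returns a list ordered from the last layer back to the first.
    """
    # Forward pass: store every pre-activation value.
    zs, a = [], x
    for _ in range(n_layers):
        z = weight * a
        zs.append(z)
        a = sigmoid(z)
    # Backward pass: apply the chain rule layer by layer.
    grads, g = [], 1.0
    for z in reversed(zs):
        s = sigmoid(z)
        g *= weight * s * (1.0 - s)  # local derivative of sigmoid(weight * a)
        grads.append(g)
    return grads


grads = backprop_gradients(n_layers=10)
print("gradient at last layer: ", grads[0])
print("gradient at first layer:", grads[-1])
```

With ten layers and a weight of 1, the gradient reaching the first layer is bounded above by 0.25 to the power of 10, which is why the earliest layers learn very slowly in such a network.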

Fusion model

The fusion model is a combination of the outputs from different models to obtain the final prediction from a dataset. It can be used at different levels, including the feature level (combining extracted features), decision level (combining model predictions), and sensor level (integrating data from various datasets) [1].
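Two of these fusion levels can be sketched in a few lines of Python. The example assumes that each model outputs a probability for the positive class and that decision-level fusion is a weighted average, which is only one of several possible combination rules; the function names and numbers are hypothetical.

```python
def fuse_features(*feature_vectors):
    """Feature-level fusion: concatenate the extracted feature vectors."""
    fused = []
    for vector in feature_vectors:
        fused.extend(vector)
    return fused


def fuse_decisions(probabilities, weights=None):
    """Decision-level fusion: weighted average of per-model probabilities."""
    if weights is None:
        weights = [1.0] * len(probabilities)
    if len(weights) != len(probabilities):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    return sum(p * w for p, w in zip(probabilities, weights)) / total


# Hypothetical features from a text model and from purchase history.
text_features = [0.8, 0.1]
history_features = [3, 120.0]
print(fuse_features(text_features, history_features))

# Hypothetical purchase probabilities from three models; the first is trusted most.
fused = fuse_decisions([0.9, 0.6, 0.3], weights=[2, 1, 1])
print("purchase predicted" if fused >= 0.5 else "no purchase predicted")
```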

Predictive modelling algorithms (RQ1-a)

The approaches adopted for predictive modelling can be different as discussed in Sect. "Predictive analytics approaches (RQ1-a)". However, data pre-processing and feature engineering phases are commonly used as explained in Sect. "Data preprocessing". Table 3 lists and explains the various predictive models used in previous PM studies.

Table 3 Algorithms/Models description and usage—prepared by the authors

The limitations of the different models encountered in previous studies are summarized in the weakness column. The number of papers that addressed each model is shown in Fig. 5.

Fig. 5 Algorithms vs. papers reviewed (prepared by authors)

Figure 5 shows the number of studies (x-axis) that used each of the listed models (y-axis). None of the papers reviewed in this study used GPT, whereas six of the 30 papers used bidirectional encoder representations from transformers (BERT) and support vector machine (SVM). Seven of the studies used long short-term memory (LSTM) and decision tree (DT), while five used linear regression. The remaining studies used factorization machine (FM), Naive Bayes (NB), random forest (RF), light gradient-boosting machine (LightGBM), extreme gradient boosting (XGBoost), and gradient boosted decision tree (GBDT). Many of these algorithms, which have different strengths, can be used for both sentiment and predictive models. Therefore, hybridizing two of these models to form a sentiment-based predictive model, chosen according to the nature of the dataset, could offer a good solution to issues within the customer purchase field.

Application of sentiment models for customer reviews analysis (RQ1-b)

Some of the weaknesses listed in Table 3 can be mitigated by integrating sentiment models. For instance, BERT has a complex architecture and requires an appropriate training phase. Therefore, training BERT-based sentiment and predictive models on the same dataset can help minimize complexity by providing a better understanding of the data relationships [181]. Furthermore, sentiment analysis involves data transformation, classification, training, and evaluation. This helps improve the categorization issues faced by many fusion and ensemble algorithms. The conversion of raw data into labelled data, followed by the training phase, results in a more scalable and flexible dataset that can be evaluated and visualized more easily [41]. Chen et al. [33] proposed a scalable DL model that uses an Apache dataset for analytics. They found that DL algorithms are more efficient when sentiment analysis is conducted in multiple phases of data preparation, feature extraction, and polarity determination. Apart from sentiment models, there are also several auxiliary algorithms, such as the Genetic Algorithm (GA) and Firefly Algorithm (FA), that can be used to further enhance the efficiency of SA. The next section explains the concepts of GA and FA.

Genetic and firefly algorithm

Genetic algorithms (GAs) are optimization algorithms modelled on natural selection. A GA evolves a population of candidate solutions by repeatedly selecting, recombining, and mutating a collection of parameters or characteristics. GAs can be used in SA to improve the performance of sentiment classification models by tuning the architecture and the combination of variables. The firefly algorithm (FA) mimics the flashing behavior of fireflies, where brighter fireflies have a greater attraction power. Similarly, FA determines the groups of variables using the feature weight and parameter values. LSTM-DGWO (long short-term memory with differential grey wolf optimization) is used to improve the optimization phase of SA, with datasets divided into different levels for better training of the models [20]. Furthermore, SA involves a text classification phase, so it is important that models understand the context of the sentence being evaluated. Latent Dirichlet Allocation (LDA) is a probabilistic model that can be used for topic modelling to understand words within sentences being evaluated for their sentiment polarity [97, 137]. The next section explains how different deep learning models can be merged to produce better results.
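To show how a GA can search over combinations of variables, the sketch below evolves a binary mask that marks which features to keep. Everything in it is an illustrative assumption: the fitness function is a toy that rewards a hypothetical set of useful features, whereas in practice fitness would be the validation score of a sentiment classifier trained on the selected features.

```python
import random


def genetic_feature_selection(fitness, n_features, pop_size=20,
                              generations=30, mutation_rate=0.1, seed=0):
    """Evolve a 0/1 feature mask that maximises the given fitness function."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Elitism: the best individual is carried over unchanged.
        children = [parents[0][:]]
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randint(1, n_features - 1)  # single-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
        history.append(fitness(best))
    return best, history


# Hypothetical toy fitness: features 0, 2 and 5 help, every other feature costs 0.5.
USEFUL = {0, 2, 5}


def toy_fitness(mask):
    return sum(1.0 if i in USEFUL else -0.5 for i, bit in enumerate(mask) if bit)


best_mask, fitness_history = genetic_feature_selection(toy_fitness, n_features=8)
print(best_mask, fitness_history[-1])
```

Because the best individual is always copied into the next generation, the best fitness recorded in `fitness_history` never decreases.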

Random multimodal deep learning

Random multimodal deep learning (RMDL) is used to combine diverse deep learning models to enhance SA performance. Local search with improved binary ant lion optimization (LSIBA-ENN) helps to optimize feature selection for classification tasks [20]. CNBL (convolutional neural network with binary layers) integrates binary layers into convolutional neural networks for efficient modelling [62]. SLCABG (semi-supervised learning with class-wise adversarial binary generative models) improves binary generative models for semi-supervised learning when training models on datasets.

Ensemble learning algorithms, such as XGBoost, can also be used when performing SA. For example, extreme random forest with XGBoost (ERF-XBG) can strengthen the performance of SA through better data classification [109]. The next section explains the findings and models of different sentiment-based studies.

Sentiment analysis models (RQ1-b)

Table 4 presents the studies, models used and the findings.

Table 4 Sentiment Analysis Models

Numerous studies have used machine learning and deep learning algorithms to analyze sentiments, demonstrating the efficacy of random forest, Naïve Bayes, and support vector machine classifiers. A BERT-integrated model using DGSO and LSTM attained a remarkable 98% accuracy for sentiment categorization. The ensemble Bagging SVM and BERT/CNN showed good accuracies of 96.1% and 99.23% in restaurant and e-commerce sentiment analysis, respectively. Depending on the dataset, the fusion models thus performed well in terms of the evaluation metrics. Furthermore, most of the papers mentioned above used secondary data, that is, data extracted from social media endpoints or open sources, such as Yelp or Kaggle. Multiple other factors, such as feelings, emotions, reviews and ratings, and trust, can and should be extracted from textual content since they play an important role in the buyer’s journey [119] and help to ensure a better customer experience. The extraction of these variables by means of sentiment analysis can therefore help to predict whether customers will repurchase from a specific business [175]. The next section discusses other purchasing factors that can be used to analyse buyers’ journeys.

Factors for online purchase predictive models (RQ2)

This section discusses the different factors and behaviors that influence customers during their online purchasing journey, such as customer service, product brand, customer ratings and reviews, and attraction to the product.

Purchasing factors

After reviewing the shortlisted papers, several factors were found to have been considered when determining customers’ purchasing patterns. These include customer behaviors and other business-related factors, as presented in Fig. 6 [27].

Fig. 6 Mind map for purchasing factors (prepared by the authors)

In the social media era, customer reviews have become a major factor in online purchase decisions. Businesses are increasingly focusing on tracking customer satisfaction through reviews to improve sales and track purchasing trends. Reviews can be used to determine customer preferences, attitudes, and loyalty, as well as to optimize marketing and sales strategies [9]. Customer satisfaction is another factor that drives online purchase decisions. It is defined as the extent to which a business can resolve purchasing issues, and high customer satisfaction can lead to better reviews, more online sales, and increased customer loyalty. The latter occurs when buyers repeatedly purchase from the same seller, regardless of competitors’ advertisements [146]. Loyalty can be determined by analyzing the purchasing patterns evident in customers’ purchase histories. Branding, pricing, and discounts should also be considered when analyzing patterns and retaining customers [110].

Branding is the process of creating a strong marketing perception that attracts customers and influences purchasing intentions [53]. Pricing is another important factor that customers use to evaluate their purchases and can vary depending on the brand. A favorable price can attract more customers and increase loyalty and retention probabilities. Additionally, sellers use discounts to attract customers by offering reduced prices for online sales [110]. Customers often take this opportunity to purchase more products from their preferred sellers, thus increasing the probability of a subsequent purchase because of greater customer satisfaction. The interface and web design of a commerce website also affect online customers. Visual design factors such as interface color, font type and size, and images have a positive impact on customers' trust, while ease of navigation improves customers’ journey by helping them find their preferred products more easily [123]. A good user interface design leads to a better display of products, which enhances customers' purchasing pleasure and experience during their purchase journey. Customers also highly value their personal data protection, and e-commerce websites with clear buyer data policies can easily attract online buyers and build buyer–seller trust.

Strong trust between customers and sellers can increase the loyalty factor, which can be integrated with sentiment and purchasing history to further enhance the predictive probability of customer repurchasing [40]. The number of papers addressing each of these factors is shown in Fig. 7.

Fig. 7 Purchasing factors considered (prepared by the authors)

Previous studies [143, 157, 174, 176, 178] have addressed customer satisfaction, attraction, customer reviews, pricing, and product ratings more often than factors such as customer retention, loyalty, and trust. As these studies emphasize, customer trust plays a fundamental role in cultivating loyalty and ensuring customer retention. This trust can be established by aligning products and services with customer needs, ultimately influencing satisfaction levels and fostering long-term relationships. Loyalty manifests in customers’ repeated purchases and positive word-of-mouth, which significantly impact business success [98]. Despite its importance, the examination of customer loyalty remains limited, particularly in the context of mobile and online shopping, where customer switching costs are low [169, 174, 176, 178]. Therefore, integrating the sentiments yielded by customer reviews with purchasing history data enables behavioral patterns to be identified, thereby facilitating the prediction of future purchase behaviors [113]. Additionally, customer retention efforts focus on understanding customer behavior and preferences to enhance satisfaction levels and predict future purchasing needs [50, 138, 140, 153], thus highlighting the need for a sentiment-based predictive model.

Moreover, a valuable asset of every online business is its faithful customer base with readily available funds. Therefore, businesses attempt to convert current customers into loyal ones to ensure repeated purchases from the same seller using different techniques, including electronic word-of-mouth (eWoM), influencer marketing, and the posting of online customer reviews and testimonials [132]. Therefore, the extraction of such factors (trust, loyalty and retention) through aspect-based sentiment analysis could be crucial to enhancing current marketing strategies, thus highlighting the importance of a sentiment-based predictive model.

Since influencer marketing involves the usage of social media, from which reviews can be obtained, this is explained in the next section.

Influencer marketing

Influencer marketing involves using well-known or lesser-known opinion leaders with sizable followings on social media to encourage favorable attitudes and actions from customers towards the brand, thus retaining them for the business. Any type of digital media-based positive or negative communication regarding a product or service is known as electronic word-of-mouth (eWoM). As the contemporary equivalent of old-fashioned word-of-mouth advertising, it can drastically alter how consumers make judgments about what to buy. eWoM has the power to influence customer decisions to remain loyal to a certain business, thus ensuring a favorable evaluation that can boost revenue. Moreover, consumers can share their opinions about their experiences through websites, and these opinions can be extracted for sentiment analysis. A few studies have considered factors such as discounts and page interfaces. However, these were not selected for this study because the focus is on ensuring a smoother buyers’ journey by taking into account customer trust, loyalty and retention through a sentiment-based predictive model. Developing such a model involves certain constraints and challenges, which are discussed in the next section.

Challenges and limitations of sentiment-based predictive models (RQ3)

This section discusses the challenges that can be faced during the development of a sentiment-based predictive model for the marketing field, specifically for online customer purchases and data analysis. The limitations of previous models are also explained.

Limitations of previous sentiment-predictive models (RQ3-a)

A total of 150 papers were used for the current SLR, and 30 were selected because they considered sentiment or predictive models to improve their current marketing service. Of the 30 papers, only nine addressed both sentiment and predictive modelling from a marketing perspective. Various models have been used to determine customer feelings about a product, such as SVM, BERT, K-Means, LSTM, and NB. However, ensemble learning was not used in these studies, and there was no hybridization of algorithms, which would have improved the performance metrics. Moreover, LSTM is known to achieve good predictive metrics on customer data [61].

Liu and Ying [93] applied SNOWNLP to analyze customer sentiment and used LSTM for their predictive model. However, there is room for improvement in the preprocessing phase, particularly in terms of enhancing the data structure and conducting more thorough data cleaning. Furthermore, RF could have been merged with LSTM to increase the efficiency by mitigating the weakness of each individual model and providing better data modelling [172]. For the sentiment analysis phase, algorithms such as BERT can be used to increase the reliability of research [145]. Ullah et al. [155] proposed the use of the BiLSTM and QLeBERT algorithms for sentiment and predictive modelling. However, their dataset did not have a resource pool of languages (lexicon), which would have improved the research. For example, the model was not trained to detect sarcastic contexts, and the tokenization phase could have been made more efficient by dividing the process into sub-processes. Breaking a sentence into two parts and further breaking it down to single words would improve the performance metrics because of more efficient classification and analysis of different groups of words [7].

As illustrated in Fig. 8, of the nine papers, only two used LSTM and BERT despite the limitations of these approaches. The two algorithms can be merged to form a sentiment-based predictive model. Furthermore, only two of the 30 papers (6.7%) used LSTM and BERT for the sentiment-based predictive model, indicating that this area requires more focus. This is supported by the facts presented in Tables 3 and 4, where the evaluation metrics for LSTM and BERT were among the highest when tested on different types of datasets. Additionally, when working with Kaggle secondary customer dataset for the proposed model, LSTM and BERT present viable options due to their capacity to discern connections among diverse data variables, consequently mitigating evaluation margin errors [35]. Moreover, during the development of these models, the inclusiveness of a variety of data variables (columns) can serve as a pivotal factor. Inadequate coverage, such as lacking emotional data, purchasing history, and product details [100] can lead to biased or less accurate results. To minimize such possibilities, it is imperative to ensure a proper balance among pre-existing biases and limitations (sarcasm, low-quality data, noise, uncleaned data, etc.), thereby ensuring an accurate analysis of customers’ purchasing intentions.

Fig. 8 Papers using LSTM and BERT (prepared by the authors)

Model development challenges (RQ3-b)

As claimed by Kim et al. [82, 83], deep learning can be used to conduct a sentiment analysis of customer reviews. However, the main challenge is sarcasm detection based on the structure and context of the sentences. A key sign for sarcasm identification is the sentiment incongruity of words in a phrase; that is, the contrast between positive and negative concepts. Xiong et al. [166] and Tay et al. [151] proposed a solution to overcome the sarcasm issue by considering similarity and assigning greater weight ratings to highly comparable terms. However, this approach cannot efficiently identify inconsistent information [60]. Therefore, aspect-level sentiment analysis can be used in the proposed model to enhance the capability of LSTM to analyse reviews word by word for better interpretability.
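The sentiment incongruity cue can be illustrated with a very small lexicon-based check. This is only a sketch of the cue itself, not the attention-based methods of the cited studies: the word lists and the function name are hypothetical, and a review that mixes positive and negative words is not necessarily sarcastic, so such a flag would at most be one input feature for a model.

```python
import re

# Hypothetical miniature sentiment lexicons, for illustration only.
POSITIVE = {"great", "love", "perfect", "amazing", "fantastic"}
NEGATIVE = {"broke", "broken", "late", "useless", "refund", "waste"}


def sentiment_incongruity(review):
    """Return True when a review contains both positive and negative words."""
    tokens = set(re.findall(r"[a-z']+", review.lower()))
    return bool(tokens & POSITIVE) and bool(tokens & NEGATIVE)


print(sentiment_incongruity("Great, it broke after one day. Just perfect."))  # True
print(sentiment_incongruity("Great phone, I love it."))                       # False
```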

Dataset transformation challenges (RQ3-b)

Baroiu and Stefan [21] proposed the “MUStARD” multimodal dataset, which can detect sarcasm by incorporating data from a comedy series. However, this was not sufficient to detect sarcasm in customer reviews due to contextual differences. Furthermore, sentiments expressed in words can sometimes differ from what is actually felt [7]. Data quality can be another challenge if the datasets are not properly cleaned and divided into training and test sets [82, 83]. Therefore, strict data preprocessing and feature engineering phases are important to ensure that a good-quality dataset is available for SA and PM.
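A minimal Python sketch of the cleaning and splitting step is given below. The cleaning rules (dropping empty texts, missing labels, and case-insensitive duplicates), the 80/20 split, and the sample reviews are assumptions chosen for illustration; real pipelines usually add further steps such as tokenization and noise removal.

```python
import random


def clean_reviews(rows):
    """Drop rows with empty text, a missing label, or a duplicated text."""
    seen, cleaned = set(), []
    for text, label in rows:
        text = " ".join(text.split()) if text else ""  # normalise whitespace
        key = text.lower()
        if not text or label is None or key in seen:
            continue
        seen.add(key)
        cleaned.append((text, label))
    return cleaned


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out a fraction of them as the test set."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical raw reviews with a sentiment label (1 = positive, 0 = negative).
raw_reviews = [
    ("Great product", 1), ("great  product", 1), ("", 0),
    ("Arrived late", 0), ("Okay", None), ("Works well", 1),
    ("Too expensive", 0), ("Love it", 1),
]
cleaned_reviews = clean_reviews(raw_reviews)
train_set, test_set = train_test_split(cleaned_reviews)
print(len(cleaned_reviews), len(train_set), len(test_set))  # prints: 5 4 1
```

Splitting after cleaning keeps duplicated reviews from appearing in both the training and the test set, which would otherwise inflate the evaluation metrics.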

Overfitting and underfitting issues (RQ3-b)

When conducting PM, overfitting and underfitting have often been an issue in previous studies [135]. Overfitting occurs when a model is too complex relative to the available data and fits the training set, including its noise, so closely that it performs poorly on unseen data. Conversely, underfitting occurs when a model is too simple and cannot capture the patterns required to obtain adequate evaluation metrics [131]. Therefore, scaling and normalization can help to reduce the influence of outliers and decrease the error margin of the proposed model.
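Two common rescaling techniques, min-max scaling and z-score standardization, are sketched below. The source does not specify which scaling method is intended, so both functions and the sample order values are illustrative assumptions.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Standardise values to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical order values for three customers.
order_values = [10, 20, 30]
print(min_max_scale(order_values))  # prints: [0.0, 0.5, 1.0]
print(z_score(order_values))
```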

After answering all the research questions, the findings are illustrated in Fig. 9 and Fig. 10.

Fig. 9 Research questions and findings (compiled by the authors)

Fig. 10 Concept map of findings (prepared by authors)

Significance of the study

This section has been divided into theoretical and practical significance.

Theoretical significance

In terms of theoretical significance, this study highlights areas requiring improvement in the predictive modelling field and provides a better understanding of the integration of predictive and sentiment analysis within the context of Marketing 5.0, by placing greater focus on specific processes in the buyers’ journey. Moreover, this study provides insights on the limitations and best usages of different models which can be applied in different contexts of sentiment and predictive analytics.

Practical significance

The proposed sentiment-based model will help businesses better understand customer purchases, thereby enabling them to provide customers with products chosen to meet their needs [42]. Furthermore, this novel model serves as an emotional driver by capturing customer attitudes and opinions and merging these with their purchasing history to help businesses understand customer purchase behaviors [115]. Nowadays, buyers tend to check product branding before purchasing [23]. Consequently, such a model can help marketers develop more specific targeted marketing strategies that address negative customer reviews and focus on positive ones [61], ultimately enhancing customer experiences [63]. On the other hand, customers can receive advertisements for products that help them during their journey. By receiving targeted advertisements aligned with their preferences, customers can make informed choices, fostering a more satisfying and efficient shopping experience [144].

Limitations and future work

This study discussed the various models (LSTM, BERT, XGBoost, SVM, NB) used to conduct sentiment analysis. However, it does not examine in detail large language models (LLMs) such as ChatGPT or Gemini in its analysis section. Suggestions for future work in the field of predictive and sentiment analysis therefore include: first, exploring the use of deep learning and LLMs for sentiment analysis on advanced customer-oriented datasets; second, developing models that are specifically tailored to different types of online buyers based on their online behavior; and lastly, developing robust models that can capture the relationship between customer sentiments and other e-commerce factors, which could improve current predictive models. Moreover, since this study focuses on the customer’s purchasing journey, future studies could analyze the applications of sentiment-based predictive models in other fields such as healthcare and education. For instance, facial expressions and textual communication can be used to determine the sentiments of patients with mental health conditions in order to predict their medication dosage [12]. Such models can also be used to analyze the sentiments of students in class (frustrated, happy, confused, etc.), thus identifying students with consistently negative sentiments to determine whether they need extra support. In terms of comparative parameters, the current work was based on the findings of other studies. Therefore, in future studies, experiments could be conducted on a specific online product, analyzing textual reviews in order to forecast the number of potential buyers. In terms of data extraction, social data and metadata from different sources can increase the dataset size for better training of models to improve customer interaction and sentiment analysis [5].
Additionally, with emerging technologies such as GPT-4 and Bard, a comparative study could be conducted to enrich the literature and improve the technologies used to implement such models. For such tasks, natural language inference (NLI), a natural language processing (NLP) task, can be useful because it helps models to understand the nuances of human language and make logical inferences from text, and thus to understand customer reviews better and make appropriate product recommendations [6].

Conclusion

Based on the findings of previous studies, businesses need to transition to the latest technologies from Industry 4.0 (PM, SA, chatbots, etc.) to maintain a strong presence in the Marketing 5.0 era. Data are key to such a transition. To obtain a robust predictive model, data need to be thoroughly cleaned, and new insights such as customers’ emotions, feelings, processes, and others need to be identified and well integrated with historical data. Furthermore, it is crucial to extract data from the various phases of the buyers’ journey, such as the awareness, decision-making, and purchase stages, because this provides more variables with which to train the model, resulting in a more robust model.
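
To illustrate how sentiment could be integrated with historical data from the stages of the buyers’ journey, the sketch below builds one feature row per customer. It is only an illustration under stated assumptions: the toy word lists, the `CustomerHistory` fields, and the `build_features` helper are hypothetical, and a real system would replace the lexicon with a trained sentiment model such as those discussed in this review.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Toy sentiment lexicon (hypothetical, for illustration only).
POSITIVE = {"great", "love", "excellent", "fast"}
NEGATIVE = {"bad", "slow", "broken", "refund"}


def sentiment_score(review: str) -> float:
    """Return a score in [-1, 1] from lexicon word counts; 0.0 if no word matches."""
    words = [w.strip(".,!?").lower() for w in review.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    matched = pos + neg
    return 0.0 if matched == 0 else (pos - neg) / matched


@dataclass
class CustomerHistory:
    customer_id: str
    product_views: int      # awareness stage
    cart_additions: int     # decision-making stage
    past_purchases: int     # purchase stage
    reviews: List[str] = field(default_factory=list)


def build_features(c: CustomerHistory) -> Dict[str, float]:
    """Merge journey-stage counts with the customer's average review sentiment."""
    scores = [sentiment_score(r) for r in c.reviews]
    avg_sentiment = sum(scores) / len(scores) if scores else 0.0
    return {
        "product_views": float(c.product_views),
        "cart_additions": float(c.cart_additions),
        "past_purchases": float(c.past_purchases),
        "avg_sentiment": avg_sentiment,
    }


# Made-up customer record used only to exercise the helpers.
customer = CustomerHistory(
    customer_id="c-001",
    product_views=14,
    cart_additions=3,
    past_purchases=2,
    reviews=["Great product, love it!", "Delivery was slow."],
)
features = build_features(customer)
print(features)
```

The resulting dictionary could then serve as one input row for any of the predictive models covered in this review; which model suits a given dataset is outside the scope of this sketch.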

Sentiment-based predictive models for online buyers have the potential to revolutionize the way e-commerce businesses operate. By analyzing the sentiments of online buyers, businesses can better understand their customers’ needs, preferences, and difficulties. This information can be used to improve products and services, target marketing campaigns more effectively, and reduce customer churn. However, as this systematic review has shown, there is still room for improvement in the development and application of sentiment-based predictive models for online buyers by considering more online purchasing factors and customer behaviors. Furthermore, the lack of an appropriate and comprehensive dataset containing online purchasing history and sentiment affects the accuracy of these models. Additionally, many existing models are not sufficiently robust to handle the diversity of online reviews and the complex relationships between sentiment and other variables, such as product features and buyers’ characteristics.

In this study, PRISMA was used to conduct a systematic literature review because it provides better structure and clarity to the findings from different studies, thus helping to answer the research questions more efficiently. Journal articles were identified through keyword searches of the ProQuest database and then screened to select the papers eligible for the study. However, this study has several limitations, one of which is that predictive models were explored only in the marketing field. Hence, future studies could extend this research to a range of other domains.

Data availability

Not applicable.

Code availability

Not applicable.

References

  1. Abdar M, Salari S, Qahremani S, Lam H-K, Karray F, Hussain S, Khosravi A, Acharya UR, Makarenkov V, Nahavandi S. UncertaintyFuseNet: robust uncertainty-aware hierarchical feature fusion model with ensemble monte carlo dropout for COVID-19 detection. Inf Fus. 2023;90(February):364–81. https://doi.org/10.1016/j.inffus.2022.09.023.

  2. Abidar L, Zaidouni D, Ikram ELA, Ennouaary A. Predicting customer segment changes to enhance customer retention: a case study for online retail using machine learning. Int J Adv Comput Sci Appl. 2023. https://doi.org/10.14569/IJACSA.2023.0140799.

  3. Abrego N, Ovaskainen O. Evaluating the predictive performance of presence-absence models: why can the same model appear excellent or poor? Ecol Evol. 2023. https://doi.org/10.1002/ece3.10784.

  4. Abreu LR, Maciel ISF, Alves JS, Braga LC, Pontes HLJ. A decision tree model for the prediction of the stay time of ships in brazilian ports. Eng Appl Artif Intell. 2023;117(January):105634. https://doi.org/10.1016/j.engappai.2022.105634.

  5. Abu-Salih B, Alotaibi S. Knowledge graph construction for social customer advocacy in online customer engagement. Technologies. 2023;11(5):123. https://doi.org/10.3390/technologies11050123.

  6. Abu-Salih B, Alweshah M, Alazab M, Al-Okaily M, Alahmari M, Al-Habashneh M, Al-Sharaeh S. Natural language inference model for customer advocacy detection in online customer engagement. Mach Learn. 2023. https://doi.org/10.1007/s10994-023-06476-w.

  7. Ahmed K, Nadeem MI, Zheng Z, Li D, Ullah I, Assam M, Ghadi YY, Mohamed HG. Breaking down linguistic complexities: a structured approach to aspect-based sentiment analysis. J King Saud Univ Comput Inf Sci. 2023;35(8):101651. https://doi.org/10.1016/j.jksuci.2023.101651.

  8. Akter S, Ali S, Fekete-Farkas M, Fogarassy C, Lakner Z. Why organic food? Factors influence the organic food purchase intension in an emerging country (Study from Northern part of Bangladesh). Resources. 2023;12(1):5. https://doi.org/10.3390/resources12010005.

  9. Al-Abbadi L, Bader D, Mohammad A, Al-Quran A, Aldaihani F, Al-Hawary S, Alathamneh F. The effect of online consumer reviews on purchasing intention through product mental image. Int J Data Netw Sci. 2022;6(4):1519–30. https://doi.org/10.5267/j.ijdns.2022.5.001.

  10. Alghazzawi DM, Alquraishee AGA, Badri SK, Hasan SH. ERF-XGB: ensemble random forest-based XG boost for accurate prediction and classification of E-commerce product review. Sustainability. 2023;15(9):7076. https://doi.org/10.3390/su15097076.

  11. Alharbi ZH. A sustainable price prediction model for airbnb listings using machine learning and sentiment analysis. Sustainability. 2023;15(17):13159. https://doi.org/10.3390/su151713159.

  12. Ali Y, Khan HU, Khalid M. Engineering the advances of the artificial neural networks (ANNs) for the security requirements of internet of things: a systematic review. J Big Data. 2023;10(1):128. https://doi.org/10.1186/s40537-023-00805-5.

  13. Al-Sai ZA, Husin MH, Syed-Mohamad SM, Abdullah R, Zitar RA, Abualigah L, Gandomi AH. Big data maturity assessment models: a systematic literature review. Big Data Cognit Comput. 2023;7(1):2. https://doi.org/10.3390/bdcc7010002.

  14. Alsayat A. Customer decision-making analysis based on big social data using machine learning: a case study of hotels in Mecca. Neural Comput Appl. 2023;35(6):4701–22. https://doi.org/10.1007/s00521-022-07992-x.

  15. AL-Sous N, Almajali D, Alsokkar A. Antecedents of social media influencers on customer purchase intention: empirical study in Jordan. Int J Data Netw Sci. 2023;7(1):125–30.

  16. Alzahrani RA, Aljabri M. AI-Based techniques for Ad click fraud detection and prevention: review and research directions. J Sens Actuator Netw. 2023;12(1):4. https://doi.org/10.3390/jsan12010004.

  17. Anas AM, Abdou AH, Hassan TH, Alrefae WMM, Daradkeh FM, El-Amin M-M, Kegour ABA, Alboray HMM. Satisfaction on the driving seat: exploring the influence of social media marketing activities on followers’ purchase intention in the restaurant industry context. Sustainability. 2023;15(9):7207. https://doi.org/10.3390/su15097207.

  18. Atallah SB, Banda NR, Banda A, Roeck NA. How large language models including generative pre-trained transformer (GPT) 3 and 4 will impact medicine and surgery. Tech Coloproctol. 2023;27(8):609–14. https://doi.org/10.1007/s10151-023-02837-8.

  19. Bakator M, Vukoja M, Manestar D. Achieving competitiveness with marketing 5.0 in new business conditions. UTMS J Econ. 2023;14(1):63–73.

  20. Barik K, Misra S, Ray AK, Bokolo A. LSTM-DGWO-based sentiment analysis framework for analyzing online customer reviews. Comput Intell Neurosci. 2023;2023(February):6348831. https://doi.org/10.1155/2023/6348831.

  21. Baroiu AC, Stefan TM. Comparison of deep learning models for automatic detection of sarcasm context on the MUStARD dataset. Electronics. 2023;12(3):666. https://doi.org/10.3390/electronics12030666.

  22. Bashir R, Mehboob I, Bhatti WK. Effects of online shopping trends on consumer-buying behaviour: an empirical study of Pakistan. J Manag Res. 2015;2(2):1–24. https://doi.org/10.29145/jmr/22/0202001.

  23. Bełch P, Hajduk-Stelmachowicz M, Chudy-Laskowska K, Vozňáková I, Gavurová B. Factors determining the choice of pro-ecological products among generation Z. Sustainability. 2024;16(4):1560. https://doi.org/10.3390/su16041560.

  24. Benavides-Astudillo E, Fuertes W, Sanchez-Gordon S, Nuñez-Agurto D, Rodríguez-Galán G. A phishing-attack-detection model using natural language processing and deep learning. Appl Sci. 2023;13(9):5275. https://doi.org/10.3390/app13095275.

  25. Bintara R, Yadiati W, Zarkasyi MW, Tanzil ND. Management of green competitive advantage: a systematic literature review and research Agenda. Economies. 2023;11(2):66. https://doi.org/10.3390/economies11020066.

  26. Boehringer AS, Sanaat A, Arabi H, Zaidi H. An active learning approach to train a deep learning algorithm for tumor segmentation from brain MR images. Insights Imagin. 2023;14(1):141. https://doi.org/10.1186/s13244-023-01487-6.

  27. Trebicka B, Tartaraj A, Harizi A. Analyzing the relationship between pricing strategy and customer retention in hotels: a study in Albania. F1000Research. 2023. https://doi.org/10.12688/f1000research.132723.1.

  28. Busalim AH, Hussin ARC. Understanding social commerce: a systematic literature review and directions for further research. Int J Inf Manag. 2016;36(6 Part A):1075–88. https://doi.org/10.1016/j.ijinfomgt.2016.06.005.

  29. Bushara MA, Abdou AH, Hassan TH, Abu EE, Sobaih AS, Albohnayh M, Alshammari WG, Aldoreeb M, Elsaed AA, Elsaied MA. Power of social media marketing: how perceived value mediates the impact on restaurant followers’ purchase intention, willingness to pay a premium price, and E-WoM? Sustainability. 2023;15(6):5331. https://doi.org/10.3390/su15065331.

  30. Butros A, Taylor S. Managing information: evaluating and selecting citation management software, a look at EndNote, RefWorks, Mendeley and Zotero. 2011. https://www.researchgate.net/publication/268428881_Managing_information_evaluating_and_selecting_citation_management_software_a_look_at_EndNote_RefWorks_Mendeley_and_Zotero. Accessed 15 Sept 2023.

  31. Candan SS, Bayram SS. Metaphors perception in personal sales concept: evaluation with logistic regression. Bus Manag Stud Int J. 2023;11(1):208–25. https://doi.org/10.15295/bmij.v11i1.2204.

  32. Chan J-L, Bea KT, Leow SMH, Phoong SW, Cheng WK. State of the art: a review of sentiment analysis based on sequential transfer learning. Artif Intell Rev. 2023;56(1):749–80. https://doi.org/10.1007/s10462-022-10183-8.

  33. Chen SS, Pai TW, Sun CY. Applying the diamond model of intrusion analysis with generative pre-trained transformer 3. In: 2023 International conference on consumer electronics - Taiwan (ICCE-Taiwan). 2023. p. 289–90. https://doi.org/10.1109/ICCE-Taiwan58799.2023.10226923.

  34. Cheng X, Chaw JK, Goh KM, Ting TT, Sahrani S, Ahmad MN, Kadir RA, Ang MC. Systematic literature review on visual analytics of predictive maintenance in the manufacturing industry. Sensors. 2022;22(17):6321. https://doi.org/10.3390/s22176321.

  35. Yang C, Fa-you A, Yu-Feng W, Yan SQ, Zhu CB, Zhang H. Impact of parameter tuning with genetic algorithm, particle swarm optimization, and bat algorithm on accuracy of the SVM Model in landslide susceptibility evaluation. Math Probl Eng. 2023. https://doi.org/10.1155/2023/1393142.

  36. Cui J, Bai L, Li G, Lin Z, Zeng P. Semi-2DCAE: a semi-supervision 2D-CNN AutoEncoder model for feature representation and classification of encrypted traffic. PeerJ Comput Sci. 2023. https://doi.org/10.7717/peerj-cs.1635.

  37. Ding Y, Lei X, Liao Bo, Fang-Xiang Wu. Biomarker identification via a factorization machine-based neural network with binary pairwise encoding. IEEE/ACM Trans Comput Biol Bioinf. 2023;20(3):2136–46. https://doi.org/10.1109/TCBB.2023.3235299.

  38. Do T-N, Lenca P, Lallich S. Classifying many-class high-dimensional fingerprint datasets using random forest of oblique decision trees: [Doc 24]. Vietnam J Comput Sci. 2014;2(1):3–12. https://doi.org/10.1007/s40595-014-0024-7.

  39. Dong W, Huang Y, Lehane B, Ma G. XGBoost algorithm-based prediction of concrete electrical resistivity for structural health monitoring. Autom Constr. 2020;114(June):103155. https://doi.org/10.1016/j.autcon.2020.103155.

  40. Ebrahimi P, Khajeheian D, Soleimani M, Gholampour A, Fekete-Farkas M. User engagement in social network platforms: what key strategic factors determine online consumer purchase behaviour? Ekonomska Istrazivanja: Znanstveno-Strucni Casopis. 2023. https://doi.org/10.1080/1331677X.2022.2106264.

  41. Edara DC, Vanukuri LP, Sistla V, Kolli VKK. Sentiment analysis and text categorization of cancer medical records with LSTM. J Ambient Intell Humaniz Comput. 2023;14(5):5309–25. https://doi.org/10.1007/s12652-019-01399-8.

  42. Faiz T, Aldmour R, Ahmed G, Alshurideh M, Paramaiah C. Machine learning price prediction during and before COVID-19 and consumer buying behavior. In: Alshurideh M, Al Kurdi BH, Masadeh R, Alzoubi HM, Salloum S, editors. The effect of information technology on business and marketing intelligence systems. Studies in Computational Intelligence. Cham: Springer International Publishing; 2023. p. 1845–67. https://doi.org/10.1007/978-3-031-12382-5_101.

  43. Fang Y, Wang W, Pengcheng Wu, Zhao Y. A sentiment-enhanced hybrid model for crude oil price forecasting. Expert Syst Appl. 2023;215(April):119329. https://doi.org/10.1016/j.eswa.2022.119329.

  44. Farooq U, Ademola M, Shaalan A. Comparative analysis of machine learning models for predictive maintenance of ball bearing systems. Electronics. 2024;13(2):438. https://doi.org/10.3390/electronics13020438.

  45. Faruk M, Rahman M, Hasan S. How digital marketing evolved over time: a bibliometric analysis on scopus database. Heliyon. 2021;7(12): e08603. https://doi.org/10.1016/j.heliyon.2021.e08603.

  46. Feng Z, Mamun AA, Masukujjaman M, Yang Q. Modeling the significance of advertising values on online impulse buying behavior. Humanit Soc Sci Commun. 2023;10(1):728. https://doi.org/10.1057/s41599-023-02231-7.

  47. Ferraz RM, Pereira C, da Veiga C, Pereira R, da Veiga T, Furquim SG, da Vieira Silva W. After-sales attributes in e-commerce: a systematic literature review and future research Agenda. J Theor Appl Electron Commer Res. 2023;18(1):475. https://doi.org/10.3390/jtaer18010025.

  48. Frandsen TF, Eriksen MB. Supplementary strategies identified additional eligible studies in qualitative systematic reviews. J Clin Epidemiol. 2023;159(July):85–91. https://doi.org/10.1016/j.jclinepi.2023.04.017.

  49. Frost AD, Hróbjartsson A, Nejstgaard CH. Adherence to the PRISMA-P 2015 reporting guideline was inadequate in systematic review protocols. J Clin Epidemiol. 2022;150(October):179–87. https://doi.org/10.1016/j.jclinepi.2022.07.002.

  50. Gao S, Meng W. Cloud-based services and customer satisfaction in the small and medium-sized businesses (SMBs). Kybernetes. 2022;51(6):1991–2007. https://doi.org/10.1108/K-05-2021-0376.

  51. James G, Witten D, Hastie T, Tibshirani R. An Introduction to Statistical Learning. Berlin: Springer; 2013. https://doi.org/10.1007/978-1-4614-7138-7.

  52. Google. Reducing loss: gradient descent | machine learning. Google for developers. 2022. https://developers.google.com/machine-learning/crash-course/reducing-loss/gradient-descent. Accessed 15 Sept 2023.

  53. Majumder MG, Gupta SD, Paul J. Perceived usefulness of online customer reviews: a review mining approach using machine learning & exploratory data analysis. J Bus Res. 2022;150(November):147–64. https://doi.org/10.1016/j.jbusres.2022.06.012.

  54. Liu G, Nguyenm T, Zhao G, Zha W, Yang J, Cao J, Wu M, Zhao P. Repeat buyer prediction for e-commerce. In: KDD ’16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016. p. 155–64. https://doi.org/10.1145/2939672.2939674.

  55. Gujrati R, Gulati U, Uygun H. Digital transformation has changed consumer behvoiur from traditional market to digital market. Acad Market Stud J. 2023;27(S2):1–6.

  56. Hajek P, Sahut J-M. Mining behavioural and sentiment-dependent linguistic patterns from restaurant reviews for fake review detection. Technol Forecast Soc Chang. 2022;177(April):1. https://doi.org/10.1016/j.techfore.2022.121532.

  57. Hamadani A, Ganai NA, Bashir J. Artificial neural networks for data mining in animal sciences. Bulle Natl Res Cent. 2023;47(1):68. https://doi.org/10.1186/s42269-023-01042-9.

  58. Hassler AP, Menasalvas E, García-García FJ, Rodríguez-Mañas L, Holzinger A. Importance of medical data preprocessing in predictive modeling and risk factor discovery for the frailty syndrome. BMC Med Inf Decis Mak. 2019. https://doi.org/10.1186/s12911-019-0747-6.

  59. Hayati N, Jaelani E. Analysis of digital marketing quality before and during the Covid-19 pandemic on frozen food consumers in West Java Region. Calit Acces La Success. 2024;25(198):149–59. https://doi.org/10.47750/QAS/25.198.16.

  60. He Y, Chen M, He Y, Zhining Qu, He F, Feihong Yu, Liao J, Wang Z. Sarcasm detection base on adaptive incongruity extraction network and incongruity cross-attention. Appl Sci. 2023;13(4):2102. https://doi.org/10.3390/app13042102.

  61. Hicham N, Nassera H, Karim S. A thorough analysis of e-commerce customer reviews in arabic language using deep learning techniques for successful marketing decisions. IAENG Int J Appl Math. 2023;53(4):1–8.

  62. Hodgson EL, Souaiby M, Troldborg N, Porté-Agel F, Andersen SJ. Cross-code verification of non-neutral ABL and single wind turbine wake modelling in LES. J Phys: Conf Ser. 2023;2505(1):012009. https://doi.org/10.1088/1742-6596/2505/1/012009.

  63. Shamim HM, Rahman MF, Uddin MK, Hossain MK. Customer sentiment analysis and prediction of halal restaurants using machine learning approaches. J Islam Market. 2023;14(7):1859–89. https://doi.org/10.1108/JIMA-04-2021-0125.

  64. Hu J, Szymczak S. A review on longitudinal data analysis with random forest. Brief Bioinform. 2023;24(2):002. https://doi.org/10.1093/bib/bbad002.

  65. Igual C, Castillo A, Igual J. An interactive training model for myoelectric regression control based on human-machine cooperative performance. Computers. 2024;13(1):29. https://doi.org/10.3390/computers13010029.

  66. Jadhav GG, Gaikwad SV, Bapat D. A systematic literature review: digital marketing and its impact on SMEs. J Ind Bus Res. 2023;15(1):76–91. https://doi.org/10.1108/JIBR-05-2022-0129.

  67. Jain PK, Pamula R, Srivastava G. A systematic literature review on machine learning applications for consumer sentiment analysis using online reviews. Comput Sci Rev. 2021;41(August):100413. https://doi.org/10.1016/j.cosrev.2021.100413.

  68. Jia Y, Feng H, Wang X, Alvarado M. “Customer reviews or vlogger reviews?” The impact of cross-platform ugc on the sales of experiential products on E-commerce platforms. J Theor Appl Electron Commer Res. 2023;18(3):1257. https://doi.org/10.3390/jtaer18030064.

  69. Jiang H, Sabetzadeh F, Chan KY. Developing nonlinear customer preferences models for product design using opining mining and multiobjective PSO-based ANFIS approach. Comput Intell Neurosci CIN. 2023. https://doi.org/10.1155/2023/6880172.

  70. Jlifi B, Abidi C, Duvallet C. Beyond the use of a novel ensemble based random forest-BERT model (Ens-RF-BERT) for the sentiment analysis of the hashtag COVID19 tweets. Soc Netw Anal Min. 2024;14(1):88. https://doi.org/10.1007/s13278-024-01240-x.

  71. Kakalejčík L, Bucko J, Vejačka M. Differences in buyer journey between high- and low-value customers of e-commerce business. J Theor Appl Electron Commer Res. 2019;14(2):47–58. https://doi.org/10.4067/S0718-18762019000200105.

  72. Kalita K, Burande D, Ghadai RK, Chakraborty S. Finite element modelling, predictive modelling and optimization of metal inert gas, tungsten inert gas and friction stir welding processes: a comprehensive review. Arch Comput Methods Eng. 2023;30(1):271–99. https://doi.org/10.1007/s11831-022-09797-6.

  73. Kamakura WA. Cross-selling. Relationsh Market. 2008;6(3–4):41–58. https://doi.org/10.1300/J366v06n03_03.

  74. Kapoor R, Kapoor K. The transition from traditional to digital marketing: a study of the evolution of e-marketing in the indian hotel industry. Worldw Hosp Tour Themes. 2021;13(2):199–213. https://doi.org/10.1108/WHATT-10-2020-0124.

  75. Kelley L. The 3 primary stages of the buyer’s journey. ImageSource. 2014;16(12):14.

  76. Kepes S, McDaniel MA, Brannick MT, Banks GC. Meta-analytic reviews in the organizational sciences: two meta-analytic schools on the way to MARS (the meta-analytic reporting standards). J Bus Psychol. 2013;28(2):123–43. https://doi.org/10.1007/s10869-013-9300-2.

  77. Muzahid KM, Bashar I, Minhaj GM, Wasi AI, Hossain NUI. Resilient and sustainable supplier selection: an integration of SCOR 4.0 and machine learning approach. Sustain Resil Infrastruct. 2023;8(5):453–69. https://doi.org/10.1080/23789689.2023.2165782.

  78. Khan MA, Vivek SM, Minhaj MA, Saifi SA, Hasan A. Impact of store design and atmosphere on shoppers’ purchase decisions: an empirical study with special reference to Delhi-NCR. Sustainability. 2023;15(1):95. https://doi.org/10.3390/su15010095.

  79. Khan S, Rashid A, Rasheed R, Amirah NA. Designing a knowledge-based system (KBS) to study consumer purchase intention: the impact of digital influencers in Pakistan. Kybernetes. 2022;52(5):1720–44. https://doi.org/10.1108/K-06-2021-0497.

  80. Khanna P, Maheshwari S. Development of mathematical models for prediction and control of weld bead dimensions in MIG welding of stainless steel 409M. In: Materials today: proceedings, 7th international conference of materials processing and characterization, March 17–19, 2017. 2018;5(2, Part 1):4475–88. https://doi.org/10.1016/j.matpr.2017.12.017.

  81. Khondakar MFK, Sarowar MH, Chowdhury MH, Majumder S, Hossain MA, Dewan MAA, Hossain QD. A systematic review on EEG-based neuromarketing: recent trends and analyzing techniques. Brain Inf. 2024;11(1):17. https://doi.org/10.1186/s40708-024-00229-8.

  82. Kim HJ, Jayakumar Venkat S, Chang HW, Cho YH, Lee JY, Koo K. A two-step approach to overcoming data imbalance in the development of an electrocardiography data quality assessment algorithm: a real-world data challenge. Biomimetics. 2023;8(1):119. https://doi.org/10.3390/biomimetics8010119.

  83. Kim J, Hui-Sang K, Sun-Yong C. Forecasting the S&P 500 Index using mathematical-based sentiment analysis and deep learning models: a FinBERT transformer model and LSTM. Axioms. 2023;12(9):835. https://doi.org/10.3390/axioms12090835.

  84. Kitchenham B. Procedures for performing systematic reviews. Keele: Keele Univ; 2004.

  85. Kjell O, Giorgi S, Andrew Schwartz H. The text-package: an R-package for analyzing and visualizing human language using natural language processing and transformers. Psychol Methods. 2023. https://doi.org/10.1037/met0000542.

  86. Ko S-H, Hsieh M-C, Huang R-F. Human error analysis and modeling of medication-related adverse events in Taiwan using the human factors analysis and classification system and logistic regression. Healthcare. 2023;11(14):2063. https://doi.org/10.3390/healthcare11142063.

  87. Kukkar A, Mohana R, Sharma A, Nayyar A, Shah MA. Improving sentiment analysis in social media by handling lengthened words. IEEE Access. 2023;11:9775–88. https://doi.org/10.1109/ACCESS.2023.3238366.

  88. Kumar S, Singh P, Srivastava G, Singh S. Intelligent movie recommender framework based on content-based & collaborative filtering assisted with sentiment analysis. Int J Adv Res Comput Sci. 2023;14(3):108–13. https://doi.org/10.26483/ijarcs.v14i3.6979.

  89. Kyaw KS, Tepsongkroh P, Thongkamkaew C, Sasha F. Business intelligent framework using sentiment analysis for smart digital marketing in the E-commerce era. Asia Soc Issues. 2023;16(3):e252965–e252965. https://doi.org/10.48048/asi.2023.252965.

  90. Liang W, Luo S, Zhao G, Hao Wu. Predicting hard rock pillar stability using GBDT, XGBoost, and LightGBM algorithms. Mathematics. 2020;8(5):765. https://doi.org/10.3390/math8050765.

  91. Lim CV, Yu-Peng Z, Omar M, Han-Woo P. Decoding the relationship of artificial intelligence, advertising, and generative models. Digital. 2024;4(1):244. https://doi.org/10.3390/digital4010013.

  92. Liu D, Wang Y, Luo C, Ma J. An improved autoencoder for recommendation to alleviate the vanishing gradient problem. Knowl-Based Syst. 2023;263(March):110254. https://doi.org/10.1016/j.knosys.2023.110254.

  93. Liu M, Ying Q. The role of online news sentiment in carbon price prediction of china’s carbon markets. Environ Sci Pollut Res. 2023;30(14):41379–87. https://doi.org/10.1007/s11356-023-25197-0.

  94. Long Y, Huang L, Li Y, Quan W, Yoshida Y. Enlarged carbon footprint inequality considering household time use pattern. Environ Res Lett. 2024;19(4):044013. https://doi.org/10.1088/1748-9326/ad2d85.

  95. Ma J, Dhiman P, Qi C, Bullock G, van Smeden M, Riley RD, Collins GS. Poor handling of continuous predictors in clinical prediction models using logistic regression: a systematic review. J Clin Epidemiol. 2023;161(September):140–51. https://doi.org/10.1016/j.jclinepi.2023.07.017.

  96. Malodia S, Ferraris A, Sakashita M, Dhir A, Gavurova B. Can alexa serve customers better? AI-Driven voice assistant service interactions. J Serv Mark. 2022;37(1):25–39. https://doi.org/10.1108/JSM-12-2021-0488.

  97. Manikandan B, Rama P, Chakaravarthi S. A new fuzzy lexicon expansion and sentiment aware recommendation system in E-commerce. Int J Adv Comput Sci Appl. 2023. https://doi.org/10.14569/IJACSA.2023.0140629.

  98. Marcos AM, de Figueiredo B, de Coelho AFM. Service quality, customer satisfaction and customer value: holistic determinants of loyalty and word-of-mouth in services. TQM J. 2021;34(5):957–78. https://doi.org/10.1108/TQM-10-2020-0236.

  99. Mehmood S, Ahmad I, Khan F, Khan A. Sentiment analysis in social media for competitive environment using content analysis. Comput Mater Contin. 2022. https://doi.org/10.32604/cmc.2022.023785.

  100. Memon ZA, Munawar N, Kamal M. App store mining for feature extraction: analyzing user reviews. Acta Sci Technol. 2024. https://doi.org/10.4025/actascitechnol.v46i1.62867.

  101. Mgiba FM, Koopman A. The impact of motivation, attitude, quality, availability, and advertisement on the purchase intention for fashion clothing. Afr J Bus Econ Res. 2023;18(2):153–80. https://doi.org/10.31920/1750-4562/2023/v18n2a8.

  102. Mika B, Winczewski D. The work-on-demand platform as a part of monopoly capital: the example of a global ride-hailing company. Polish Sociol Rev. 2024;225:31–48. https://doi.org/10.26412/psr225.02.

  103. Mirfakhraei S, Abdolvand N, Rajaei S, Harandi. The RFMRv model for customer segmentation based on the referral value. Iran J Manag Stud. 2024;17(2):455–73. https://doi.org/10.22059/ijms.2023.329229.674722.

  104. Mohammed A, Kora R. A comprehensive review on ensemble deep learning: opportunities and challenges. J King Saud Univ Comput Inf Sci. 2023;35(2):757–74. https://doi.org/10.1016/j.jksuci.2023.01.014.

  105. Moher D, Shamseer L, Clarke M, Ghersi D, Liberati A, Petticrew M, Shekelle P, Stewart LA, PRISMA-P Group. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Syst Rev. 2015;4(1):1. https://doi.org/10.1186/2046-4053-4-1.

  106. Mushtaq K, Zou R, Waris A, Yang K, Wang Ji, Iqbal J, Jameel M. Multivariate wind power curve modeling using multivariate adaptive regression splines and regression trees. PLoS ONE. 2023;18(8): e0290316. https://doi.org/10.1371/journal.pone.0290316.

  107. Mydyti H, Kadriu A, Bach MP. Using data mining to improve decision-making: case study of a recommendation system development. Organizacija. 2023;56(2):138–54. https://doi.org/10.2478/orga-2023-0010.

  108. Nagam VM. Internet use, users, and cognition: on the cognitive relationships between internet-based technology and internet users. BMC Psychol. 2023;11:1–9. https://doi.org/10.1186/s40359-023-01041-5.

  109. Natras R, Soja B, Schmidt M. Ensemble machine learning of random forest, AdaBoost and XGBoost for vertical total electron content forecasting. Remote Sens. 2022;14(15):3547. https://doi.org/10.3390/rs14153547.

  110. Nguyen MS. The influence of social media marketing on brand loyalty and intention to use among young Vietnamese consumers of digital banking. Innov Market. 2023;19(4):1–13. https://doi.org/10.21511/im.19(4).2023.01.

  111. Chen N. Research on E-commerce database marketing based on machine learning algorithm. Comput Intell Neurosci CIN. 2022. https://doi.org/10.1155/2022/7973446.

  112. O’Croinin C, Guerra AG, Doschak MR, Löbenberg R, Davies NM. Therapeutic potential and predictive pharmaceutical modeling of stilbenes in Cannabis sativa. Pharmaceutics. 2023;15(7):1941. https://doi.org/10.3390/pharmaceutics15071941.

  113. Oe H, Yamaoka Y, Ochiai H. Personal and emotional values embedded in Thai-consumers’ perceptions: key factors for the sustainability of traditional confectionery businesses. Sustainability. 2023;15(2):1548. https://doi.org/10.3390/su15021548.

  114. Ounacer S, Mhamdi D, Ardchir S, Daif A, Azzouazi M. Customer sentiment analysis in hotel reviews through natural language processing techniques. Int J Adv Comput Sci Appl. 2023. https://doi.org/10.14569/IJACSA.2023.0140162.

  115. Paulo R, Vong C, Pinheiro F, Mimoso J. A sentiment analysis of Michelin-starred restaurants. Eur J Manag Bus Econ. 2023;32(3):276–95. https://doi.org/10.1108/EJMBE-11-2021-0295.

  116. Petkovic J, Welch V, Tugwell P. PROTOCOL: do evidence summaries increase health policy-makers’ use of evidence from systematic reviews? A systematic review protocol. Campbell Syst Rev. 2017;13(1):1–18. https://doi.org/10.1002/CL2.178.

  117. Ping Y, Buoye A, Vakil A. Enhanced review facilitation service for C2C support: machine learning approaches. J Serv Mark. 2023;37(5):620–35. https://doi.org/10.1108/JSM-01-2022-0005.

  118. Pop R-A, Hlédik E, Dabija D-C. Predicting consumers’ purchase intention through fast fashion mobile apps: the mediating role of attitude and the moderating role of COVID-19. Technol Forecast Soc Chang. 2023;186(January):122111. https://doi.org/10.1016/j.techfore.2022.122111.

  119. Prasad GB, Keerthi MV, ChandanaAnjali O, Revathi. Sentiment analysis of customer product reviews using machine learning. Turk J Comput Math Educ. 2023;14(3):178–88.

  120. Punetha N, Jain G. Aspect and orientation-based sentiment analysis of customer feedback using mathematical optimization models. Knowl Inf Syst. 2023;65(6):2731–60. https://doi.org/10.1007/s10115-023-01848-z.

  121. Rahman NA, Idrus SD, Adam NL. Classification of customer feedbacks using sentiment analysis towards mobile banking applications. IAES Int J Artif Intell. 2022;11(4):1579–87. https://doi.org/10.11591/ijai.v11.i4.pp1579-1587.

  122. Rahmani E, Khatami M, Stephens E. Using probabilistic machine learning methods to improve beef cattle price modeling and promote beef production efficiency and sustainability in Canada. Sustainability. 2024;16(5):1789. https://doi.org/10.3390/su16051789.

  123. Rajasa MC, Rahma F, Rachmadi RF, Pratomo BA, Purnomo MH. A review of imbalanced datasets and resampling techniques in network intrusion detection system. In: 2023 8th International conference on information technology and digital applications (ICITDA). 2023. pp. 1–6. https://doi.org/10.1109/ICITDA60835.2023.10427217.

  124. Ramos AP, Tanes RLV, Esplanada DE. Sentiment analysis in service quality of Eugene’s Villa of Baler based on Airbnb reviews. Quantum J Soc Sci Humanit. 2022;3(6):153–67. https://doi.org/10.55197/qjssh.v3i6.201.

  125. Rapa M, Ciano S, Orsini F, Tullo MG, Giannetti V, Mariani MB. Adoption of AI-based technologies in the food supplement industry: an Italian start-up case study. Systems. 2023;11(6):265. https://doi.org/10.3390/systems11060265.

  126. Razali NA, Mat NA, Malizan NA, Hasbullah MW, Zainuddin NM, Ishak KK, Ramli S, Sukardi S. Political security threat prediction framework using hybrid lexicon-based approach and machine learning technique. IEEE Access. 2023;11:17151–64. https://doi.org/10.1109/ACCESS.2023.3246162.

  127. Rivas P, Zhao L. Marketing with ChatGPT: navigating the ethical terrain of GPT-based chatbot technology. AI. 2023. https://doi.org/10.3390/ai4020019.

  128. Rubio-Aparicio M, Sanchez-Meca J, Fulgencio M-M, Lopez-Lopez JA. MARS (meta-analysis reporting standards). Anales de Psicol. 2018;34:412–20. https://doi.org/10.6018/analesps.34.2.320131.

  129. Sakalauskas V, Kriksciuniene D. Personalized advertising in E-commerce: using clickstream data to target high-value customers. Algorithms. 2024;17(1):27. https://doi.org/10.3390/a17010027.

  130. Salim SS, Ghanshyam AN, Ashok DM, Mazahir DB, Thakare BS. Deep LSTM-RNN with word embedding for sarcasm detection on Twitter. In: 2020 International conference for emerging technology (INCET). 2020. pp. 1–4. https://doi.org/10.1109/INCET49848.2020.9154162.

  131. Santoni MM, Basaruddin T, Junus K. Convolutional neural network model based students’ engagement detection in imbalanced DAiSEE dataset. Int J Adv Comput Sci Appl. 2023. https://doi.org/10.14569/IJACSA.2023.0140371.

  132. Sarioğlu Cİ. The effect of customer perceptions concerning online shopping, viral marketing and customer loyalty on purchasing behaviour. Int J Manag Econ Bus. 2023;19(2):348–70. https://doi.org/10.17130/ijmeb.1210803.

  133. Sha Z, Cui Y, Xiao Y, Stathopoulos A, Contractor N, Fu Y, Chen W. A network-based discrete choice model for decision-based design. Design Sci. 2023. https://doi.org/10.1017/dsj.2023.4.

  134. Shah A, Kothari K, Thakkar U, Khara S. User review classification and star rating prediction by sentimental analysis and machine learning classifiers. In: Tuba M, Akashe S, Joshi A, editors. Information and communication technology for sustainable development. Advances in Intelligent Systems and Computing. Singapore: Springer; 2020. p. 279–88. https://doi.org/10.1007/978-981-13-7166-0_27.

  135. Shanmugavel AB, Ellappan V, Mahendran A, Subramanian M, Lakshmanan R, Mazzara M. A novel ensemble based reduced overfitting model with convolutional neural network for traffic sign recognition system. Electronics. 2023;12(4):926. https://doi.org/10.3390/electronics12040926.

  136. Sherbaz A, Konak BMK, Pezeshkpour P, Di Ventura B, Rapp BE. Deterministic lateral displacement microfluidic chip for minicell purification. Micromachines. 2022;13(3):365. https://doi.org/10.3390/mi13030365.

  137. Shini G, Srividhya V. Implicit aspect based sentiment analysis for restaurant review using LDA topic modeling and ensemble approach. Int J Adv Technol Eng Explor. 2023;10(102):554–68. https://doi.org/10.19101/IJATEE.2022.10100099.

  138. Singh G, Slack NJ, Sharma S, Aiyub AS, Ferraris A. Antecedents and consequences of fast-food restaurant customers’ perception of price fairness. Br Food J. 2022;124(8):2591–609. https://doi.org/10.1108/BFJ-03-2021-0286.

  139. Singh R, Singh R. Applications of sentiment analysis and machine learning techniques in disease outbreak prediction—a review. Mater Today Proc, Int Virtual Conf Sustain Mater. 2023;81(January):1006–11. https://doi.org/10.1016/j.matpr.2021.04.356.

  140. Singh U, Saraswat A, Azad HK, Abhishek K, Shitharth S. Towards improving e-commerce customer review analysis for sentiment detection. Sci Rep. 2022;12(1):21983. https://doi.org/10.1038/s41598-022-26432-3.

  141. Skinner D, Blake J. Modelling consumers’ choice of novel food. PLoS ONE. 2023;18(8):e0290169. https://doi.org/10.1371/journal.pone.0290169.

  142. Skubleny D, Ghosh S, Spratlin J, Schiller DE, Rayat GR. Feature-specific quantile normalization and feature-specific mean-variance normalization deliver robust bi-directional classification and feature selection performance between microarray and RNAseq data. BMC Bioinform. 2024;25:1–14. https://doi.org/10.1186/s12859-024-05759-w.

  143. Sudirjo F, Ratnawati R, Hadiyati R, Sutaguna INT, Yusuf M. The influence of online customer reviews and E-service quality on buying decisions in electronic commerce. J Manag Creat Bus. 2023;1(2):156–81. https://doi.org/10.30640/jmcbus.v1i2.941.

  144. SunLuo HYE, Liu F, Lowe B. The advertisement puts me down, but I like it: examining an emerging type of audience-targeted negative advertisement. J Advert Res. 2023;63(2):160. https://doi.org/10.2501/JAR-2023-010.

  145. Susnjak T. Applying BERT and ChatGPT for sentiment analysis of Lyme disease in scientific literature. arXiv. 2023. https://doi.org/10.48550/arXiv.2302.06474.

  146. Suyanto A, Femi SR. Analysis of the effect of impulsive purchase and service quality on customer satisfaction and loyalty in beauty E-commerce. Calit Acces La Success. 2023;24(194):18–28. https://doi.org/10.47750/QAS/24.194.03.

  147. Taherkhani L, Daneshvar A, Amoozad Khalili H, Sanaei MR. Analysis of the customer churn prediction project in the hotel industry based on text mining and the random forest algorithm. Adv Civil Eng. 2023. https://doi.org/10.1155/2023/6029121.

  148. Alamin TM, Islam MM, Uddin MA, Hasan KF, Sharmin S, Alyami SA, Moni MA. Machine learning-based network intrusion detection for big and imbalanced data using oversampling, stacking feature embedding and feature extraction. J Big Data. 2024;11(1):33. https://doi.org/10.1186/s40537-024-00886-w.

  149. Taralik K, Kozák T, Molnár Z. Channel preferences and attitudes of domestic buyers in purchase decision processes of high-value electronic devices. Entrep Bus Econ Rev. 2023;11(2):121–36. https://doi.org/10.15678/EBER.2023.110206.

  150. Tavares MC, Azevedo G, Marques RP. The challenges and opportunities of era 5.0 for a more humanistic and sustainable society—a literature review. Societies. 2022;12(6):149. https://doi.org/10.3390/soc12060149.

  151. Tay Y, Tuan LA, Hui SC, Su J. Reasoning with sarcasm by reading in-between. arXiv. 2018. https://doi.org/10.48550/arXiv.1805.02856.

  152. Thangeda R, Kumar N, Majhi R. A neural network-based predictive decision model for customer retention in the telecommunication sector. Technol Forecast Soc Chang. 2024;202(May):123250. https://doi.org/10.1016/j.techfore.2024.123250.

  153. Torkzadeh S, Zolfagharian M, Yazdanparast A, Gremler DD. From customer readiness to customer retention: the mediating role of customer psychological and behavioral engagement. Eur J Mark. 2022;56(7):1799–829. https://doi.org/10.1108/EJM-03-2021-0213.

  154. Tuncer I, Unusan C, Cobanoglu C. Service quality, perceived value and customer satisfaction on behavioral intention in restaurants: an integrated structural model. J Qual Assur Hosp Tour. 2021;22(4):447–75. https://doi.org/10.1080/1528008X.2020.1802390.

  155. Ullah A, Khan K, Khan A, Ullah S. Understanding quality of products from customers’ attitude using advanced machine learning methods. Computers. 2023;12(3):49. https://doi.org/10.3390/computers12030049.

  156. Vásquez FGZ, Poveda DAM, Llerena WVL. Big data and its implication in marketing. Rev de Comun de La SEECI. 2023;56:302–19. https://doi.org/10.15198/seeci.2023.56.e832.

  157. Veloso CM, Sousa BB. Drivers of customer behavioral intentions and the relationship with service quality in specific industry contexts. Int Rev Retail, Distrib Consum Res. 2022;32(1):43–58. https://doi.org/10.1080/09593969.2021.2007977.

  158. Veseli-Kurtishi T, Ruci E. The impact of digital marketing on the development of tourism in Republic of Albania. Eurasian J Soc Sci. 2023;11(1):1–11. https://doi.org/10.15604/ejss.2023.11.01.001.

  159. Wang Lu, Zhang Y, Chignell M, Shan B, Sheehan M, Razak F, Verma A. Boosting delirium identification accuracy with sentiment-based natural language processing: mixed methods study. JMIR Med Inform. 2022;10(12): e38161. https://doi.org/10.2196/38161.

  160. Wang Q, Tingxuan Su, Lau RYK, Xie H. DeepEmotionNet: emotion mining for corporate performance analysis and prediction. Inf Process Manage. 2023;60(3):103151. https://doi.org/10.1016/j.ipm.2022.103151.

  161. Wang S, Ma J. A novel GBDT-BiLSTM hybrid model on improving day-ahead photovoltaic prediction. Sci Rep (Nat Publ Gr). 2023;13(1):15113. https://doi.org/10.1038/s41598-023-42153-7.

  162. Wang S, Li C, Kankan Z, Chen H. Context-aware recommendations with random partition factorization machines. Data Sci Eng. 2017;2(2):125–35. https://doi.org/10.1007/s41019-017-0035-3.

  163. Wang Y, Shi Q, Chang TH. Why batch normalization damage federated learning on non-IID data? arXiv. 2023. https://doi.org/10.48550/arXiv.2301.02982.

  164. Wen N, Liu G, Zhang J, Zhang R, Yating Fu, Han Xu. A fingerprints based molecular property prediction method using the BERT model. J Cheminf. 2022;14(1):71. https://doi.org/10.1186/s13321-022-00650-3.

  165. Wen Z, Lin W, Liu H. Machine-learning-based approach for anonymous online customer purchase intentions using clickstream data. Systems. 2023;11(5):255. https://doi.org/10.3390/systems11050255.

  166. Xiong T, Zhang P, Zhu H, Yang Y. Sarcasm detection with self-matching networks and low-rank bilinear pooling. 2019. pp. 2115–24. https://doi.org/10.1145/3308558.3313735.

  167. Xu B, Tan Y, Sun W, Ma T, Liu H, Wang D. Study on the prediction of the uniaxial compressive strength of rock based on the SSA-XGBoost model. Sustainability. 2023;15(6):5201. https://doi.org/10.3390/su15065201.

  168. Yang L, Zhang He, Shen H, Huang X, Zhou X, Rong G, Shao D. Quality assessment in systematic literature reviews: a software engineering perspective. Inf Softw Technol. 2021;130(February):106397. https://doi.org/10.1016/j.infsof.2020.106397.

  169. Yang Z, Brattin R, Sexton R, Stalnaker JL. Social media usage and customer loyalty: predicting returning customers using artificial neural network. Int J Inf Bus Manag. 2022;14(3):18–28.

  170. Yoon HJ, Huang Y, Yim M-C. Native advertising relevance effects and the moderating role of attitudes toward social networking sites. J Res Interact Mark. 2022;17(2):215–31. https://doi.org/10.1108/JRIM-07-2021-0185.

  171. Yu W, Liang Y, Zhu X. Sentiment analysis of hotel online reviews using the BERT model and ERNIE model—data from China. PLoS ONE. 2023;18(3): e0275382. https://doi.org/10.1371/journal.pone.0275382.

  172. Yue W, Li L. Sentiment analysis using a CNN-BiLSTM deep model based on attention classification. Int Inf Inst Inf. 2023;26(3):117–62. https://doi.org/10.47880/inf2603-02.

  173. Zanoni M, Chiumeo R, Tenti L, Volta M. What else do the deep learning techniques tell us about voltage dips validity? Regional-level assessments with the new QuEEN system based on real network configurations. Energies. 2023;16(3):1189. https://doi.org/10.3390/en16031189.

  174. Zhang C, Fan H, Zhang J, Yang Q, Tang L. Topic discovery and hotspot analysis of sentiment analysis of Chinese text using information-theoretic method. Entropy. 2023;25(6):935. https://doi.org/10.3390/e25060935.

  175. Zhang M, Lu J, Ma N, Cheng TCE, Hua G. A feature engineering and ensemble learning based approach for repeated buyers prediction. Int J Comput Commun Control. 2022. https://doi.org/10.15837/ijccc.2022.6.4988.

  176. Zhang PV, Kim S, Chakravarty A. Influence of pull marketing actions on marketing action effectiveness of multichannel firms: a meta-analysis. J Acad Mark Sci. 2023;51(2):310–33. https://doi.org/10.1007/s11747-022-00877-4.

  177. Zhang R, Chen M. Predicting online shopping intention: the theory of planned behavior and live E-commerce. SHS Web Conf. 2023;155:02008. https://doi.org/10.1051/shsconf/202315502008.

  178. Zhang R, Jun M, Palacios S. M-shopping service quality dimensions and their effects on customer trust and loyalty: an empirical study. Int J Qual Reliab Manag. 2023;40(1):169–91. https://doi.org/10.1108/IJQRM-11-2020-0374.

  179. Zhang Z, Jung C. GBDT-MO: gradient boosted decision trees for multiple outputs. IEEE. 2019. https://doi.org/10.1109/TNNLS.2020.3009776.

  180. Zolfaghari B, Mirsadeghi L, Bibak K, Kavousi K. Cancer prognosis and diagnosis methods based on ensemble learning. ACM Comput Surv. 2023;55(12):262:1-262:34. https://doi.org/10.1145/3580218.

  181. Zou H, Wang Z. A semi-supervised short text sentiment classification method based on improved BERT model from unlabelled data. J Big Data. 2023;10(1):35. https://doi.org/10.1186/s40537-023-00710-x.

Funding

Not applicable.

Author information

Contributions

VG: methodology, conceptualization, implementation, and writing; TI, SHR, and BAS: methodology, writing and editing, and supervision. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Bilal Abu-Salih.

Ethics declarations

Ethics approval and consent to participate

Not Applicable.

Consent for publication

Not Applicable.

Competing interests

The authors declare no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Gooljar, V., Issa, T., Hardin-Ramanan, S. et al. Sentiment-based predictive models for online purchases in the era of marketing 5.0: a systematic review. J Big Data 11, 107 (2024). https://doi.org/10.1186/s40537-024-00947-0

  • DOI: https://doi.org/10.1186/s40537-024-00947-0

Keywords