Context-aware rule learning from smartphone data: survey, challenges and future directions

Sarker, Iqbal H.

doi:10.1186/s40537-019-0258-4

Survey Paper
Open access
Published: 31 October 2019

Journal of Big Data volume 6, Article number: 95 (2019)

Abstract

Smartphones are considered as one of the most essential and highly personal devices of individuals in our current world. Due to the popularity of context-aware technology and recent developments in smartphones, these devices can collect and process raw contextual data about users’ surrounding environment and their corresponding behavioral activities with their phones. Thus, smartphone data analytics and building data-driven context-aware systems have gained wide attention from both academia and industry in recent days. In order to build intelligent context-aware applications on smartphones, effectively learning a set of context-aware rules from smartphone data is the key. This requires advanced data analytical techniques with high precision and intelligent decision making strategies based on contexts. In comparison to traditional approaches, machine learning based techniques provide more effective and efficient results for smartphone data analytics and corresponding context-aware rule learning. Thus, this article first makes a survey on previous work in the area of contextual smartphone data analytics and then presents a discussion of challenges and future directions for effectively learning context-aware rules from smartphone data, in order to build rule-based automated and intelligent systems.

Introduction

Smartphones have recently become an essential part of daily life and are considered highly personal devices. They are also regarded as one of the most important IoT (Internet of Things) devices because of their ability to connect users to the Internet and to process the corresponding data [1]. Smartphones are also considered as “next generation, multifunctional cell phones that facilitates data processing as well as enhanced wireless connectivity” [2]. Cellular network coverage has reached 96.8% of the world population, and 100% of the population in developed countries [3]. According to Google Trends [4], as shown in Fig. 1, users’ interest in “Mobile Phones” has been higher than in other platforms such as “Desktop Computer”, “Laptop Computer” or “Tablet Computer” over the 5 years from 2014 to 2019. In Fig. 1, the x-axis shows the date and the y-axis shows search interest on a scale of 0 to 100, measured relative to the highest point on the chart. For instance, a value of 100 (maximum) on the y-axis represents the peak popularity of a term, while 0 (minimum) represents its lowest popularity [4].

Due to their advanced features and recent developments, smartphones can collect raw contextual data about users’ surrounding environment and their corresponding behavioral activities with their phones on a daily basis [5]. As a result, smartphone data is a rich source for understanding users’ behavioral activity patterns in different contexts and for deriving useful information, i.e., context-aware rules, for the purpose of building rule-based intelligent context-aware systems. A context-aware rule has two parts and follows an “IF-THEN” logical structure [6]. The antecedent part represents the user’s surrounding contextual information, e.g., temporal context, spatial context, social context, or other relevant contextual information, and the consequent part represents the corresponding behavioral activities or usage. Consider an example of a context-aware mobile notification management system for a smartphone user, Alice. A context-aware rule for such a system could be “The user typically dismisses mobile notifications while at work; however, she accepts notifications from her family members in the evening, even though she is at work”. A set of such context-aware behavioral rules, including general rules and specific exceptions, may vary from user to user according to their preferences. In addition to the personalized services mentioned above, context-aware rules could be applicable to other broad application areas, such as context-aware software and IoT services, intelligent eHealth services, context-aware smart city services, and intelligent cybersecurity services, utilizing the relevant contextual data of the particular domain. Overall, this study is intended for data science and machine learning researchers and practitioners who want to work on data-driven intelligent context-aware systems and services based on machine learning rules.
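The IF-THEN structure described above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Rule` class, the `decide` helper, the context keys (`location`, `time`, `sender`) and the "most specific rule wins" policy for handling exceptions are assumptions made for this example, not a representation prescribed by the surveyed work.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    antecedent: dict  # context conditions, e.g. {"location": "work"}
    consequent: str   # behavioral activity, e.g. "dismiss"

    def matches(self, context: dict) -> bool:
        # The rule fires when every antecedent condition holds in the context.
        return all(context.get(k) == v for k, v in self.antecedent.items())


def decide(rules, context, default="no_action"):
    # Assumed policy: the most specific matching rule wins, so that an
    # exception overrides the general rule it refines.
    matching = [r for r in rules if r.matches(context)]
    if not matching:
        return default
    return max(matching, key=lambda r: len(r.antecedent)).consequent


# Alice's notification rules from the example in the text.
rules = [
    Rule({"location": "work"}, "dismiss"),
    Rule({"location": "work", "time": "evening", "sender": "family"}, "accept"),
]

at_work = {"location": "work", "time": "morning", "sender": "colleague"}
family_evening = {"location": "work", "time": "evening", "sender": "family"}
print(decide(rules, at_work))         # general rule applies
print(decide(rules, family_evening))  # specific exception applies
```

Ordering rules by antecedent size is one simple way to let exceptions take precedence; a learned rule set could equally rank rules by confidence or recency.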

Effectively learning context-aware rules from smartphone data is challenging for many reasons, ranging from understanding raw data to building applications. A number of studies [7,8,9] have mined context-aware rules from smartphone data for various purposes. However, to effectively learn such rules for the purpose of building intelligent context-aware systems, a deeper analysis of contextual data patterns and learning according to individuals’ usage is needed. Thus, advanced data analysis based on machine learning techniques can be used to provide effective and efficient decision-making capabilities in different context-aware test cases for smartphones. Several machine learning and data mining techniques, such as contextual data clustering, feature optimization and selection, rule-based classification and association analysis, incremental learning for dynamic updating and management, and corresponding rule-based prediction models, can be designed to provide smartphone data analytic solutions. The reason is that such machine learning techniques can be more accurate and more precise when analyzing huge amounts of contextual data. The aim of these advanced analytic techniques is to discover information, hidden patterns, and unknown correlations among the contexts and eventually to generate context-aware rules. For instance, a detailed analysis of time-series data and corresponding data clustering based on similar behavioral patterns could capture the diverse behaviors of an individual’s activities, thereby enabling better time-based context-aware rules than the traditional approaches [10]. Thus, intelligent data-driven decisions using machine learning techniques can provide better decision-making capability than the traditional approaches while considering multi-dimensional contexts.

Based on our survey and analysis of existing research, little work has been done on how machine learning techniques impact contextual smartphone data analysis and the learning of corresponding context-aware rules. To address this shortcoming, this article first surveys previous work in the area of contextual smartphone data analytics from several perspectives involved in context-aware rules, such as time-series modeling (also known as discretization of temporal context), rule discovery techniques, and incremental learning and rule updating techniques, which have been highlighted in our earlier work [6]. The article then presents a brief discussion of challenges and future directions to overcome these issues. Based on this discussion, we finally suggest a machine learning based context-aware rule learning framework for effectively learning context-aware rules from smartphone data, in order to build rule-based automated and intelligent systems.

The contributions of this paper are summarized as follows.

We first make a brief survey of previous work in the area of smartphone data analytics from several perspectives related to context-aware rule learning and summarize the shortcomings of this research.
We then present a brief discussion on the challenges and future directions to overcome the issues to learn context-aware rules from smartphone data.
Finally, we suggest a machine learning based context-aware rule learning framework and briefly discuss the role of various layers associated with the framework, for the purpose of building rule-based intelligent context-aware systems.

To the best of our knowledge, this is the first article surveying context-aware rule learning strategies from smartphone data. The remainder of the paper is organized as follows. “Background: contexts and smartphone data” section presents background information on contexts and contextual smartphone data. “Context-aware rule learning strategies” section surveys previous work from various perspectives related to context-aware rule learning. “Challenges and future directions” section briefly discusses the challenges and future directions of research regarding context-aware rule learning from smartphone data. In “Suggested machine learning based framework” section we suggest a machine learning based context-aware rule learning framework and discuss its layers and their roles in learning rules. “Context-aware rule based applications” section summarizes a number of real-world applications based on context-aware rules. Finally, “Conclusion” section concludes this paper.

Background: contexts and smartphone data

This section reviews background information on the main characteristics of contexts and contextual smartphone data that address learning context-aware rules for the purpose of building rule-based intelligent systems.

Characteristics of contexts

The term context can be used with a variety of different meanings in different purposes. The notion of context has been used in numerous areas, including Pervasive and Ubiquitous Computing, Human Computer Interaction, Computer-Supported Collaborative Work, and Ambient Intelligence [11]. In this section, first we briefly review what is context in the area of mobile and context-aware computing. In Ubiquitous and Pervasive Computing area, early works on context-awareness referred to context as primarily the location of people and objects [12]. In recent works, context has been extended to include a broader collection of factors, such as physical and social aspects of an entity, as well as the activities of users [11]. Having examined the definitions and categories of context given by the pervasive and ubiquitous computing community, this section seeks to define our view of context within the scope of smartphone data analytics. As the definitions of context to pervasive and ubiquitous computing area are also broad, this discussion is intended to be illustrative rather than exhaustive.

Several studies have attempted to define and represent context from different perspectives. For instance, the user’s location information, the people and objects around the user, and the changes to those objects are considered as contexts by Schilit et al. [12]. Brown et al. [13] also define contexts as the user’s locational information, temporal information, the people around the user, temperature, etc. Similarly, the user’s locational information, environmental information, temporal information, and the user’s identity are taken into account as contexts by Ryan et al. [14]. Other definitions have simply provided synonyms for context, such as the environment or the social situation. A number of researchers take context to be the environmental information of the user. For instance, in [15], the environmental information that the user’s computer knows about is taken as context by Brown et al., whereas the social situation of the user is considered as context in Franklin et al. [16]. On the other hand, a number of other researchers consider it to be the environment related to the applications. For instance, Ward et al. [17] consider the state of the surrounding information of the applications as contexts. Hull et al. [18] define context as the aspects of the current situation of the user and include the entire environment. The settings of applications are also treated as context in Rodden et al. [19].

According to Schilit et al. [20] the important aspects of context are: (i) where you are, (ii) whom you are with, and (iii) what resources are nearby. The information of the changing environment is taken into account as context in their definition. In addition to the user environment (e.g., user location, nearby people around the user, and the current social situation of the user), they also include the computing environment and the physical environment. For instance, connectivity, available processors, user input and display, network capacity, and costs of computing can be the examples of the computing environment, while the noise level, temperature, the lighting level, can be the examples of the physical environment. Dey et al. [21] present a survey of alternative view of context, which are largely imprecise and indirect, typically defining context by synonym or example. Finally, they offer the following definition of context, which is perhaps now the most widely accepted. According to Dey et al. [21] “Context is any information that can be used to characterize the situation of an entity. An entity is person, place or object that is considered relevant to the interaction between a user and an application, including the user and the application themselves”. Thus, based on the definition of Dey et al. [21], we can define context in the scope of this work as “Context is any information that can be used to characterize users’ day-to-day situations that have an influence on their smartphone usage”. An example of relevant contexts could be temporal context, spatial context, or social context etc. that might have an influence to make individuals’ diverse decisions on smartphone usage in their daily life activities.

Contextual smartphone data

We live in the age of data [22], where everything that surrounds us is linked to a data source and everything in our lives is captured digitally. Mobile or cellular phones have become increasingly ubiquitous and powerful enough to log users’ diverse activities, which helps in understanding their preferences and phone usage behavior. For instance, smart mobile phones can log various types of context data related to a user’s phone call activities, such as when the user makes outgoing calls, or accepts, rejects, or misses incoming calls [23,24,25,26]. In addition to such call-related metadata, other dimensions of contextual information, such as user location [27], the user’s day-to-day situation [28], and the social relationship between the caller and callee identified by the individual’s unique phone contact number [29], are also recorded by smart mobile phones. Thus, call log data collected by the smart mobile phone can be used as a context source for modeling individual mobile phone user behavior in smart context-aware mobile communication systems [30]. In addition to voice communication, the short message service (SMS) is a text communication service that allows individual mobile phone users to exchange short text messages using standardized communication protocols. According to the International Telecommunication Union [31], short messages have become a massive commercial industry, worth over 81 billion dollars globally. The rapid growth in the number of mobile phone users worldwide has led to a dramatic increase in spam messages [32]. The SMS log contains all messages, including both spam and non-spam text messages [32, 33], and can be used for automatic spam filtering [25, 32] or for predicting good or bad times to deliver such messages [33].

With the rapid development of smartphones, people use these devices for various categories of apps such as Multimedia, Facebook, Gmail, Youtube, Skype, and Game [9, 34]. Thus, smartphone app logs contain this usage together with relevant contextual information [8, 9, 35,36,37]. Such logs can be used for mining the contextual behavioral patterns of individual mobile phone users, that is, which app is preferred by a particular user under a certain context, in order to provide personalized context-aware recommendations. In the real world, a variety of smart mobile applications use notifications to inform users about various kinds of events or news, or simply to send them reminders or alerts. Examples include game invitations on social networks, social or promotional emails, and predictive suggestions by various smartphone applications, e.g., Twitter, Facebook, LinkedIn, WhatsApp, Viber, Skype, Youtube [7]. The contextual patterns extracted from smartphone notification logs can be used to build intelligent mobile notification management systems according to users’ preferences.

User navigation on the web is another major activity of individual users. Thus, web logs contain information about users’ mobile web navigation, web searching, e-mail, entertainment, chat, misc, news, TV, netting, travel, sport, banking, and related contextual information [38,39,40]. Contextual usage patterns mined from such log data can be used to make accurate context-aware predictions about user navigation and to adapt the portal structure to the needs of users. Similarly, game logs contain information about the various types of games played by individual mobile phone users, such as action, adventure, casual, puzzle, RPG, strategy, sports etc., and related contextual information [41]. The contextual patterns extracted from such log data can be used to build personalized mobile game recommendation systems for individual mobile phone users according to their own preferences.

The ubiquity of smart mobile phones and their computing capabilities for various real-life purposes provide an opportunity to use these devices as life-logging devices, i.e., personal e-memories [42]. In a more technical sense, life-logs sense and store an individual’s contextual information from the surrounding environment through a variety of sensors available in smart mobile phones. The core components of life-logs include user phone calls, SMS headers (no content), App use (e.g., Skype, Whatsapp, Youtube etc.), physical activities from Google play API, and related contextual information such as WiFi and Bluetooth devices in the user’s proximity, geographical location, and temporal information [42]. The contextual patterns or behavioral rules of individual mobile phone users extracted from such life-log data can be used to improve user experience in their daily life. In addition to these personalized log data, smartphones are also capable of collecting and processing IoT data [1]. Based on such smartphone data containing contextual information, in this paper we briefly review the existing rule learning strategies and discuss the open challenges and opportunities by highlighting future directions for context-aware rule learning.

Context-aware rule learning strategies

In this section, we review existing strategies related to learning rules based on contextual information in various perspectives. This includes time-series modeling that creates behavioral data clusters for generating temporal context based rules, contextual rule discovery by taking into account multi-dimensional contexts, such as temporal, spatial or social contexts, and incremental learning to dynamic updating of rules.

Modeling time-series smartphone data

Time is the most important context that impacts on mobile user behavior for making decisions [38]. Individual’s behaviors vary over time in the real world and the mobile phones record the exact time of all diverse activities of the users with their mobile phones. A time series is a sequence of data points ordered in time [43]. However, to use such time-series data into behavioral rules, an effective modeling of temporal context is needed. Thus, time-series segmentation becomes one of the research focuses in this study as exact time in mobile phone data is not very informative to mine behavioral rules of individual mobile phone users. According to [44], time-based behavior modeling is an open problem. Hence, we summarize the existing time-series segmentation approaches into two broad categories; (i) static segmentation, and (ii) dynamic segmentation, that are used in various mobile applications.

Static segmentation

A static segmentation is easy to understand and can be useful to analyze population behavior comparing across the mobile phone users. In order to generate segments, recently, most of the researchers (shown in Table 1) take into account only the temporal coverage (24-h-a-day) and statically segment time into arbitrary categories (e.g., morning) or periods (e.g., 1 h). Such static segmentation of time mainly focuses on time intervals. According to [45], there are mainly two types of time intervals: one is equal and another one is unequal. For instance, four different time segments, i.e., morning [6:00–12:00], afternoon [12:00–18:00], evening [18:00–24:00] and night [0:00–6:00] can be an example of equal interval based segmentation because of their same interval length. On the other hand, another four time slots such as morning [6:00–12:00], afternoon [12:00–16:00], evening [16:00–20:00] and night [20:00–24:00 and 0:00–6:00] can be an example of unequal interval based segmentation. For this example, different lengths of time interval are used to do the segmentation. In Table 1, we have summarized a number of works that use static segmentation considering either equal or unequal time interval in various purposes.
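The two example schemes above can be written down as lookup tables. The following is a minimal sketch, assuming half-open intervals `[start, end)` in whole hours (the source does not say which segment a boundary such as 12:00 belongs to); the names `EQUAL`, `UNEQUAL` and `segment_of` are hypothetical.

```python
# Each entry is (label, start_hour, end_hour), read as the interval [start, end).
EQUAL = [
    ("night", 0, 6),
    ("morning", 6, 12),
    ("afternoon", 12, 18),
    ("evening", 18, 24),
]

UNEQUAL = [
    ("night", 0, 6),
    ("morning", 6, 12),
    ("afternoon", 12, 16),
    ("evening", 16, 20),
    ("night", 20, 24),  # night wraps around midnight, so it appears twice
]


def segment_of(hour, scheme):
    """Map an hour of the day (0 <= hour < 24) to its static segment label."""
    for label, start, end in scheme:
        if start <= hour < end:
            return label
    raise ValueError(f"hour out of range: {hour}")


# The same timestamp falls into different segments under the two schemes.
print(segment_of(17, EQUAL))    # afternoon
print(segment_of(17, UNEQUAL))  # evening
```

Because the table is fixed in advance, every user gets the same segments regardless of how their own activity is distributed over the day.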

Table 1 Various types of static time segments used in different applications

Although the various time intervals and corresponding segmentations summarized in Table 1 are used for different purposes, these approaches take into account a fixed number of segments for all users. However, such segmentation does not take into account users’ behavioral evidence, which differs from user to user over time in the real world. Thus, statically generated segments may not be suitable for producing high-confidence temporal rules for individual smartphone users. For instance, $N_1$ segments might give meaningful results for one case, while $N_2$ segments could give better results for another case, where $N_1 \ne N_2$. Therefore, a dynamic segmentation of time, rather than a static one, could reflect individuals’ behavioral evidence over time and can help to produce high-confidence rules according to their usage records.

Dynamic segmentation

As discussed above, a segmentation technique that generates a variable number of segments would be more meaningful for modeling users’ behavior. Thus, a dynamic segmentation technique rather than static segmentation can be used to achieve this goal. In a dynamic segmentation, the number of segments is not fixed or predefined; it may change depending on users’ behavioral characteristics, patterns or preferences. Several dynamic segmentation techniques that generate a variable number of segments exist for modeling users’ behavioral activities in temporal contexts. A number of authors simply take into account a single parameter, e.g., interval length or base period, to generate the segments. The number of time segments varies according to this period. If $T_{max}$ represents the whole time period of 24-h-a-day and BP is a base period, then the number of segments will be $T_{max} / BP$ [10]. If the base period increases, the number of time segments decreases and vice versa. For instance, if the base period is 5 min, then the number of segments is 24 h (1440 min) divided by 5 min, i.e., 288. In this example, the base period of 5 min is assumed to be the finest granularity for distinguishing the day-to-day activities of an individual. If the base period is increased to 15 min, then the number of segments decreases to 96, and 15 min is assumed to be the finest granularity. Thus the number of segments varies with the base time period. Similarly, individuals’ calendar schedules and corresponding time boundaries can also be used to determine time segments of variable length, in order to model users’ behavior in temporal context, which may vary according to users’ preferences [59]. For instance, one user may have a particular event between 1 and 2 p.m., while another may have an event in a different time boundary, between 1:30 and 2:30 p.m. Thus, the time segmentation varies according to the daily life activities scheduled in users’ personal calendars.
Similarly, multiple thresholds, sliding window, data shape based approaches are used in several applications, shown in Table 2. In addition to these approaches, a number of authors use machine learning techniques such as clustering, genetic algorithm etc. In Table 2, we have summarized a number of works that use such type of dynamic segmentation techniques in various purposes.
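The base-period segmentation described above (number of segments = $T_{max} / BP$) reduces to simple integer arithmetic. The sketch below assumes that the base period is given in whole minutes and divides the day evenly; the function names are illustrative only.

```python
MINUTES_PER_DAY = 24 * 60  # T_max expressed in minutes


def num_segments(base_period_min):
    """Number of segments T_max / BP for a base period given in minutes."""
    if base_period_min <= 0 or MINUTES_PER_DAY % base_period_min != 0:
        raise ValueError("base period must be a positive divisor of 1440 minutes")
    return MINUTES_PER_DAY // base_period_min


def segment_index(hour, minute, base_period_min):
    """Zero-based index of the segment that contains the given time of day."""
    return (hour * 60 + minute) // base_period_min


print(num_segments(5))    # 288 segments of 5 min
print(num_segments(15))   # 96 segments of 15 min
print(segment_index(13, 30, 15))  # 13:30 falls into segment 54
```

Increasing the base period lowers the segment count, which matches the inverse relationship stated in the text.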

Table 2 Various types of dynamic time segments used in different applications

Clustering, highlighted in Table 2, is one of the important machine learning techniques for forming large time segments in which certain user behavior patterns are taken into account. Usually, clustering algorithms are designed with certain assumptions and favor certain types of problems. In this sense, it is not accurate to call any clustering algorithm the ‘best’; the choice depends on the specific application [75]. Among the clustering algorithms, K-means is the best-known squared error-based clustering algorithm [76]. However, this algorithm requires the initial partitions and a fixed number of clusters K to be specified. The centroids it converges to also vary with different initial points. The algorithm is also sensitive to outliers because it relies on mean values. More importantly, the characteristics of this algorithm might not be directly applicable for the purpose of learning context-aware rules. The reason is that users behave differently in different contexts, and this behavior also varies from user to user in the real world. Thus, it is difficult to assume a number of clusters K that captures their diverse behaviors effectively. A similar method, K-medoids [77], is more robust than K-means in the presence of outliers because a medoid is less influenced by outliers than a mean. Although it reduces the outlier problem, the other mismatches noted above for K-means also apply to it for the problem of time-series modeling.
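The difference in outlier sensitivity between a mean (as used by K-means) and a medoid (as used by K-medoids) can be seen on a single cluster. The data below are made-up call times in hours, and the sketch treats time as a plain number line, ignoring the wrap-around at midnight; it illustrates the two cluster representatives only and is not an implementation of either clustering algorithm.

```python
def mean(points):
    # Arithmetic mean: every point, including an outlier, pulls the center.
    return sum(points) / len(points)


def medoid(points):
    # The medoid is the actual data point whose total distance to all
    # other points is smallest.
    return min(points, key=lambda p: sum(abs(p - q) for q in points))


# Hypothetical call times: four morning calls and one late-night outlier.
call_hours = [9.0, 9.5, 10.0, 10.5, 23.0]

print(mean(call_hours))    # pulled towards the outlier (about 12.4)
print(medoid(call_hours))  # stays within the morning calls (10.0)
```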

As the size and number of time segments depend on the user’s behavior, which differs from user to user, bottom-up hierarchical data processing can help to form behavioral clusters. Existing hierarchical algorithms are mainly classified as agglomerative methods and divisive methods. However, the divisive clustering method is not commonly used in practice [75]. The simplest and most popular agglomerative clustering methods are single linkage [78] and complete linkage [79]. Another method, nearest neighbor [75], is similar to the single linkage agglomerative clustering algorithm. All these hierarchical algorithms use a proximity matrix, which is generated by computing the distance between a new cluster and the other clusters. According to the matrix values, these algorithms then successively merge the clusters until the desired cluster structure is obtained. However, because of the variations in users’ behavior, it is not possible to predict from a proximity matrix the level at which merging is best. Thus, applying such clustering techniques could generate segments according to the behavioral patterns available in users’ time-series data. Similarly, the genetic algorithm based approaches shown in Table 2 also produce dynamic segments.
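The bottom-up merging described above can be sketched for one-dimensional time stamps. This is a simplified single-linkage illustration, not the method of any surveyed paper: it recomputes pairwise distances instead of maintaining a proximity matrix, and the stopping rule (`max_gap`, the largest gap in minutes at which two clusters are still merged) is an assumption standing in for "the desired cluster structure".

```python
def single_linkage(points, max_gap):
    """Agglomerative single-linkage clustering of 1-D points.

    Starts with one cluster per point and repeatedly merges the two closest
    clusters until the closest pair is more than max_gap apart.
    """
    clusters = [[p] for p in sorted(points)]
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_gap:
            break  # remaining clusters are too far apart to merge
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters


# Hypothetical activity times in minutes since midnight.
activity_minutes = [540, 545, 555, 780, 790, 1260]
segments = single_linkage(activity_minutes, max_gap=30)
print(segments)  # [[540, 545, 555], [780, 790], [1260]]
```

The number of resulting segments is not fixed in advance; it follows from how the user's activity is spread over the day, which is the property the text attributes to dynamic segmentation. Choosing `max_gap` remains the open question the paragraph points to.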

In summary, the static and dynamic time-series segmentation approaches discussed above are able to generate various time segments that can be used for different purposes. However, these approaches do not necessarily map to the patterns of individuals’ usage, which are based on users’ diverse behaviors over time-of-the-week and may vary from user to user according to their preferences. A machine learning based, behavior-oriented dynamic time-series modeling technique that takes such patterns into account could be significant for effectively using temporal context as the basis for discovering rules that capture smartphone usage behavior.

Rule discovery

Another major focus of this study is discovering useful behavioral rules of individual mobile phone users based on multi-dimensional contexts, such as temporal, spatial, or social contexts, utilizing their smartphone data. In the area of machine learning, association rule learning [80] and classification rule learning [81] are the most common techniques for discovering such rules. In the following, we give a brief overview of both association and classification techniques for the purpose of discovering rules based on multi-dimensional contexts.

Association rules

An association rule learning algorithm discovers association rules that satisfy predefined minimum support and confidence constraints from a given dataset [80]. Many association rule learning algorithms have been proposed in the data mining literature, such as logic based [82], frequent pattern based [80, 83, 84], tree-based [85] etc. Association rule learning is well defined in terms of rule performance, e.g., accuracy, and is flexible because it has its own parameters, support and confidence [86]. A number of researchers [7,8,9] have used association rule learning techniques (e.g., Apriori) [80] to mine rules capturing mobile phone users’ behavior. However, the existing association rule learning techniques might not be suitable for discovering users’ behavioral rules for several reasons. In the following, we summarize the drawbacks of association rules for discovering the behavioral rules of individual mobile phone users by taking into account multi-dimensional contexts.

Lacking in understanding the impact of contexts: Different contexts in mobile phone data, such as temporal, spatial or social context, may have a different impact or influence on the behavioral rules of individual mobile phone users. For instance, an incoming phone call from a significant person, e.g., the user’s mother, is always answered by an individual, even though she is in a meeting, because of her family priority. In this case, the importance of the social relationship between individuals ($social \; relationship \rightarrow mother$) in making the behavioral decision is higher than that of other relevant contexts such as time period, weekday or holiday, location, or accompanying persons. However, the typical association rule learning technique implicitly assumes that all the contexts in the dataset have a similar nature and/or impact while discovering rules based on multi-dimensional contexts.

Redundancy: An association rule learning technique, e.g., Apriori, discovers all the contextual associations in a given dataset that satisfy the user preference, specified as a minimum support value and a minimum confidence value. As a result, it produces a huge number of redundant rules, as it does not take into account the usefulness of a particular context or the corresponding patterns while producing the associations. For instance, it produces up to 83% redundant rules for a given dataset, which makes the rule-set unnecessarily large [87]. Therefore, it is very difficult to determine the most interesting rules among the huge number generated. This makes the rule-based decision making process more complex and less effective for building a context-aware intelligent system [88].

Computational complexity and high training time. In order to produce rules, an association rule learning technique takes a huge amount of training time. For instance, in an experimental study in the mobile phone domain, the authors observe a high running time spanning several hours when an association rule learning algorithm is used to discover user behavioral rules [8]. The main reason for the high training time is that typical association techniques compute all the possible associations among contexts and are unable to single out the interesting rules that can be used to make effective decisions. As a result, the unnecessary generation of patterns increases the computational complexity and training time.

In summary, considering the impact of contexts, the redundancy problem while generating rules, and the computational complexity, typical association rule learning techniques may not be suitable for producing users’ behavioral rules in multi-dimensional contexts for the purpose of building intelligent context-aware systems.
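The support and confidence constraints, and the redundancy they allow, can be illustrated with a small sketch. It is not Apriori or any of the cited algorithms: it enumerates candidate rules by brute force over a hypothetical call log, and it assumes one simple notion of redundancy (a rule is flagged when a strictly more general rule with the same behavior is at least as confident). The record values, function names, and thresholds are illustrative assumptions.

```python
from itertools import combinations

# Hypothetical call-log records: (set of contexts, observed behavior).
records = [
    ({"time=morning", "loc=office", "caller=boss"}, "answer"),
    ({"time=morning", "loc=office", "caller=friend"}, "decline"),
    ({"time=evening", "loc=home", "caller=friend"}, "answer"),
    ({"time=morning", "loc=office", "caller=friend"}, "decline"),
    ({"time=evening", "loc=home", "caller=boss"}, "answer"),
    ({"time=morning", "loc=office", "caller=boss"}, "answer"),
]


def mine_rules(records, min_support, min_confidence):
    """Return every rule (contexts -> behavior) that meets both thresholds."""
    total = len(records)
    # Candidate antecedents: every non-empty subset of an observed context set.
    antecedents = set()
    for contexts, _ in records:
        for size in range(1, len(contexts) + 1):
            for subset in combinations(sorted(contexts), size):
                antecedents.add(frozenset(subset))
    rules = {}
    for antecedent in antecedents:
        covered = [behavior for contexts, behavior in records if antecedent <= contexts]
        for behavior in set(covered):
            hits = covered.count(behavior)
            support = hits / total            # share of all records matching the rule
            confidence = hits / len(covered)  # share of covered records with this behavior
            if support >= min_support and confidence >= min_confidence:
                rules[(antecedent, behavior)] = (support, confidence)
    return rules


def redundant_rules(rules):
    """Flag rules for which a strictly more general rule is at least as confident."""
    flagged = set()
    for (antecedent, behavior), (_, confidence) in rules.items():
        for (other, other_behavior), (_, other_confidence) in rules.items():
            if other_behavior == behavior and other < antecedent and other_confidence >= confidence:
                flagged.add((antecedent, behavior))
                break
    return flagged


rules = mine_rules(records, min_support=0.3, min_confidence=0.9)
redundant = redundant_rules(rules)
print(len(rules), "rules mined,", len(redundant), "flagged as redundant")
```

Even on six records, a rule such as "caller=boss → answer" is repeated by more specific variants that add the time or location without changing the decision, which is the kind of redundancy discussed above.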

Classification rules

Classification is another technique for discovering user behavioral rules from datasets. Several classification algorithms with the ability to generate rules exist, such as ZeroR [89], OneR [90], RIDOR [89], RIPPER [91], PART [92], DTNB [93], and decision trees [81, 94]. Among these techniques, the decision tree is one of the most popular rule-based classification algorithms, as it has several advantages: it is easy to interpret, can handle high-dimensional data, is simple and fast, gives good accuracy, and generates human-understandable classification rules [95, 96]. In particular, a number of authors [67, 97,98,99,100] have used the decision tree classification technique to discover rules capturing mobile phone users’ behavior for various purposes. However, the existing rule-based classification techniques might not be suitable for modeling users’ behavior, for several reasons. In the following, we summarize the drawbacks of rule-based classification techniques for discovering the behavioral rules of individual mobile phone users in multi-dimensional contexts.

Low reliability. In general, reliability refers to the quality of being trustworthy or of performing consistently well. A pattern or rule is called reliable if the relationship described by the pattern occurs in a high percentage of applicable cases. According to Geng et al. [101], a classification rule is reliable if it gives high prediction accuracy, and an association rule is reliable if it has high confidence, which is associated with accuracy. However, the classification rules discovered by typical rule-based classification techniques, e.g., decision trees, have low reliability in many cases [7, 102]. According to Freitas et al. [86], a classification rule may not ensure high accuracy in predictions. The reason is that it may suffer from over-fitting and inductive bias, which decrease the prediction accuracy of a machine learning based model.

Lacking in flexibility. Traditional rule-based classification techniques, e.g., decision trees, offer no flexibility to set users’ preferences and consequently make a rigid decision for a particular test case [81]. However, a rigid decision in modeling user behavior might not be meaningful in real-world cases. The reason is that individuals’ preferences are not static in the real world and may vary from user to user [103]. For instance, one individual may want the phone call agent to decline incoming calls in situations where she did not answer such calls more than, say, 80% of the time in the past. For another person, this threshold could be 95% of the time. Thus, considering flexibility in users’ preferences while modeling their behavior is another issue for making meaningful decisions in various context-aware test cases.

Lacking in generalization. Typically, generality measures the comprehensiveness of a pattern or rule, that is, the fraction of all the relevant records in the dataset that match the pattern. According to Geng et al. [101], if a pattern characterizes more information in the relevant dataset, it tends to be more useful and interesting. Traditional classification techniques consider data-driven generalization while producing classification rules. Besides this, generalization oriented to users’ behavior might be of interest for learning context-aware rules. For instance, users’ behavior might be similar for a collection of contexts and have exceptions only in a few cases [6]. Thus, behavior-oriented generalization could give more precise results for modeling users’ usage behavior. Such generalization not only simplifies the resultant machine learning based model but also reduces over-fitting and improves the prediction accuracy.

In summary, considering the reliability, flexibility, and generalization issues discussed above, typical classification rule learning techniques may not be suitable for producing users’ behavioral rules in multi-dimensional contexts for the purpose of building intelligent context-aware systems.
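The flexibility issue can be made concrete with the 80% and 95% example given above. The following is a minimal sketch, assuming a hypothetical call agent that sees the past outcomes for one context and a per-user threshold; the function names and the sample history are illustrative and do not come from the cited works.

```python
def decline_rate(outcomes):
    """Fraction of past calls in one context that the user did not answer."""
    if not outcomes:
        return 0.0
    declined = sum(1 for outcome in outcomes if outcome == "declined")
    return declined / len(outcomes)


def agent_action(outcomes, preferred_threshold):
    """Decline automatically only when the past decline rate exceeds the user's own threshold."""
    if decline_rate(outcomes) > preferred_threshold:
        return "decline"
    return "ring"


# Hypothetical history for one context: 9 of 10 past calls were declined.
history = ["declined"] * 9 + ["answered"]

action_for_user_a = agent_action(history, preferred_threshold=0.80)
action_for_user_b = agent_action(history, preferred_threshold=0.95)
print(action_for_user_a, action_for_user_b)
```

The same history leads to different actions for the two users, which a single rigid classification rule cannot express.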

Incremental learning and updating

In the area of data mining, a number of updating techniques, known as incremental rule mining, have been proposed for discovering rules in a dynamic database. These techniques use the existing discovered rules and the incremental part of the dataset to obtain a complete, updated set of rules. For instance, the FUP algorithm proposed by Cheung et al. [104] is the first incremental updating technique for maintaining association rules when new data are inserted into a database. The FUP algorithm is based on the Apriori algorithm [80] and is used to discover the new frequent itemsets in a dynamic database. In [105], Cheung et al. propose a new algorithm, FUP2, which is an extension of the FUP algorithm. Another incremental association rule mining algorithm is proposed by Xu et al. [106], who introduce an IFP-tree technique that is an extension of the FP-tree [85]. Thomas et al. [107] propose an algorithm based on the concept of the negative border, which maintains both frequent itemsets and border itemsets.

A few algorithms [108, 109] have been proposed based on the three-way decision, which extends the commonly used binary-decision model with a third option. A theory of three-way decisions, proposed by Yao et al. [110], is constructed on the notions of acceptance, rejection and no commitment. In [111], Amornchewin et al. propose a probability-based incremental association rule discovery algorithm. Thusaranon et al. [112] propose another probability-based incremental association rule discovery algorithm that extends the algorithm introduced by Amornchewin and Kreesuradej [111].
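The three-way decision model mentioned above can be sketched with two cut-offs. This is a minimal illustration, assuming that a candidate carries an estimated probability and that two hypothetical thresholds, `accept_at` and `reject_at`, separate the three regions; the names and default values are assumptions and are not taken from the cited works [108, 109, 110].

```python
def three_way_decision(probability, accept_at=0.75, reject_at=0.25):
    """Map an estimated probability to acceptance, rejection, or no commitment."""
    if not 0.0 <= reject_at < accept_at <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= reject_at < accept_at <= 1")
    if probability >= accept_at:
        return "accept"
    if probability <= reject_at:
        return "reject"
    # Between the two thresholds the decision is deferred.
    return "no commitment"


for estimate in (0.9, 0.5, 0.1):
    print(estimate, three_way_decision(estimate))
```

A binary model is the special case in which the two thresholds coincide, so the deferred region disappears.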

The above incremental mining techniques mainly target faster processing, i.e., the efficiency of the overall mining process. These techniques reduce the scanning of the given datasets by mining the incremental part separately, instead of processing the merged dataset that includes the initial dataset and the incremental part. Thus, such traditional updating techniques focus on the processing time needed to discover a complete set of updated rules. However, to model users’ behavior, the freshness of rules, i.e., rules based on recent patterns, is significant, and it has not been taken into account in these techniques. The reason is that users’ behavior is not static in the real world and may change over time. Thus, rules need to be updated with respect to the freshness of users’ behavior in order to effectively model smartphone users’ behavior in relevant multi-dimensional contexts.

In order to produce rules according to the current behavior of an individual, a number of researchers use the behavioral patterns of recent mobile phone log data, rather than the patterns derived from the entire historical logs, to predict future behavior. However, they use a static period of recent historical data, which might not be meaningful for discovering users’ recent behavioral rules. For instance, Lee et al. [113] have studied mobile phone users’ calling patterns and designed a call recommendation algorithm for an adaptive speed-call list using recent call list data. In their approach, they extract the call logs of the previous three months. In order to predict outgoing calls, Barzaiq et al. [114] propose an approach that analyzes mobile phone historical data over a period of two years, and they observe a relatively high additional computational load that seems to be unnecessary. Phithakkitnukoon et al. [115] conduct their study on the reality mining datasets, which were collected over a period of nine months, and observe that only a recent portion of the communication history is more significant. In another work, Phithakkitnukoon et al. [116] present a model for predicting phone calls for the next twenty-four hours based on the user’s past communication history. They show that the recent trend of the user’s calling pattern is more significant than the older one and has a higher correlation with the future pattern than the pattern derived from the entire historical data. As such, the latest sixty days of call records in the call logs are assumed to represent the future observed call activities in order to get better prediction accuracy [116]. However, such a static time period may not be suitable to reflect one’s current behavior, as users’ behaviors are not consistent in the real world and may vary from user to user over time.

Besides these approaches, a number of authors [117, 118] deal with the problem of managing personal information, such as an individual’s contact list on their mobile phone, and more specifically with the task of finding the desired contact number when making an outgoing call. According to Bergman et al. [117], a number of contacts in mobile phones are never actually used, even though contact lists keep growing. Their experimental results show that 47% of the users’ contacts had not been used for over six months or had never been used at all. To predict future behavior, Stefanis et al. [118] have used a window-based model for managing and searching personal information on mobile phones. In their experiment, they have shown that the training window for predicting an individual’s mobile phone usage behavior should be long enough to provide sufficient data. At the same time, however, a training window of more than two weeks would likely fail to capture the dynamic changes in the behavioral patterns for making phone calls. In addition, a training window of less than seven days would fail to capture the behavioral changes across all the days of the week, including a change of social circumstances at the weekend.
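The trade-off in the size of the training window can be sketched as follows. This is a minimal illustration, not the model of Stefanis et al. [118]: it assumes a hypothetical log of (date, contact) pairs and simply checks whether a window of a given length still covers every day of the week. The function names and the sample data are assumptions.

```python
from datetime import date, timedelta


def recent_window(log, today, window_days):
    """Keep only the entries from the last `window_days` days, today included."""
    earliest = today - timedelta(days=window_days - 1)
    return [(day, contact) for day, contact in log if earliest <= day <= today]


def covers_all_weekdays(entries):
    """True when the entries include at least one record for every day of the week."""
    return {day.weekday() for day, _ in entries} == set(range(7))


# Hypothetical log: one outgoing call per day for 30 consecutive days.
today = date(2019, 10, 31)
log = [(today - timedelta(days=offset), f"contact-{offset % 3}") for offset in range(30)]

short = recent_window(log, today, window_days=5)
week = recent_window(log, today, window_days=7)
two_weeks = recent_window(log, today, window_days=14)

print(len(short), covers_all_weekdays(short))
print(len(week), covers_all_weekdays(week))
print(len(two_weeks), covers_all_weekdays(two_weeks))
```

A window shorter than seven days cannot contain every weekday, which matches the lower bound discussed above; the upper bound of roughly two weeks is an empirical observation and is not something this sketch can verify.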

In summary, considering the freshness of rules reflecting users’ current behavior and their dynamic updating, the typical updating techniques discussed above may not be suitable for producing a complete set of users’ behavioral rules in multi-dimensional contexts for the purpose of building intelligent context-aware systems that provide relevant services to end smartphone users.

Challenges and future directions

With the rapid development of smartphones, IoT, data science and machine learning, and context-aware computing, the most fundamental challenge is to explore contextual data collected from relevant sources and to extract context-aware rules for future actions. We highlight and analyze the main challenges in rule extraction, machine learning techniques, and context-aware systems that are involved in context-aware rule learning. We also discuss future directions to overcome these issues. Thus, this section examines the impact of learning context-aware rules from the several perspectives discussed in this paper, in the broad area of smartphone data analytics. In the following, the challenges and corresponding future directions are discussed briefly.

In the area of context-aware computing, a number of approaches exist to handle continuous contextual features like time-series and to develop time-based context-aware systems. They are mostly designed around several categorical time periods with a particular interval, either equal or unequal, and a corresponding temporal rule-based system. Although a static modeling of temporal context is easy to understand and can be useful for analyzing population behavior and comparing across users, a machine learning based data-driven solution could be more effective. The reason is that we are now in the age of data science and have real-world contextual datasets available in time order due to the rapid growth of IoT and smartphones. Thus, time-series modeling becomes an open problem for building a context-aware system. Although a few learning techniques have been employed to create data-driven temporal segments, they can be improved with advanced data analysis, such as observing variations in temporal patterns, relating individual and population behavior, handling data sparseness in time-series, and synchronizing temporal context with multiple data sources. Improved machine learning techniques or hybrid methods could give better results for modeling such continuous contextual data. For instance, a dynamic behavior-oriented aggregation algorithm [10] could produce better time-series segmentation results for the purpose of modeling time-based user behavior. New machine learning solutions that consider the above-mentioned perspectives can be designed and developed to process and analyze real-world time-series data in order to build an intelligent time-based context-aware system.
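The difference between fixed time periods and data-driven temporal segments can be illustrated with a simple merge of adjacent time slots. This is only a sketch of the general idea and not the aggregation algorithm of [10]: it assumes hypothetical one-hour slots, each with at least one observed behavior, and it merges neighboring slots whose dominant behavior is the same.

```python
from collections import Counter


def dominant(behaviors):
    """Most frequent behavior observed in one time slot."""
    return Counter(behaviors).most_common(1)[0][0]


def segment_by_behavior(slots):
    """Merge consecutive one-hour slots that share the same dominant behavior.

    `slots` is a time-ordered list of (start_hour, [behaviors]).
    Returns a list of (start_hour, end_hour, dominant_behavior).
    """
    segments = []
    for start, behaviors in slots:
        label = dominant(behaviors)
        if segments and segments[-1][2] == label and segments[-1][1] == start:
            # Same behavior as the previous segment and no gap: extend it.
            segments[-1] = (segments[-1][0], start + 1, label)
        else:
            segments.append((start, start + 1, label))
    return segments


# Hypothetical call responses of one user, grouped by hour of day.
slots = [
    (9, ["decline", "decline", "answer"]),
    (10, ["decline"]),
    (11, ["decline", "decline"]),
    (12, ["answer", "answer", "decline"]),
    (13, ["answer"]),
]

segments = segment_by_behavior(slots)
print(segments)
```

The segment boundaries here follow the user's own behavior rather than a predefined period such as "morning", so two users with different routines would obtain different segments from the same procedure.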

In addition to temporal context, other dimensions of context might have an impact on a context-aware system. Although association analysis and rule-based classification analysis are well-known machine learning approaches for discovering rules, there are still some issues in learning context-aware rules with these techniques. For instance, an association learning technique produces a large number of redundant rules, which makes the context-aware system complex and ineffective. On the other hand, classification techniques produce rules for rigid decision making, which become unreliable in many context-aware cases. Thus, effectively learning rules based on multi-dimensional contexts becomes another challenge. Although both classification and association analysis are well-established methods in machine learning, improved machine learning techniques or hybrid methods could give better results for learning effective rules based on multi-dimensional contexts. For instance, the problem of redundancy while generating association rules can be minimized by taking into account the precedence of contexts [119]. Thus, advanced functionalities and their combinations, such as the precedence of contexts, optimal contextual feature selection, user-preference-oriented discovery, generalization, and the discovery of abnormalities or exceptions, could produce more effective rules. New machine learning based solutions or hybrid methods that consider these functionalities can be designed and developed to process and analyze real-world contextual data in order to build a rule-based intelligent context-aware system that behaves accordingly.

In recent days, rule-based context-aware systems have become popular due to the rapid growth of IoT and smartphones. Some of them are static rule-based systems designed and developed according to current needs. A number of such systems are built on rules discovered from data using association learning or classification learning techniques. Although these rule-based systems are capable of providing relevant services, they still lack effectiveness in terms of prediction accuracy in a human-centric system. In that case, a machine learning based data-driven solution that takes into account incremental data and corresponding learning could be an effective way forward. The reason is that human behavior changes over time, and the most recent pattern, which can be found from incremental data, is likely to be more significant than older ones. Thus, modeling based on recent patterns becomes another challenge for building a context-aware system. Although a number of updating techniques are employed in the area of incremental data mining, they can be improved by taking into account the freshness of behavioral analytics for a particular context. For instance, a very recent work, RecencyMiner [120], could produce better prediction results by taking into account recency-based updating for the purpose of modeling user behavior. Thus, new machine learning techniques or hybrid learning based solutions that consider advanced functionalities, such as analyzing dynamic logs, changes in behavioral patterns, context-aware incremental learning, and freshness in rules, can be designed and developed to build a human-centric intelligent context-aware system that takes into account users’ recent activities.

The most important task for an intelligent context-aware system is to develop an effective framework that supports learning context-aware rules. In such a framework, we need to consider advanced context-based data analysis using machine learning techniques, so that the rule learning framework is capable of resolving the issues discussed above. A well-designed context-aware rule learning framework for contextual smartphone data, together with its experimental evaluation, is thus a very important direction and a big challenge as well. In summary, this paper has uncovered several future directions in the field of smartphone data analytics and context-aware rule learning. First, additional study must be performed on the characteristics of smartphone data in terms of associated and relevant contexts, as the context-aware rules depend on the different surrounding contexts. Second, the scalability and efficacy of existing analytics techniques applied to smartphone data must be empirically examined. Third, new techniques and algorithms or potential hybrid methods need to be designed for learning context-aware rules, particularly in terms of time-series modeling, effective rule discovery based on multi-dimensional contexts, and recency-based incremental learning for intelligent decision making utilizing enormous amounts of smartphone data. Fourth, a range of empirical evaluations is necessary to measure the effectiveness and efficiency of these machine learning techniques in comparison with existing techniques. Fifth, more work is necessary on how to efficiently model context-aware rules in relevant application areas for the purpose of building intelligent context-aware applications.

Suggested machine learning based framework

Based on the survey of contextual smartphone data and corresponding rule learning strategies, in this section we suggest a context-aware rule learning framework based on machine learning techniques. Figure 2 shows an overview of the suggested framework, highlighting its components from the raw contextual data at the bottom to real-world applications and services. The framework consists of four processing layers, shown in Fig. 2: the contextual data acquisition layer, the context discretization layer, the rule discovery layer, and finally the dynamic updating and management layer. In the following, we briefly discuss these layers and their roles in learning context-aware rules from smartphone data.

Contextual data acquisition. This is the first layer of our context-aware rule learning framework, as collecting relevant data is the first step in building a data-driven system. This layer is responsible for collecting individuals’ smartphone data, which includes their daily life activities with their phones and the associated contextual information, such as temporal context, spatial context, social context or other contexts relevant to the particular usage. Such contextual data can be collected from various sources like smartphone logs, sensors or external sources relevant to the application. Smartphone data collected from these sources usually contains raw contexts that characterize individuals’ daily behavioral activities with their phones, and it needs to be processed effectively before it can be used as the basis for learning context-aware rules.

Context discretization. Machine learning based context discretization is the second layer in our context-aware rule learning framework. Once the raw contextual data is available from the data acquisition layer, the contexts need to be discretized in order to understand the actual meaning of the data; this is also known as contextual data clustering, highlighted in Fig. 2. In other words, contextual data with similar characteristics are grouped into one cluster, and data with dissimilar characteristics are grouped into other clusters. For instance, real-world smartphone data contains continuous raw contextual information like time-series data, which represents an individual’s diverse activities at different data points in time order. Taken separately, such data points may not represent a meaningful behavior of users. A data-driven time-series segmentation using machine learning techniques could give effective discretization results according to the data patterns available in the source. Thus, the main purpose of this layer is to create contextual data clusters, e.g., segments or contextual groups, according to similar data characteristics. The data processed in this layer helps to find the hidden patterns that are used as the basis for learning context-aware rules.

Rule discovery. Machine learning based contextual rule discovery is the third layer in our context-aware rule learning framework, shown in Fig. 2. As different contexts might have different impacts on individuals’ usage behavior in the real world, the precedence analysis of contexts can play a role in discovering a set of effective rules, as highlighted in Fig. 2. Based on the precedence of contexts, this layer is responsible for generating a set of users’ behavioral rules by taking into account relevant multi-dimensional contexts such as temporal context, spatial context, social context or other relevant contexts. The generated behavioral rules are effective and efficient in terms of reliability, which represents higher decision making accuracy; conciseness, which comes from generalization and non-redundancy; context importance, which comes from the precedence of contexts; and lower training time, which matters given the computing resources of individuals’ devices. After discovering the rules, this layer is also responsible for ranking them according to their relevancy in terms of contexts and rule strength.

Dynamic updating and management. Machine learning based dynamic updating and management of the discovered rules is the final layer in our context-aware rule learning framework, shown in Fig. 2. As individuals’ usage behavior is not static in the real world and may change over time, recency analysis and mining, with the corresponding rule updating, can play a role in dynamically updating the discovered rules over time, as highlighted in Fig. 2. Based on the recent behavioral patterns of individuals, this layer outputs a set of users’ behavioral rules by taking into account the relevant contexts. The main benefit of this layer is that it takes into account the most recent pattern, which represents the freshness of individuals’ behavior in a particular context and is likely to be more significant than older ones for predicting their future usage. Thus, this layer is responsible for identifying changes in behavioral patterns over time and for updating and managing the rules dynamically according to those changes.

Overall, our suggested machine learning based framework is responsible for extracting a set of effective behavioral rules of individual mobile phone users based on relevant multi-dimensional contexts, utilizing their smartphone data. The extracted context-aware rules can be used to build various rule-based intelligent systems that provide not only personalized services, which may vary from user to user, but also population-level services in the relevant application areas.
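The role of the dynamic updating and management layer described above can be sketched in a few lines. This is a deliberately simplified illustration and not RecencyMiner [120] or any of the cited incremental algorithms: it assumes that a rule maps one context to its most frequent behavior, and that behavior observed in the recent log overrides the older rule for the same context. All names and data are hypothetical.

```python
from collections import Counter, defaultdict


def learn_rules(log):
    """Map each context to its most frequent behavior in the given log."""
    by_context = defaultdict(Counter)
    for context, behavior in log:
        by_context[context][behavior] += 1
    return {context: counts.most_common(1)[0][0] for context, counts in by_context.items()}


def update_rules(existing_rules, recent_log):
    """Keep the old rules, but let behavior seen in the recent log override them."""
    updated = dict(existing_rules)
    updated.update(learn_rules(recent_log))
    return updated


# Hypothetical logs of (context, behavior) pairs.
old_log = [("meeting", "decline")] * 3 + [("lunch", "answer")] * 2
recent_log = [("meeting", "answer"), ("meeting", "answer"), ("meeting", "decline")]

old_rules = learn_rules(old_log)
new_rules = update_rules(old_rules, recent_log)
print(old_rules)
print(new_rules)
```

Only the recent log is scanned during the update, and contexts that do not appear in it keep their earlier rule. A fuller design would also need to decide how much recent evidence is enough to replace a rule, which this sketch leaves open.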

Context-aware rule based applications

A context-aware rule based smartphone application represents knowledge as a set of IF-THEN rules (i.e., if contexts then user behavioral activities or preferences) that tell what to do or what to conclude in different situations [121], and it can act as a software agent. According to [98], the software agent is a new paradigm for developing software applications in which an agent is capable of performing autonomous actions in a certain environment to achieve its goal. The target applications of this research are the context-aware personalized applications that have been studied widely in the past few years. For instance, an intelligent mobile interruption management system is one of them. Smartphones, the most popular IoT devices, are considered to be ‘always on, always connected’ devices, and they are always with their users; however, the users are not always able to respond to incoming communications because of their various day-to-day situations [122]. For this reason, people are often interrupted by incoming phone calls in a working environment [123]. According to the Basex BusinessEdge report [124], mobile interruptions consume 28% of the knowledge worker’s day. This leads to a loss of $700 billion according to the Bureau of Labor Statistics [125]. In order to manage such interruptions, a number of authors [24, 65,66,67, 126] have studied static rule-based systems. However, machine learning based context-aware rules can be used to make such systems automated and intelligent.
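The IF-THEN representation described above can be sketched as a small rule-matching agent for incoming calls. This is a minimal illustration under stated assumptions: the `Rule` structure, the context names, and the ranking (the most specific applicable rule wins, with confidence as a tie-breaker) are hypothetical choices and not a method prescribed by the cited works.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    conditions: dict   # context name -> required value (the IF part)
    action: str        # behavior to apply (the THEN part)
    confidence: float  # used here as the rule's strength


def matches(rule, context):
    """A rule applies when every one of its conditions holds in the current context."""
    return all(context.get(name) == value for name, value in rule.conditions.items())


def decide(rules, context, default="ring"):
    """Pick the most specific applicable rule; break ties by confidence."""
    applicable = [rule for rule in rules if matches(rule, context)]
    if not applicable:
        return default
    best = max(applicable, key=lambda rule: (len(rule.conditions), rule.confidence))
    return best.action


# Hypothetical rules for a call interruption agent.
rules = [
    Rule({"situation": "meeting"}, "decline", 0.90),
    Rule({"situation": "meeting", "caller": "mother"}, "answer", 1.00),
]

print(decide(rules, {"situation": "meeting", "caller": "friend"}))
print(decide(rules, {"situation": "meeting", "caller": "mother"}))
print(decide(rules, {"situation": "lunch", "caller": "friend"}))
```

Letting the more specific rule win is one simple way to express exceptions, such as a call from a family member during a meeting, without removing the general rule.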

Smartphone app management could be another useful application for individual users. According to the statistics in Google search, in March 2017 there were 2.8 million apps available in the Google Play Store and 2.2 million apps in Apple’s App Store. Thus, it is very important to manage such a huge number of available applications. Machine learning based context-aware rules can be used to manage these apps according to an individual’s preferences. Besides app management, the notifications from different apps are potentially annoying to users and cause disruptions [7, 127, 128]. The reason is that users might be irritated by notifications that do not interest them [129, 130]. Thus, machine learning based context-aware rules can also be applied to manage such notifications intelligently.

A smartphone recommendation system is one of the most important needs of users. According to [131], the most important feature of a recommender system is its ability to “guess” a user’s preferences and interests according to their past usage. In general, traditional recommender systems mainly focus on recommending the most relevant items to users from among a huge number of items, without considering contextual impact [57, 132]. However, contextual information is needed to recommend according to users’ needs [133,134,135]. For instance, the recommendations of a travel recommender system for a particular user can differ considerably between summer and winter, depending on the temporal context. Similarly, another context, e.g., location information, might influence the recommendations made to users [136, 137]. Such contextual recommendation is important for tourists who travel outside their usual environment [138]. As traditional information centers for tourists are not accessible anytime and anywhere [139], a context-aware web service based tourism information system is needed [140]. Thus, machine learning based context-aware rules can be used to make such recommendation systems more intelligent.

In general, predictive modeling based applications are among the most important in the area of machine learning and smartphone data analytics [141]. Machine learning based context-aware rules could be more effective for predicting user preferences, such as predicting calls [116, 118, 142, 143], predicting app usage [35, 37, 144, 145], and predicting smartphone notifications [129, 130]. Context-aware rules can also play a role in building a self-configuring smartphone management system that covers device volume settings, turning WiFi on or off, turning GPS on or off, and similar settings, which reduces battery consumption. Overall, the machine learning based context-aware rules extracted from smartphone data can help application developers to build the target intelligent system. In addition to the personalized services discussed above, such context-aware rules in different surrounding contexts could be applicable to other broad application areas, like IoT services, eHealth services, transportation services, city governance, industry, e-commerce, context-aware cybersecurity intelligence, and smart city services, according to the associated contexts, for the purpose of providing intelligent services in relevant context-aware applications.

Conclusion

This paper has discussed how multi-dimensional contexts can impact smartphone data, both in terms of context-aware rule learning and the dataset itself. Our aim was to discuss the state of the art with respect to smartphone data analytics and corresponding rule learning techniques. We also discussed how multi-dimensional contexts such as temporal, spatial or social contexts can impact such techniques, and examined the challenges that remain. For each common technique, we have summarized relevant research to aid others in the context-aware rule learning community when developing their own approaches for various purposes. We have discussed various issues surrounding context-aware rule learning in terms of time-series modeling, rule discovery based on multi-dimensional contexts, and updating the rules over time according to an individual’s recent behavioral patterns. In terms of existing research, much of the focus has been on traditional context-aware systems and techniques, with less work available on machine learning rule based context-aware systems for effective decision making in a particular domain. Overall, in this paper we have surveyed previous work and have presented a discussion of challenges and future directions for effectively learning context-aware rules from smartphone data. The domain-specific context-aware rules can be used to build various context-aware systems that intelligently assist end users in their day-to-day activities. We believe that our study on smartphone data analytics and the machine learning based framework opens a promising path for future research on mining context-aware rules. We also believe that this study could be used as a reference guide for both academia and industry in the relevant application areas.

Availability of data and materials

Not applicable.

References

El Khaddar MA, Boulmalf M. Smartphone: the ultimate iot and ioe device. In: Smartphones from an applied research perspective, 2017, p. 137.
Zheng P, Ni LM. Spotlight: the rise of the smart phone. IEEE Distrib Syst Online. 2006;7(3):3.
International telecommunication union. Measuring the information society. Technical report; 2015. http://www.itu.int/en/itu-d/statistics/documents/publications/misr2015/misr2015-w5.pdf.
Google trends. In: https://trends.google.com/trends/. 2019.
Sarker IH. Mobile data science: towards understanding data-driven intelligent mobile applications. EAI endorsed transactions on scalable information systems, EAI; 2018.
Sarker IH. Behavminer: mining user behaviors from mobile phone data for personalized services. In: Proceedings of the 2018 IEEE international conference on pervasive computing and communications (PerCom 2018), Athens, Greece: IEEE; 2018.
Mehrotra A, et al. M. Prefminer: mining user’s preferences for intelligent mobile notification management. In: UbiComp, ACM; 2016.
Srinivasan VEA. Mobileminer: mining your frequent patterns on your phone. In: UbiComp, ACM; 2014. p. 389–400.
Zhu HEA. Mining mobile user preferences for personalized context-aware recommendation. ACM Trans Intell Syst Technol. 2014;5(4):58.
Article Google Scholar
Sarker IH, Colman A, Kabir MA, Han J. Individualized time-series segmentation for mining mobile phone user behavior. Comput J. 2017;61:349–68.
Article Google Scholar
Dourish P. What we talk about when we talk about context. Pers Ubiquitous Comput. 2004;8(1):19–30.
Article Google Scholar
Schilit BN, Theimer MM. Disseminating active map information to mobile hosts. IEEE Netw. 1994;8(5):22–32.
Article Google Scholar
Brown PJ, Bovey JD, Chen X. Context-aware applications: from the laboratory to the marketplace. IEEE Pers Commun. 1997;4(5):58–64.
Article Google Scholar
Ryan N, Pascoe J, Morse D. Enhanced reality fieldwork: the context aware archaeological assistant. Bar Int Ser. 1999;750:269–74.
Google Scholar
Brown PJ. The stick-e document: a framework for creating context-aware applications. Electron Publ. 1995;8:259–72.
MathSciNet Google Scholar
Franklin D, Flaschbart J. All gadget and no representation makes jack a dull environment. In: Proceedings of the AAAI 1998 spring symposium on intelligent environments, 1998. p. 155–60.
Ward A, Jones A, Hopper A. A new location technique for the active office. IEEE Pers Commun. 1997;4(5):42–7.
Article Google Scholar
Hull R, Neaves P, Bedford-Roberts J. Towards situated computing. In: Wearable computers, 1997. Digest of papers., first international symposium On, IEEE; 1997. p. 146–53.
Rodden T, Cheverst K, Davies K, Dix A. Exploiting context in hci design for mobile systems. In: Workshop on Human Computer Interaction with Mobile Devices, Glasgow; 1998. p. 21–2.
Schilit B, Adams N, Want R. Context-aware computing applications. In: Mobile computing systems and applications, 1994. WMCSA 1994. First workshop On, IEEE; 1994. p. 85–90.
Dey AK. Understanding and using context. Pers Ubiquitous Comput. 2001;5(1):4–7.
Article Google Scholar
Cao L. Data science: a comprehensive overview. ACM Comput Surv. 2017;50(3):43.
Article Google Scholar
Bell S, McDiarmid A. Nodobo: mobile phone as a software sensor for social network research. In: Vehicular technology conference IEEE; 2011.
Pielot M. Large-scale evaluation of call-availability prediction. In: Proceedings of the international joint conference on pervasive and ubiquitous computing, ACM; 2014. p. 933–37
Eagle N, Pentland AS. Reality mining: sensing complex social systems. Pers Ubiquitous Comput. 2006;10(4):255–68.
Article Google Scholar
Sarker IH, Colman A, Kabir MA, Han J. Behavior-oriented time segmentation for mining individualized rules of mobile phone users. In: Data science and advanced analytics (DSAA), 2016 IEEE international conference On, IEEE; 2016. p. 488–97.
Sarker IH, Kabir MA, Colman A, Han J. Designing architecture of a rule-based system for managing phone call interruptions. In: Proceedings of the 2017 ACM international joint conference on pervasive and ubiquitous computing and proceedings of the 2017 ACM international symposium on wearable computers, ACM; 2017. p. 898–903.
Sarker IH, Kabir MA, Colman A, Han J. Evidence-based behavioral model for calendar schedules of individual mobile phone users. In: Data science and advanced analytics (DSAA), 2016 IEEE international conference On, IEEE; 2016. p. 584–93.
Sarker IH. Understanding the role of data-centric social context in personalized mobile applications. EAI endorsed transactions on context-aware systems and applications. EAI; 2018.
Sarker IH, Colman A, Kabir MA, Han J. Phone call log as a context source to modeling individual user behavior. In: Proceedings of the 2016 ACM international joint conference on pervasive and ubiquitous computing: adjunct, ACM; 2016. pp. 630–4.
Union IT. Itu internet report. 2006.
Almeida TA, Hidalgo JMG, Yamakami A. Contributions to the study of sms spam filtering: new collection and results. In: Proceedings of the 11th ACM symposium on document engineering, ACM; 2011. p. 259–62.
Fischer JE, Yee N, Bellotti V, Good N, Benford S, Greenhalgh C. Effects of content and time of delivery on receptivity to mobile interruptions. In: Proceedings of the 12th international conference on human computer interaction with mobile devices and services, ACM; 2010. p. 103–12.
Sarker IH, Salah K. Appspred: predicting context-aware smartphone apps using random forest learning. Internet of Things; 2019.
Kim J, Mielikäinen T. Conditional log-linear models for mobile application usage prediction. In: Machine learning and knowledge discovery in databases. p. 672–87. Berlin: Springer; 2014.
Liao Z-X, Pan Y-C, Peng W-C, Lei P-R. On mining mobile apps usage behavior for predicting apps usage in smartphones. In: Proceedings of the 22nd international conference on information & knowledge management, ACM; 2013. p. 609–18.
Zhu H, Chen E, Xiong H, Cao H, Tian J. Mobile app classification with enriched contextual information. IEEE Trans Mob Comput. 2014;13(7):1550–63.
Article Google Scholar
Halvey Mea. Time-based segmentation of log data for user navigation prediction in personalization. In: Web intelligence., IEEE; 2005. p. 636–40.
Halvey M, Keane MT, Smyth B. Time based patterns in mobile-internet surfing. In: Proceedings of the SIGCHI conference on human factors in computing systems, ACM; 2006. p. 31–4.
Bordino I, Donato D. Extracting interesting association rules from toolbar data. In: International conference on information and knowledge management, ACM; 2012.
Paireekreng W, Rapeepisarn K, Wong KW. Time-based personalised mobile game downloading. In: Transactions on edutainment II. p. 59–69, 2009.
Rawassizadeh R, Tomitsch M, Wac K, Tjoa AM. Ubiqlog: a generic mobile phone-based life-log framework. Pers Ubiquitous Comput. 2013;17(4):621–37.
Article Google Scholar
Gandhi S, Oates T, Boedihardjo A, Chen C, Lin J, Senin P, Frankenstein S, Wang X. A generative model for time series discretization based on multiple normal distributions. In: Proceedings of the 8th workshop on Ph. D. workshop in information and knowledge management, ACM; 2015. p. 19–25.
Farrahi K, Gatica-Perez D. A probabilistic approach to mining mobile phone data sequences. Pers Ubiquitous Comput. 2014;18(1):223–38.
Article Google Scholar
Zhang G, Liu X, Yang Y. Time-series pattern based effective noise generation for privacy protection on cloud. IEEE Trans Comput. 2015;64(5):1456–69.
Article MathSciNet Google Scholar
Song Y, Ma H, Wang H, Wang K. Exploring and exploiting user search behavior on mobile and tablet devices to improve search relevance. In: Proceedings of the 22nd international conference on world wide web, international world wide web conferences steering committee; 2013. p. 1201–12.
Rawassizadeh R, Momeni E, Dobbins C, Gharibshah J, Pazzani M. Scalable daily human behavioral pattern mining from multivariate temporal data. IEEE Trans Knowl Data Eng. 2016;28(11):3098–112.
Article Google Scholar
Mukherji A, et al. M. Adding intelligence to your mobile device via on-device sequential pattern mining. In: UbiComp : Adjunct.
Bayir MA, Demirbas M, Cosar A. A web-based personalized mobility service for smartphone applications. Comput J. 2010;54(5):800–14.
Article Google Scholar
Jayarajah K. Kauffman R. Misra A. Exploring variety seeking behavior in mobile users. In: Proceedings of the international joint conference on pervasive and ubiquitous computing, Seattle, WA, USA, 13–17 September, p. 385–90. ACM, New York, USA; 2014.
Do T-M-T, Gatica-Perez D. By their apps you shall understand them: mining large-scale patterns of mobile phone usage. In: Proceedings of the international conference on mobile and ubiquitous multimedia, Limassol, Cyprus, 1–3 December, 27. ACM, New York, USA; 2010.
Xu Y, Lin M, Lu H, Cardone G, Lane N, Chen Z, Campbell A, Choudhury T. Preference, context and communities: a multi-faceted approach to predicting smartphone app usage patterns. In: Proceedings of the international symposium on wearable computers, Zurich, Switzerland, 8–12 September, p. 69–76. ACM, New York, USA; 2013.
Oulasvirta A, Rattenbury T, Ma L, Raita E. Habits make smartphone use more pervasive. Pers Ubiquitous Comput. 2012;16(1):105–14.
Article Google Scholar
Yu K, Zhang B, Zhu H, Cao H, Tian J. Towards personalized context-aware recommendation by mining context logs through topic models. In: Proceedings of the Pacific-Asia conference on knowledge discovery and data mining, Kuala Lumpur, Malaysia, May 29–June 01, p. 431–43. Springer-Verlag Berlin, Heidelberg; 2012.
Naboulsi D, Stanica R, Fiore M. Classifying call profiles in large-scale mobile traffic datasets. In: INFOCOM, 2014 Proceedings IEEE, IEEE; 2014; p. 1806–14.
Dashdorj Z, Serafini L. Semantic enrichment of mobile phone data records. In: International conference on mobile and ubiquitous multimedia. ACM; 2013.
Shin D, Lee J-w, Yeon J, Lee S-g. Context-aware recommendation by aggregating user context. In: Commerce and enterprise computing, 2009. CEC’09. IEEE conference On, IEEE; 2009. p. 423–30.
Farrahi K, Gatica-Perez D. Probabilistic mining of socio-geographic routines from mobile phone data. IEEE J Sel Top Signal Process. 2010;4(4):746–55.
Article Google Scholar
Sarker IH, Colman A, Han J, Kayes A, Watters P. Calbehav: A machine learning based personalized calendar behavioral model using time-series smartphone data. Comput J 2019;1–16.
Ozer Mea. Predicting the location and time of mobile phone users by using sequential pattern mining techniques. Comput J. 2016;59(6):908–22.
Article Google Scholar
Do TMT, Gatica-Perez D. Where and what: using smartphones to predict next locations and applications in daily life. Pervasive Mob Comput. 2014;12:79–91.
Article Google Scholar
Farrahi K, Gatica-Perez D. What did you do today?: discovering daily routines from large-scale mobile data. In: Proceedings of the international conference on multimedia, Vancouver, British Columbia, Canada, 26-31 October, p. 849–52. ACM, New York, USA; 2008.
Karatzoglou A, Baltrunas L, Church K, Böhmer M. Climbing the app wall: enabling mobile app discovery through context-aware recommendations. In: Proceedings of the international conference on information and knowledge management, Maui, Hawaii, USA, 29 October–02 November, p. 2527–30. ACM, New York, USA; 2012.
Phithakkitnukoon S, Horanont T. Identifying human daily activity pattern using mobile phone data. In: Human behavior understanding. Berlin: Springer; 2010.
Khalil A, Connelly K. Improving cell phone awareness by using calendar information. In: Human-computer interaction. Berlin: Springer; 2005. p. 588–600.
Dekel A, Nacht D, Kirkpatrick S. Minimizing mobile phone disruption via smart profile management. In: Proceedings of the 11th international conference on human-computer interaction with mobile devices and services, ACM; 2009. p. 43.
Zulkernain S, et al. A mobile intelligent interruption management system. J Univ Comput Sci. 2010;16(15):2060–80.
Google Scholar
Seo S-s, Kwon A, Kang J-M, Strassner J, Hong JW-K. Pyp: design and implementation of a context-aware configuration manager for smartphones. In: SmartApps’ 11; 2011.
Shokoohi-Yekta M, Chen Y, Campana B, Hu B, Zakaria J, Keogh E. Discovery of meaningful rules in time series. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining, Sydney, NSW, Australia, 10–13 August, p. 1085–94. ACM, New York, USA; 2015.
Hartono RN, Pears R, Kasabov N, Worner SP. Extracting temporal knowledge from time series: a case study in ecological data. In: Proceedings of the international joint conference on neural networks, Beijing, China, 6–11 July, p. 4237–43. IEEE Computer Society, Washington, DC, USA; 2014.
Keogh E, Chu S, Hart D, Pazzani M. Segmenting time series: a survey and novel approach. Data Min Time Ser Databases. 2004;57:1–22.
Article Google Scholar
Das G, Lin K-I, Mannila H, Renganathan G, Smyth P. Rule discovery from time series. KDD. 1998;98:16–22.
Google Scholar
Lu EH-C, Tseng VS, Philip SY. Mining cluster-based temporal mobile sequential patterns in location-based service environments. IEEE Trans Knowl Data Eng. 2011;23(6):914–27.
Article Google Scholar
Kandasamy K, Kumar CS. Modified pso based optimal time interval identification for predicting mobile user behaviour in location based services. Indian J Sci Technol. 2015;8(S7):185–93.
Article Google Scholar
Xu R, Wunsch D. Survey of clustering algorithms. IEEE Trans Neural Netw. 2005;16(3):645–78.
Article Google Scholar
MacQueen J. Some methods for classification and analysis of multivariate observations. In: Fifth Berkeley symposium on mathematical statistics and probability, vol. 1; 1967.
Rokach L. A survey of clustering algorithms. In: Data mining and knowledge discovery handbook. Berlin: Springer; 2010. p. 269–98.
Sneath PH. The application of computers to taxonomy. J Gen Microbiol. 1957;17(1):201–26.
Article Google Scholar
Sorensen T. Method of establishing groups of equal amplitude in plant sociology based on similarity of species. Biol Skr. 1948;5:1–34.
Google Scholar
Agrawal R, et al. A. Fast algorithms for mining association rules. In: VLDB; 1994, vol. 1215, p. 487–99.
Quinlan JR. C4.5: Programs for machine learning. Machine learning; 1993.
Flach PA, Lachiche N. Confirmation-guided discovery of first-order rules with tertius. Mach Learn. 2001;42(1–2):61–95.
Article MATH Google Scholar
Houtsma M, Swami A. Set-oriented mining for association rules in relational databases. In: Data engineering, 1995. Proceedings of the eleventh international conference On, IEEE; 1995. p. 25–33.
Ma BLWHY. Integrating classification and association rule mining. In: Proceedings of the fourth international conference on knowledge discovery and data mining; 1998.
Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation. In: ACM Sigmod Record, vol. 29, ACM; 2000. p. 1–12.
Freitas AA. Understanding the crucial differences between classification and discovery of association rules: a position paper. ACM SIGKDD Explor Newslett. 2000;2(1):65–9.
Article Google Scholar
Fournier-Viger P et al. Mining top-k non-redundant association rules. In: Methodologies for intelligent systems. Berlin: Springer; 2012.
Bouker S, Saidi R, Yahia SB, Nguifo EM. Ranking and selecting association rules based on dominance relationship. In: Tools with artificial intelligence (ICTAI), 2012 IEEE 24th international conference On, vol. 1, IEEE; 2012. p. 658–65.
Witten IH, Frank E. Data mining: practical machine learning tools and techniques. Burlington: Morgan Kaufmann; 2005.
MATH Google Scholar
Holte RC. Very simple classification rules perform well on most commonly used datasets. Mach Learn. 1993;11(1):63–90.
Article MATH Google Scholar
Witten IH, Frank E, Trigg LE, Hall MA, Holmes G, Cunningham SJ. Weka: practical machine learning tools and techniques with java implementations. 1999.
Frank E, Witten IH. Generating accurate rule sets without global optimization. 1998.
Sheng S, Ling CX. Hybrid cost-sensitive decision tree, knowledge discovery in databases. In: PKDD 2005, Proceedings of 9th European conference on principles and practice of knowledge discovery in databases. Lecture notes in computer science, vol. 3721, 2005.
Quinlan JR. Induction of decision trees. Mach Learn. 1986;1(1):81–106.
Google Scholar
Wu X, Kumar V, Quinlan JR, Ghosh J, Yang Q, Motoda H, McLachlan GJ, Ng A, Liu B, Philip SY, et al. Top 10 algorithms in data mining. Knowl Inf Syst. 2008;14(1):1–37.
Article Google Scholar
Wu C-C, Chen Y-L, Liu Y-H, Yang X-Y. Decision tree induction with a constrained number of leaf nodes. Appl Intell. 2016;45(3):673–85.
Article Google Scholar
Hong J, Suh E-H, Kim J, Kim S. Context-aware system for proactive personalized service based on context history. Expert Syst Appl. 2009;36(4):7448–57.
Article Google Scholar
Lee W-P. Deploying personalized mobile services in an agent-based environment. Expert Syst Appl. 2007;32(4):1194–207.
Article Google Scholar
Sarker IH. A machine learning based robust prediction model for real-life mobile phone data. Internet Things. 2019;5:180–93.
Article Google Scholar
Sarker IH, Kabir MA, Colman A, Han J. An improved naive bayes classifier-based noise detection technique for classifying user phone call behavior; 2017.
Geng L, Hamilton HJ. Interestingness measures for data mining: a survey. ACM Comput Surv. 2006;38(3):9.
Article Google Scholar
Ordonez C. Comparing association rules and decision trees for disease prediction. In: Proceedings of the international workshop on healthcare information and knowledge management, ACM; 2006. p. 17–24.
Sarker IH. Research issues in mining user behavioral rules for context-aware intelligent mobile applications. Iran J Comput Sci. 2019;2(1):41–51.
Article MathSciNet Google Scholar
Cheung DW, Han J, Ng VT, Wong C. Maintenance of discovered association rules in large databases: an incremental updating technique. In: Data engineering, 1996. Proceedings of the twelfth international conference On, IEEE; 1996. p. 106–14.
Cheung DW-L, Lee SD, Kao B, et al. A general incremental technique for maintaining discovered association rules. In: DASFAA, vol. 6, 1997. p. 185–94.
Xu B, Yi T, Wu F, Chen Z. An incremental updating algorithm for mining association rules. J Electron. 2002;19(4):403–7.
Google Scholar
Thomas S, Bodagala S, Alsabti K, Ranka S. An efficient algorithm for the incremental updation of association rules in large databases. In: KDD, 1997. p. 263–6.
Zhang Z, Li Y, Chen W, Min F. A three-way decision approach to incremental frequent itemsets mining. J Inform Comput Sci. 2014;11(10):3399–410.
Article Google Scholar
Li Y, Zhang Z-H, Chen W-B, Min F. Tdup: an approach to incremental mining of frequent itemsets with three-way-decision pattern updating. Int J Mach Learn Cybern. 2015;8:441–53.
Article Google Scholar
Yao Y. An outline of a theory of three-way decisions. In: International conference on rough sets and current trends in computing, Springer; 2012. p. 1–17.
Amornchewin R, Kreesuradej W. Mining dynamic databases using probability-based incremental association rule discovery algorithm. J Univ Comput Sci. 2009;15(12):2409–28.
Google Scholar
Thusaranon P, Kreesuradej W. A probability-based incremental association rule discovery algorithm for record insertion and deletion. Artif Life Robot. 2015;20(2):115–23.
Article Google Scholar
Lee Sea. An adaptive speed-call list algorithm and its evaluation with esm. In: Human factors in computing systems. ACM; 2010.
Barzaiq OO, Loke SW. Adapting the mobile phone for task efficiency: the case of predicting outgoing calls using frequency and regularity of historical calls. Pers Ubiquitous Comput. 2011;15(8):857–70.
Article Google Scholar
Phithakkitnukoon S, Dantu R. Adequacy of data for characterizing caller behavior. In: Proceedings of KDD inter. Workshop on social network mining and analysis (SNAKDD 2008). Citeseer; 2008.
Phithakkitnukoon SEA. Behavior-based adaptive call predictor. ACM Trans Auton Adapt Syst. 2011;6(3):21.
Article Google Scholar
Bergman O, Komninos A, Liarokapis D, Clarke J. You never call: demoting unused contacts on mobile phones using dmtr. Pers Ubiquitous Comput. 2012;16(6):757–66.
Article Google Scholar
Stefanis V, Plessas A, Komninos A, Garofalakis J. Frequency and recency context for the management and retrieval of personal information on mobile devices. Pervasive Mob Comput. 2014;15:100–12.
Article Google Scholar
Sarker IH, Salim FD. Mining user behavioral rules from smartphone data through association analysis. In: Proceedings of the 22nd Pacific-Asia conference on knowledge discovery and data mining (PAKDD), Melbourne, Australia, Springer; 2018. p. 450–61.
Sarker IH, Colman A, Han J. Recencyminer: mining recency-based personalized behavior from contextual smartphone data. J Big Data. 2019;6(1):49.
Article Google Scholar
Grosan C, Abraham A. Rule-based expert systems. In: Intelligent systems. Berlin: Springer; 2011; p. 149–85.
Chang Y-J, Tang JC. Investigating mobile users’ ringer mode usage and attentiveness and responsiveness to communication. In: Proceedings of the international conference on human-computer interaction with mobile devices and services, Copenhagen, Denmark, 24–27 August, p. 6–15. ACM, New York, USA; 2015.
Pejovic Vea. Interruptme: designing intelligent prompting mechanisms for pervasive applications. In: UbiComp, ACM; 2014. p. 897–908.
Spira JB, Feintuch JB. The cost of not paying attention: how interruptions impact knowledge worker productivity. Report from Basex; 2005.
Bureau of labor statistics. http://www.bls.gov.
Kabir MAEA. User-centric social context information management: an ontology-based approach and platform. Pers Ubiquitous Comput. 2014;18(5):1061–83.
Article Google Scholar
Sahami Shirazi A, Henze N, Dingler T, Pielot M, Weber D, Schmidt A. Large-scale assessment of mobile notifications. In: Proceedings of the SIGCHI conference on human factors in computing systems, ACM; 2014. p. 3055–64.
Iqbal ST, Horvitz E. Notifications and awareness: a field study of alert usage and preferences. In: Proceedings of the 2010 ACM conference on computer supported cooperative work, ACM; 2010. p. 27–30.
Kanjo E, Kuss DJ, Ang CS. Notimind: utilizing responses to smart phone notifications as affective sensors. IEEE Access. 2017;5:22023–35.
Article Google Scholar
Turner LD, Allen SM, Whitaker RM. Push or delay? decomposing smartphone notification response behaviour. In: Human behavior understanding, Berlin: Springer; 2015. p. 69–83.
Lu J, Wu D, Mao M, Wang W, Zhang G. Recommender system application developments: a survey. Decis Support Syst. 2015;74:12–32.
Article Google Scholar
Bobadilla J, Ortega F, Hernando A, Gutiérrez A. Recommender systems survey. Knowl Based Syst. 2013;46:109–32.
Article Google Scholar
Kim K-j, Ahn H, Jeong S. Context-aware recommender systems using data mining techniques. In: Proceedings of world academy of science, engineering and technology, vol. 64, 2010. p. 357–62.
Liu Q, Ge Y, Li Z, Chen E, Xiong H. Personalized travel package recommendation. In: Data mining (ICDM), 2011 IEEE 11th international conference On, IEEE; 2011. p. 407–16.
Ge Y, Liu Q. Xiong H, Tuzhilin A, Chen J. Cost-aware travel tour recommendation. In: Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining, ACM; 2011. p. 983–91.
Park M-H, Hong J-H, Cho S-B. Location-based recommendation system using bayesian user’s preference model in mobile devices. In: International conference on ubiquitous intelligence and computing, Springer; 2007. p. 1130–9.
Zheng VW, Cao B, Zheng Y, Xie X, Yang Q. Collaborative filtering meets mobile recommendation: a user-centered approach. In: AAAI, vol. 10, 2010. p. 236–41.
World tourism organization, unwto. http://www2.unwto.org/.
Abbaspour R, Samadzadegan F. Building a context-aware mobile tourist guide system base on a service oriented architecture. Int Arch Photogramm Remote Sens Spatial Inform Sci. 2008;37:871–4.
Google Scholar
Pashtan A, Blattler R, Andi AH, Scheuermann P. Catis: a context-aware tourist information system, 2003.
Williams P, Soares C, Gilbert JE. A clustering rule-based approach to predictive modeling. In: Proceedings of the 48th annual Southeast regional conference, ACM; 2010. p. 45.
Plessas A, Stefanis V, Komninos A, Garofalakis J. Field evaluation of context aware adaptive interfaces for efficient mobile contact retrieval. Pervasive Mob Comput. 2017;35:51–64.
Article Google Scholar
Phithakkitnukoon S, Dantu R. Towards ubiquitous computing with call prediction. ACM SIGMOB Mob Comput Commun Rev. 2011;15(1):52–64.
Article Google Scholar
Baeza-Yates R, Jiang D, Silvestri F, Harrison B. Predicting the next app that you are going to use. In: Proceedings of the 8th ACM international conference on web search and data mining, ACM; 2015. p. 285–94.
Zhu H, Cao H, Chen E, Xiong H, Tian J. Exploiting enriched contextual information for mobile app classification. In: Proceedings of the 21st ACM international conference on information and knowledge management, ACM; 2012. p. 1617–21.

Download references

Acknowledgements

The author would like to thank all the reviewers for their rigorous reviews and comments, which were detailed and helpful in finalizing the manuscript. The author also thanks the administration of Swinburne University of Technology, Melbourne, Australia, for providing the facilities required for the research and survey work in their post-graduate research lab. Finally, the author thanks Prof. Jun Han and Dr. Alan Colman, Swinburne University of Technology, Australia, for their helpful guidance.

Funding

Not applicable.

Author information

Authors and Affiliations

Swinburne University of Technology, Melbourne, VIC-3122, Australia
Iqbal H. Sarker
Chittagong University of Engineering and Technology, Chittagong, Bangladesh
Iqbal H. Sarker

Contributions

This article reviews previous work in the area of smartphone data analytics and presents a discussion of open challenges and future directions for context-aware rule learning from smartphone data. The first and corresponding author, IHS, carried out the conception, survey, analysis, and design, and prepared the manuscript. The author read and approved the final manuscript.

Corresponding author

Correspondence to Iqbal H. Sarker.

Ethics declarations

Competing interests

The author declares that he has no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Sarker, I.H. Context-aware rule learning from smartphone data: survey, challenges and future directions. J Big Data 6, 95 (2019). https://doi.org/10.1186/s40537-019-0258-4

Received: 12 July 2019
Accepted: 10 October 2019
Published: 31 October 2019
DOI: https://doi.org/10.1186/s40537-019-0258-4
