Measuring regularity of human physical activities with entropy models

Regularity is an important aspect of physical activity that can provide valuable insights into how

question about human behavior understanding. Our goal here is to measure the regularity of human physical activities. The accurate measurement of regularity can facilitate advancements in human activity modeling and prediction [3], and further enable the implementation of tailored interventions aimed at improving health outcomes.
Conventional methods [4,5] measure the regularity of human life by designing diary-like instruments to record the occurrence of particular events. Such data collection requires the subjects to record their activities of daily living manually. It is therefore expensive and subjective, and constrained to small sample sizes and short time spans. The proliferation of smartphones and wearable devices provides new opportunities to collect sensor data about human physical activities on a large scale and over a long period of time [6]. In particular, wearable devices that are worn all the time provide accurate and continuous monitoring of physical activity in a free-living environment [7]. Multiple kinds of sensor data, including step counts, calorie expenditures, exercise intensity, heartbeats, and so on, are collected at a fine granularity and form the basis for subsequent analysis. Such longitudinal sensor data can reflect all of an individual's physical activities more accurately and informatively [8]. This makes it possible to measure the different aspects of human physical activities in detail and reliably.
Regularity refers to the extent to which individual activities repeat over time in fixed patterns [3]. It depends not only on the variability in the characteristics of a single type of activity but also on the cycles or patterns existing in activity sequences. Existing metrics that use longitudinal sensor data to measure regularity are based on either periodicity or stability. However, both periodicity and stability are special types of regularity, so they cannot provide a comprehensive measurement of regularity for human physical activity. In information theory, entropy quantifies the uncertainty of a random variable and also measures the degree of randomness in a system [9]. Several entropy models can be used to determine the regularity of serial data based on the presence of patterns. The entropy rate measures how the entropy of a sequence changes over time, allowing the randomness of the sequence to be quantified [10]. In practice, the entropy rate has been used extensively to assess the regularity of categorical sequences, including human mobility [3,11], human economic behavior [12], human online life [13], web browsing behavior [14] and patient health records [15]. Additionally, Pincus et al. [16] devised Approximate Entropy (ApEn) as a measurement of regularity that can be applied to short and noisy time series of continuous values. Richman et al. [17] developed Sample Entropy (SampEn), which addresses the issues of bias and lack of relative consistency in approximate entropy, to measure the regularity of clinical and experimental time series. While approximate entropy and sample entropy were initially developed for physiological applications, both have been used in other fields such as medicine [18][19][20], economics [21,22], climatology [23][24][25], gait analysis [26,27], and battery health prognostics [28,29]. In a nutshell, entropy rate is inversely proportional to the sum of the lengths of new subsequences at each position in the whole sequence. Approximate entropy and sample entropy rely on the conditional probability that close patterns of subsequences with length m remain close on the next incremental comparison. All of them take into account the evolution of ordered subsequences, which, in fine-grained data, can be regarded as patterns. As a result, entropy models can potentially accommodate more, and previously unknown, patterns, and thus can provide a more comprehensive measurement of regularity than metrics based only on periodicity or stability. Despite this, there is, to date, very little research on the application of these entropy models to human physical activity data.
In this paper, we aim to investigate the applicability of entropy models to quantify the regularity of human physical activities from longitudinal sensor data. First, the entropy rate calculation procedure is modified to allow it to be applied to physical activity data with continuous values. Then, we compare the applicability of entropy rate, approximate entropy, and sample entropy as measurements of the regularity of physical activities. By simulating real-life activity patterns with synthesized data, we validate the performance of entropy models under different scenarios. Results indicate that entropy rate is a more suitable alternative to approximate entropy and sample entropy. Entropy rate can identify not only the magnitude and amount of randomness, but also macroscopic variations, such as differences in duration and occurrence time, which cannot be recognized by approximate entropy and sample entropy. We then further evaluate the performance of the three entropy models by correlating their respective entropy values with prediction errors obtained from multiple forecast models on real-world physical activity data samples. We show that entropy rate again outperforms the other two entropy models, with a correlation coefficient as high as 0.895. We then conclude that entropy rate is a reliable measurement of human physical activity regularity.
Utilizing entropy rate as the measurement, we investigate the interpersonal and intrapersonal variation of regularity for 686 individuals. We find that regularity varies considerably across individuals and that the difference in activity composition can explain a large part of the variation. Meanwhile, the majority of individuals maintain stable physical activity habits and their regularity does not change significantly over time.
The contributions of this study are summarized as follows:
1. We modify the calculation procedure of entropy rate so that it can be applied to physical activity data.
2. We propose a framework to validate the performance of entropy models on both synthesized and real-world physical activity data. Experiment results demonstrate that entropy rate is more suitable than approximate entropy and sample entropy for measuring the regularity of human physical activities.
3. Our analysis of human physical activity regularity using entropy rate reveals that variations in regularity among individuals are primarily associated with the composition of activities. In addition, the regularity of most individuals remains stable over time.
The rest of the paper is organized as follows. We first present the related work on human behavior regularity in Sect. "Related work". Then we describe the details of entropy models in Sect. "Measurement of regularity". Section "Validation of entropy models" presents the performance of the three entropy models on physical activity data. More results about human physical activity regularity are shown in Sect. "Human physical activity regularity", followed by a "Discussions" section, in which we discuss the findings as well as the limitations of our study. Finally, in Sect. "Conclusions", we conclude our work and discuss future research directions.

Related work
The concept of regularity is recognized as a fundamental aspect in the field of human behavior understanding. Prior to the widespread adoption of wearable technology, a significant body of literature utilized survey data to assess the regularity of human daily activities. The Social Rhythm Metric (SRM) [4,5] is a widely used metric that quantifies the regularity of an individual's daily activities with respect to their timing. The SRM is calculated by first having subjects record the occurrence of various event categories, and then determining the habit time of each event through an outlier elimination algorithm.
The average count of events that occur within the habit time for each event category is then used to reflect the subject's level of daily lifestyle regularity. Despite its widespread usage in studies exploring the relationship between health outcomes and lifestyle regularity [30][31][32], the SRM ignores the interconnections among successive events, leading to an incomplete representation of regularity.
In recent years, mobile sensing has shown increasing potential for tracking human activities of daily living. The widespread use of smartphones and wearable devices has enabled the collection of rich, longitudinal sensor data that characterize an individual's activities, sleep patterns, and application usage. These data provide a foundation for studying the regularity of human daily activities and have resulted in the proposal of several metrics. These metrics can be broadly categorized into two groups: those that focus on quantifying the periodicity in the sensor data and those that employ stability to capture the essence of regularity.
Since human behaviors are driven by an internal biological clock that regulates the sleep-wake cycle and repeats roughly every 24 h, the 24-h periodicity is usually regarded as a measurement of human life regularity. Saeb et al. [33] and Wang et al. [6] proposed circadian rhythm to measure the strength with which an individual follows a 24-h rhythm in behaviors. Sensor data was converted to the frequency domain, and the circadian rhythm was determined by the energy that falls within the 24 ± 0.5 h band. Phillips et al. [34] developed the sleep regularity index (SRI), which calculates the percentage probability of an individual being in the same state (asleep vs. awake) at any two time points 24 h apart, as the measurement of sleep regularity. A similar metric can be extended to sensor data other than sleep. In references [6,35], the regularity index (RI) and flexible regularity index (FRI) were proposed to assess the difference between the same time points across different days. The RI calculates the product of rescaled values of the same time points from different days to evaluate the difference, while the FRI uses edit distance. It should be noted that these metrics only focus on the similarity of data points at 24-h intervals and do not take into account the orderliness of successive data points.
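The same-state probability behind the SRI can be illustrated with a few lines of Python. This is a minimal sketch, not the published index: it reports the plain percentage of sample pairs one period apart that share a state, and the function name `same_state_agreement` and the hourly toy record are assumptions made for illustration.

```python
import numpy as np

def same_state_agreement(states, period):
    """Percentage of sample pairs `period` steps apart that share the same state."""
    states = np.asarray(states)
    if len(states) <= period:
        raise ValueError("series must be longer than one period")
    return 100.0 * float(np.mean(states[:-period] == states[period:]))

# Hypothetical hourly asleep(1)/awake(0) record: 8 h asleep, 16 h awake, 7 days.
day = np.array([1] * 8 + [0] * 16)
regular = np.tile(day, 7)

# Same record, but on the second day sleep starts 3 h later.
shifted = regular.copy()
shifted[24:48] = np.roll(day, 3)

print(same_state_agreement(regular, 24))
print(same_state_agreement(shifted, 24))
```

Because only points exactly one period apart are compared, the score says nothing about the order of successive samples within a day, which is the limitation noted above.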
Other metrics in the literature treat stability as a proxy for regularity. Standard deviations are commonly used to measure the variance of daily activities. Marschollek et al. [36] employed the standard deviation of all time differences between physical activity event starts as a measure of regularity. Fischer et al. [37] applied the standard deviation to features of daily sleep (e.g., sleep onset, sleep offset, midsleep, duration) to quantify sleep regularity. Wang et al. [6] directly calculated the standard deviation of physical activity data as a metric for human physical activity variability. Wil et al. [38] introduced the concepts of inter-daily stability and intra-daily variability to assess rest-activity rhythms. Inter-daily stability evaluates the consistency of daily activity patterns with respect to the average pattern across days, and it reflects the stability of rest-activity rhythms over multiple days. Intra-daily variability is calculated as the ratio of the mean squared first derivative of the data to the population variance of the data, and it detects fragmentation of rest-activity rhythms. In addition to rest-activity rhythms, these measures can also be applied to sleep regularity [37]. However, stability is only one aspect of regularity, as periodic signals with large variations can still be considered regular. Therefore, the use of stability alone may not be sufficient to represent regularity.
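The two rest-activity measures can be sketched as follows. The intra-daily variability follows the ratio described above; the inter-daily stability uses one common formulation (variance of the average daily profile divided by the total variance), which is an assumption here rather than a formula given in this text. Function names are illustrative.

```python
import numpy as np

def interdaily_stability(x, period):
    """Variance of the average daily profile divided by the total variance."""
    x = np.asarray(x, dtype=float)
    n = len(x) - len(x) % period              # keep whole days only
    x = x[:n]
    profile = x.reshape(-1, period).mean(axis=0)
    return float(profile.var() / x.var())     # undefined for a constant series

def intradaily_variability(x):
    """Mean squared first difference divided by the population variance."""
    x = np.asarray(x, dtype=float)
    return float(np.mean(np.diff(x) ** 2) / x.var())

# A perfectly repeating hourly pattern is maximally stable.
periodic = np.tile(np.arange(24.0), 7)
print(interdaily_stability(periodic, 24))

# A series that flips every sample is highly fragmented.
alternating = np.tile([0.0, 1.0], 50)
print(intradaily_variability(alternating))
```

Both quantities are built from variances only, so a signal can score as unstable here while still following a fixed pattern.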
It is evident that none of the metrics previously discussed are capable of assessing the regularity of human daily activities comprehensively. These metrics are limited in their scope, as they make prior assumptions about the definition of regularity. A comprehensive regularity measurement should consider all possible patterns present in the sensor data. In the following section, we introduce entropy-based metrics that circumvent the limitations of the previously mentioned metrics, offering a more inclusive measurement of regularity.

Measurement of regularity
The antithesis of regularity is randomness. The analysis of the randomness of a series has its roots in information theory and the concept of entropy. Entropy quantifies the amount of information of random variables based on their probability distributions. It can also measure the degree of randomness in a system [10]. However, Shannon entropy is unable to capture patterns present in sequential data, as it disregards the temporal correlation among elements in a sequence. The real entropy of a sequence depends not only on the frequency of elements in the sequence, but also on the order in which the elements are combined. Entropy rate, approximate entropy, and sample entropy are three kinds of entropy that consider the ordered subsequences existing in the sequence.

Entropy rate
Mathematically, a series from the real world can be modeled as a stochastic process $X$, which is an indexed sequence of random variables $[X_1, X_2, \ldots, X_n]$, and there can be an arbitrary dependence among the random variables. The joint entropy of the collection of random variables is

$$H(X_1, X_2, \ldots, X_n) = -\sum_{x_1, \ldots, x_n} p(x_1, \ldots, x_n) \log_2 p(x_1, \ldots, x_n) \tag{1}$$

The entropy rate of a stochastic process is the asymptotic rate at which the entropy of a sequence grows with increasing $n$. The entropy rate $H(X)$ is defined as follows.

$$H(X) = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \ldots, X_n) \tag{2}$$
As shown in Eq. (2), the entropy rate is the average entropy per random variable, that is, the average information gained as the sequence grows. Reference [9] proves that the limit in Eq. (2) exists for all stationary random processes and is equal to

$$H(X) = \lim_{n \to \infty} H(X_n \mid X_1, \ldots, X_{n-1}) \tag{3}$$

where $H(X_n \mid X_1, \ldots, X_{n-1})$ is the conditional entropy of the last variable given the previous $n-1$ values. Equation (3) indicates that the entropy rate accounts for the dependencies among random variables. The stronger the dependencies among variables in a stochastic process, the more information the previous variables provide about the next one, and therefore the lower the entropy rate of the process. In contrast, if all variables of the process are independent and identically distributed, the entropy rate of the process is exactly equal to the Shannon entropy of the process, which is the upper bound for the entropy rate.
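The effect of dependence on the entropy rate can be checked numerically on a stationary first-order Markov chain, for which the conditional entropy of the next symbol given the whole past reduces to the conditional entropy given the current state. The sketch below is an illustration only; the two transition matrices and the function names are assumptions chosen for the example.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def stationary_distribution(P):
    """Left eigenvector of the transition matrix P for eigenvalue 1."""
    vals, vecs = np.linalg.eig(np.asarray(P, dtype=float).T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return pi / pi.sum()

def markov_entropy_rate(P):
    """Entropy rate (bits per symbol): sum_i pi_i * H(P[i, :])."""
    P = np.asarray(P, dtype=float)
    pi = stationary_distribution(P)
    return float(sum(pi[i] * shannon_entropy(P[i]) for i in range(len(pi))))

sticky = [[0.9, 0.1], [0.1, 0.9]]   # next state strongly depends on the current one
iid = [[0.5, 0.5], [0.5, 0.5]]      # next state ignores the current one

for name, P in (("sticky", sticky), ("iid", iid)):
    marginal = shannon_entropy(stationary_distribution(P))
    print(name, "marginal entropy:", marginal, "entropy rate:", markov_entropy_rate(P))
```

Both chains have the same marginal distribution, yet only the independent one reaches the Shannon entropy of that marginal; the dependent chain has a strictly lower entropy rate.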
The estimation of entropy rate can be challenging since it is difficult to know the joint probability distribution of finite sequences in real-world data. Here, we introduce an estimation algorithm based on Lempel-Ziv data compression [39], which is known to rapidly converge to the real entropy rate of a time series. For a time series with length $n$, the entropy rate is estimated by

$$H_{est} = \left( \frac{1}{n} \sum_{i=1}^{n} \Lambda_i \right)^{-1} \log_2 n \tag{4}$$

where $\Lambda_i$ is the length of the shortest substring starting at position $i$ which does not previously appear from position 1 to $i-1$. We illustrate the estimation procedure for the discrete sequence $X = (a, b, a, b, c)$ in Table 1. The notation $X[1:i]$ denotes the historical subsequence before position $i$. $S_i$ is the shortest subsequence that never appeared in $X[1:i]$, and $\Lambda_i$ is the length of $S_i$. For $i = 1, 2, 5$, $\Lambda_i = 1$, since the symbols in these positions are new symbols. For $i = 3$, the shortest new subsequence is $abc$, because the historical subsequence $ab$ appears again at position 3, so $\Lambda_3 = 3$. A similar situation occurs at position 4, where the shortest new subsequence is $bc$ and $\Lambda_4 = 2$. The example shows that if some fixed patterns appear in the sequence repeatedly, their $\Lambda_i$ will be larger and the entropy rate will be smaller. In the extreme case of a sequence whose symbols are all unique, $\Lambda_i = 1$ for all symbols and $\sum_{i=1}^{n} \Lambda_i = n$. In this case, the entropy rate is maximal and equal to the Shannon entropy of the sequence, which is $\log_2(n)$. It has also been proven that $H_{est}$ converges to the actual entropy rate when $n$ approaches infinity [40].
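The estimator can be sketched directly from its definition: for every position, grow a candidate substring until it no longer occurs in the history, record its length, and combine the lengths with the sequence length. The code below is a straightforward, unoptimized illustration. One convention is assumed rather than taken from the text: when a substring runs off the end of the sequence while still matching, its length is counted as the match length plus one.

```python
from math import log2

def contains(history, pattern):
    """True if `pattern` occurs as a contiguous run inside `history`."""
    k = len(pattern)
    return any(history[j:j + k] == pattern for j in range(len(history) - k + 1))

def shortest_new_lengths(seq):
    """Length of the shortest substring starting at each position that is absent from the history."""
    seq = list(seq)
    n = len(seq)
    lengths = []
    for i in range(n):
        history = seq[:i]
        k = 1
        # Grow the candidate while it still occurs in the history and fits in the sequence.
        while i + k <= n and contains(history, seq[i:i + k]):
            k += 1
        lengths.append(k)
    return lengths

def entropy_rate_lz(seq):
    """Lempel-Ziv style estimate: n * log2(n) / sum of the shortest new lengths."""
    lengths = shortest_new_lengths(seq)
    n = len(lengths)
    if n == 0:
        raise ValueError("empty sequence")
    return n * log2(n) / sum(lengths)

print(shortest_new_lengths("ababc"))   # lengths for the example sequence (a, b, a, b, c)
print(entropy_rate_lz("ababc"))
print(entropy_rate_lz("abcdefgh"))     # all symbols unique
```

For the example sequence the lengths are 1, 1, 3, 2, 1, so the estimate is 5 log2(5) / 8, while the all-unique sequence of length 8 gives log2(8) = 3 bits.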

Approximate entropy
Approximate entropy is a statistic quantifying the regularity and complexity of short and noisy time series data [41]. It originated from the analysis of complexity in dynamic systems and is seen as the information-theoretic rate of entropy for approximating Markov chains [42]. Approximate entropy measures the logarithmic probability that nearby pattern runs remain close in the next incremental comparison. The calculation of approximate entropy requires two parameters, which are m, the length of the template, and r, a noise filter. Statistically, it would be the equivalent of dividing the space of states into cells of width r, to estimate the conditional probabilities of the m-th order.
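Given a sequence of data $X = [x_1, x_2, \ldots, x_n]$, we form the subsequences $s_m(i) = [x_i, x_{i+1}, \ldots, x_{i+m-1}]$ for $1 \le i \le n-m+1$. The distance between $s_m(i)$ and $s_m(j)$ is defined as the maximum difference in their respective scalar components, which is

$$d[s_m(i), s_m(j)] = \max_{k=0,\ldots,m-1} |x_{i+k} - x_{j+k}| \tag{5}$$

We define a quantity named the correlation integral, which is the fraction of subsequences similar to $s_m(i)$:

$$C_i^m(r) = \frac{1}{n-m+1} \sum_{j=1}^{n-m+1} \mathbb{1}\{ d[s_m(i), s_m(j)] \le r \} \tag{6}$$

Then we compute

$$\Phi^m(r) = \frac{1}{n-m+1} \sum_{i=1}^{n-m+1} \ln C_i^m(r) \tag{7}$$

Finally, the approximate entropy of the sequence is

$$\mathrm{ApEn}(m, r) = \Phi^m(r) - \Phi^{m+1}(r) \tag{8}$$

From Eq. (8), we can see that approximate entropy decreases as the conditional probability that similar subsequences of length $m$ stay consistent at the next position increases. A greater likelihood of remaining close, implying regularity, produces smaller ApEn values, and conversely.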
Pincus showed that approximate entropy converges to the entropy rate for independent and identically distributed series and finite Markov chains [16]. However, this does not hold in more general cases, since approximate entropy is designed as a relative measurement used to compare the regularity of different time series. Approximate entropy can vary significantly with the choice of $m$ and $r$, but its relative ordering is enough to discriminate different systems. Pincus [43] pointed out that, in general, given two data series $X_1$ and $X_2$, when $\mathrm{ApEn}(m_1, r_1)(X_1) < \mathrm{ApEn}(m_1, r_1)(X_2)$ then $\mathrm{ApEn}(m_2, r_2)(X_1) < \mathrm{ApEn}(m_2, r_2)(X_2)$. Additionally, the selection of appropriate values for $m$ and $r$ is crucial in ensuring accurate estimation of the conditional probability from data series of length $n$. It is recommended that $m$ has a relatively low value, e.g., 2 or 3, since a reasonable estimation of the conditional probability preferably needs $30^m$ points. The value of $r$ could be proportional to the standard deviation of the series.
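A compact NumPy sketch of approximate entropy is given below for illustration. It builds all pairwise maximum-component distances at once, so it is only suitable for short series. The default `r = 0.2 * std` is a common rule of thumb assumed here for the example, not a value prescribed by this text, and the function names are illustrative.

```python
import numpy as np

def _phi(x, m, r):
    """Mean log of the fraction of length-m subsequences within distance r (self-match included)."""
    n = len(x)
    templates = np.array([x[i:i + m] for i in range(n - m + 1)])
    # Maximum-component (Chebyshev) distance between every pair of subsequences.
    dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
    c = np.mean(dist <= r, axis=1)
    return float(np.mean(np.log(c)))

def approximate_entropy(x, m=2, r=None):
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()   # assumed rule of thumb: r proportional to the standard deviation
    return _phi(x, m, r) - _phi(x, m + 1, r)

rng = np.random.default_rng(0)
periodic = np.tile([0.0, 1.0], 50)   # strictly alternating series
noise = rng.normal(size=200)         # independent Gaussian noise

print(approximate_entropy(periodic))
print(approximate_entropy(noise))
```

The alternating series yields a value close to zero, while the noise series yields a clearly larger one, which is the relative ordering the statistic is meant to provide.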

Sample entropy
Approximate entropy is a biased statistic. The bias arises from the calculation of the correlation integral $C_i^m(r)$, which allows each subsequence to count itself to ensure the logarithms remain finite. As a consequence, the conditional probability is overestimated. Let $B_i$ be the number of subsequences with length $m$ that are similar to subsequence $s_m(i)$, and $A_i$ the number of subsequences with length $m+1$ that are similar to subsequence $s_{m+1}(i)$. Approximate entropy calculates $(A_i + 1)/(B_i + 1)$ as the conditional probability, which is greater than the real one, $A_i/B_i$. This bias is more pronounced for series with a small number of points $n$.
Richman et al. [17] defined sample entropy, a statistic which does not have self-counting and eliminates the bias of approximate entropy. The calculation procedure of sample entropy is also simpler than that of approximate entropy. We define:

$$A = \sum_{i=1}^{n-m} \sum_{\substack{j=1 \\ j \ne i}}^{n-m} \mathbb{1}\{ d[s_{m+1}(i), s_{m+1}(j)] \le r \}, \qquad B = \sum_{i=1}^{n-m} \sum_{\substack{j=1 \\ j \ne i}}^{n-m} \mathbb{1}\{ d[s_m(i), s_m(j)] \le r \}$$

where $A$ is the number of pairs of subsequences that are similar with length $m+1$, and $B$ is the number of pairs of subsequences that are similar with length $m$. By constraining $j \ne i$, self-counting is avoided. Then sample entropy is calculated as:

$$\mathrm{SampEn}(m, r) = -\ln \frac{A}{B}$$

Since $A$ is always less than or equal to $B$, the ratio $A/B$ is an unbiased conditional probability less than or equal to unity. In addition to self-counting, another difference between SampEn and ApEn is the position of the logarithm. The sum over all subsequences is inside the logarithm in SampEn and outside it in ApEn. This operation reduces the probability of undefined logarithms when self-counting is not allowed. Sample entropy demonstrates improved relative consistency in comparison to approximate entropy and provides a more effective means of quantifying regularity in a system [17].
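Sample entropy can be sketched in the same style. The version below follows the usual convention of using the first n - m starting positions for both subsequence lengths, excludes self-matches, and returns infinity when either count is zero; that last convention, the default `r = 0.2 * std`, and the function names are assumptions made for this illustration.

```python
import numpy as np

def _count_similar_pairs(x, length, r, n_templates):
    """Number of ordered pairs (i, j), j != i, of subsequences within distance r."""
    templates = np.array([x[i:i + length] for i in range(n_templates)])
    dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
    # The diagonal always matches, so subtract it to exclude self-counting.
    return int(np.sum(dist <= r)) - n_templates

def sample_entropy(x, m=2, r=None):
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()                # assumed rule of thumb
    n_templates = len(x) - m             # same number of subsequences for both lengths
    b = _count_similar_pairs(x, m, r, n_templates)       # matches of length m
    a = _count_similar_pairs(x, m + 1, r, n_templates)   # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")              # undefined logarithm; convention assumed here
    return float(-np.log(a / b))

rng = np.random.default_rng(0)
periodic = np.tile([0.0, 1.0], 50)
noise = rng.normal(size=300)

print(sample_entropy(periodic))
print(sample_entropy(noise))
```

For the alternating series every match of length m extends to length m + 1, so the ratio is one and the sample entropy is zero; for noise most matches fail to extend and the value is much larger.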
In practice, entropy rate is commonly used to measure regularity of categorical time series, such as human mobility and human online life. Approximate entropy and sample entropy are used extensively in physiological and medical applications. However, there has been a lack of studies that apply these entropy-based metrics to assess the regularity of human physical activities. As a result, the suitability of these entropy models in measuring the regularity of human activities remains to be verified.

Adapting entropy rate to samples with continuous values
To apply entropy rate to longitudinal sensor data that can reflect an individual's physical activities, we modify the estimation procedure of entropy rate. As outlined in Eq. (4), calculating $\Lambda_i$ is essential for estimating entropy rate. In its original definition, $\Lambda_i$ represents the length of the shortest subsequence which starts from position $i$ and has never appeared previously. While this definition is appropriate for categorical time series, it is too rigid for time series with continuous values, such as step counts or calorie expenditures, where small numerical differences can still be considered equivalent states. Therefore, we generalize the definition of $\Lambda_i$ by replacing existence with similarity. Based on the distance function $d[s_m(i), s_m(j)]$ in Eq. (5), we consider two subsequences similar if $d[s_m(i), s_m(j)] \le r$. $\Lambda_i$ is redefined as the length of the shortest subsequence which starts from position $i$ and for which no similar subsequence has appeared previously. This modification not only enables the entropy rate to be applied to continuous series, but also makes the comparison among entropy rate, approximate entropy, and sample entropy fairer, because all three share the same parameter $r$. For simplicity, we use the entropy rate to denote the modified entropy rate in subsequent sections.
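The similarity-based variant can be sketched by replacing exact matching with a tolerance check on every component. The code below is an illustrative, unoptimized version: it keeps the set of historical start positions that still match within r and extends the candidate one element at a time. As an assumed convention, a candidate that reaches the end of the series while still matching is given the match length plus one. Function names are hypothetical.

```python
import numpy as np

def shortest_new_lengths_continuous(x, r):
    """Shortest length at each position for which no similar subsequence lies in the history."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    lengths = np.empty(n, dtype=int)
    for i in range(n):
        starts = np.arange(i)          # candidate start positions in the history x[:i]
        k = 0                          # number of components matched so far
        while starts.size and i + k < n:
            starts = starts[starts + k < i]                          # window must stay inside the history
            starts = starts[np.abs(x[starts + k] - x[i + k]) <= r]   # component k must be within r
            if starts.size == 0:
                break
            k += 1
        lengths[i] = k + 1
    return lengths

def entropy_rate_continuous(x, r):
    lengths = shortest_new_lengths_continuous(x, r)
    n = len(lengths)
    return float(n * np.log2(n) / lengths.sum())

rng = np.random.default_rng(0)
# Repeating pattern with small measurement jitter of at most 2 units.
series = np.tile([0.0, 100.0, 100.0, 0.0], 30) + rng.uniform(-2, 2, size=120)

print(entropy_rate_continuous(series, r=0))   # exact matching: jittered values look all new
print(entropy_rate_continuous(series, r=5))   # tolerance absorbs the jitter
```

With r = 0 and symbols encoded as numbers, the function reproduces the categorical lengths 1, 1, 3, 2, 1 for the sequence (a, b, a, b, c), so the original definition is recovered as a special case.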

Validation of entropy models
In this part, we validate the applicability of entropy rate, approximate entropy, and sample entropy for measuring the regularity of human physical activities using longitudinal sensor data.

Synthesized physical activity data
To validate the applicability of entropy rate, approximate entropy, and sample entropy, we need to know the real regularity of longitudinal sensor data, or at least the relative regularity among these data, as the ground truth. However, the diversity of individual lifestyles results in a mixture of varying kinds of regularity and a multitude of random noises in real-world sensor samples. This makes it challenging to manually distinguish which samples exhibit a higher degree of regularity. Despite the complexity of real-world data, it is possible to construct synthesized physical activity data with controlled randomness to obtain a relative regularity. Moreover, the synthesized data should be a simulation of activity patterns that occur in real life. An analysis of a real-world dataset is depicted in Fig. 1, which shows the average step counts per minute within a week across hundreds of users. Similar shapes from Monday to Friday indicate distinct circadian rhythms in human physical activities. Three prominent peaks can be identified in the morning, noon, and evening of weekdays, which indicates a collective exercise preference. Based on these observations, a basic activity pattern was constructed for the synthesized data, that is, exercise at the preferred times every day. The inherent randomness in human life, however, affects the occurrence time, duration, and intensity of exercise from day to day. Additionally, trivial activities in daily life, such as housework, may also occur randomly. By controlling the degree of randomness in these elements, synthesized physical activity data can be generated with known relative regularity.
Fig. 1 Average step counts per minute within a week across users from a real-world dataset. On the one hand, the similar shapes within weekdays and within weekends suggest a clear circadian rhythm in human activity. On the other hand, the patterns on weekdays and weekends are markedly different.

More specifically, we model synthesized physical activity data as the superposition of two types of physical activities, which are exercise and trivial activity. Exercise usually lasts for a long period of time with a steady intensity, and is accompanied by some preference in terms of timing. The trivial activity, by contrast, only lasts a few minutes, fluctuates in intensity, and occurs randomly throughout the day. Mathematically, synthesized data of one day can be expressed as

$$X_{\text{day}} = \sum_{i=1}^{M} E_i(t_i, d_i, int_i) + \sum_{j=1}^{N} TA_j$$

where $E_i$ denotes the $i$-th exercise of the day, and its parameters $t_i$, $d_i$, and $int_i$ represent the occurrence time, duration, and intensity of the exercise, respectively. $TA_j$ is the $j$-th trivial activity of the day. Trivial activities can be considered as noise in daily life and are assumed to be independently and identically distributed. A complete synthesized sample consists of data from several consecutive days. For simplicity, we assume that data from different days are independent.
Initially, we consider a completely regular scenario where exercises of constant intensity and equal duration are performed at the same time every day. Taking minute-level step count data as an example, three exercises that occur every day at 8:00, 12:00, and 20:00 with an intensity of 100 steps per minute and a duration of 60 min constitute a very regular sample. In this case, $M$ is set to 3, and the parameters of exercise are fixed on a daily basis. To simulate the variability of exercise, we add normally distributed disturbances to the occurrence time, duration, and intensity of exercise. These disturbances are denoted as $dis_t \sim N(0, \sigma_t)$, $dis_d \sim N(0, \sigma_d)$, and $dis_{int} \sim N(0, \sigma_{int})$, respectively. The standard deviation determines the degree of disturbance. A smaller standard deviation results in a more regular exercise pattern, while a larger standard deviation increases the level of disruption to the regularity. For trivial activities, the durations are generated from a geometric distribution with a mean of 3 min, and the activities are randomly placed throughout the day. The step counts per minute in trivial activities are sampled from a uniform distribution ranging from 20 to 150. Due to the random nature of trivial activity, an increase in the number of trivial activities leads to a decline in regularity. Therefore, the number of trivial activities, $N$, can also be used to control the degree of disturbance, like the standard deviations. In Fig. 2, we visualize four synthesized samples, each containing only one specific disturbance. The title of each subplot indicates the type and degree of the disturbance. The x-axis is the time of day, and each line in the subplot is the step count waveform of a day. The y-axis is the number of days.
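The generation procedure can be sketched as follows. This is an illustrative generator under stated assumptions, not the authors' code: disturbances are clipped to plus or minus one standard deviation so that durations and step counts stay non-negative, trivial activities draw one intensity per minute, and all function and parameter names are hypothetical.

```python
import numpy as np

MINUTES_PER_DAY = 1440

def _disturbance(rng, sigma):
    """Normal disturbance clipped to [-sigma, +sigma]; exactly zero when sigma is zero."""
    if sigma <= 0:
        return 0.0
    return float(np.clip(rng.normal(0.0, sigma), -sigma, sigma))

def synth_day(rng, sigma_t=0.0, sigma_d=0.0, sigma_int=0.0, n_trivial=0,
              starts=(480, 720, 1200), duration=60, intensity=100):
    """One day of minute-level step counts: exercises plus trivial activities."""
    day = np.zeros(MINUTES_PER_DAY)
    for start in starts:                      # exercises at 8:00, 12:00 and 20:00
        t = max(int(round(start + _disturbance(rng, sigma_t))), 0)
        d = max(int(round(duration + _disturbance(rng, sigma_d))), 0)
        level = max(intensity + _disturbance(rng, sigma_int), 0.0)
        day[t:t + d] += level                 # slice is truncated at midnight
    for _ in range(n_trivial):
        dur = min(int(rng.geometric(1 / 3)), MINUTES_PER_DAY)   # mean duration of 3 min
        t = int(rng.integers(0, MINUTES_PER_DAY - dur + 1))     # random placement in the day
        day[t:t + dur] += rng.uniform(20, 150, size=dur)        # fluctuating intensity
    return day

def synth_sample(n_days, seed=0, **kwargs):
    """Concatenate independently generated days into one sample."""
    rng = np.random.default_rng(seed)
    return np.concatenate([synth_day(rng, **kwargs) for _ in range(n_days)])

regular = synth_sample(14)                        # completely regular two-week sample
jittered = synth_sample(14, sigma_t=60)           # occurrence time disturbed
noisy = synth_sample(14, n_trivial=10)            # ten trivial activities per day
print(regular.sum(), jittered.shape, noisy.sum())
```

Each keyword controls exactly one kind of disturbance, so samples with a known relative regularity can be produced by varying a single argument while leaving the others at zero.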
Using different parameters, we construct 2-week synthesized samples under these disturbances. The standard deviation of occurrence time ranges from 0 to 180 min in 30-min increments and the standard deviation of duration ranges from 0 to 60 min in 10-min increments. The standard deviation for intensity, measured in step counts per minute, is varied from 0 to 100 in increments of 10. The disturbances from normal distributions are limited to ±σ to avoid overlapping exercises or meaningless values such as negative durations or step counts. The number of trivial activities per day ranges from 0 to 50. For each disturbance parameter, we construct 100 samples. Fig. 3 shows the entropy rate, approximate entropy, and sample entropy of synthesized samples with different parameters. The noise filter r equals 10 to distinguish the smallest disturbance and the length of template m is 2, as suggested in [43]. The solid lines represent trends of the average value, and the shaded parts are 95% confidence intervals.

Fig. 3 Entropy rate, approximate entropy, and sample entropy of synthesized samples under different degrees of disturbance. A good metric should monotonically increase with the degree of disturbance, and only the entropy rate satisfies this expectation.

From Fig. 3a and b, we can see that approximate entropy and sample entropy remain unchanged under different disturbances of the exercise's occurrence time and duration. Although the duration and occurrence time of exercise vary, the conditional probability that similar subsequences of length m stay consistent at the next position is almost invariable. This stability is a result of the reliance of approximate entropy and sample entropy on the evolution of small-scale subsequences. Variations in the duration and occurrence time of exercise can be considered as alterations in macro regularity, and they are difficult to distinguish at a small scale. In the analysis of approximate entropy and sample entropy, the length of template m determines the size of subsequences. Due to the small value of m, approximate entropy and sample entropy are insensitive to such macro irregularities. In contrast, the entropy rate increases with the degree of disturbance. The entropy rate increases greatly from zero disturbance to a small disturbance, and then its increment becomes smaller as the disturbance grows. We can understand this phenomenon from the perspective of lossless data compression. For the completely regular sample, all activities are exactly the same and can be compressed as one symbol. However, a small disturbance leads to minor variations among activities, so that each unique activity has to be compressed as a separate symbol. As the disturbance level increases, the number of identical activities decreases until all activities must be compressed as distinct symbols, leading to a gradual decrease in the growth rate of the entropy rate.

Fig. 3c shows the performance of the three entropy models on synthesized samples under different degrees of disturbance of the exercise's intensity. The disturbance can be regarded as noise of varying magnitudes applied to the original signal. It is observed that the approximate entropy increases as the magnitude of the noise increases. The larger the noise, the wider the range of values at each time point, and the smaller the probability that similar subsequences are still close at the next moment. The entropy rate also increases, and it levels off faster. This trend is similar to that in Fig. 3a and b for the same reason. We zoom in on the sample entropy curve in Fig. 3c, which first increases and then decreases as the disturbance increases. This is because sample entropy sums over the numerator and denominator of all subsequences separately when calculating the conditional probability. As shown in Fig. 3d, all three entropy models increase with the number of trivial activities, which means all of them are able to distinguish the amount of noise in physical activity data.
Summarizing the performance of the three kinds of entropy on synthesized physical activity data, we find that approximate entropy and sample entropy cannot distinguish macroscopic variations of physical activity, such as duration and occurrence time, whereas the entropy rate can. Additionally, approximate entropy and entropy rate are able to identify both the magnitude and the amount of noise in physical activity data, while sample entropy can only identify the amount of noise.

Real-world physical activity data
The synthesized physical activity data can be used to validate the applicability of entropy rate, approximate entropy, and sample entropy under specific situations. However, it is not sufficient to evaluate the applicability of these three kinds of entropy based only on the results of synthesized data, given that the real situation is more complex than synthesized data. A demonstration of how well these three types of entropy perform on real-world physical activity data is more convincing. Although it is hard to know the real regularity of real-world samples, predictability can serve as a proxy for regularity. Since good predictions rely on capturing the genuine patterns and relationships that exist in the historical data [44], regular data is more predictable.
Predictability is a measure of how well future values of a time series can be forecasted [45]. To evaluate the predictability of a time series of length n, a forecasting approach is employed starting at an initial position i. The i-th data point is predicted based on the previous data points, and this procedure is repeated until i = n. The predictability of the time series is determined by the average error of the predictions made at each position. It is important to note that the predictability of a time series depends not only on the degree of regularity present in the series, but also on the choice of forecasting model. To mitigate the effect of model choice on predictability, we employ four widely used time series forecasting models, including both classical statistical models and deep learning models, as follows.
• Exponential smoothing (ES). Forecasts produced using ES are weighted averages of past observations, with the weights decaying exponentially as the observations get older [44]. The ES method can capture the trend and seasonality of a time series by applying exponential smoothing recursively [46].
• Prophet. Prophet forecasts time series based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects [47].
• Recurrent neural networks (RNN). RNN is a class of neural networks in which the output from nodes can affect subsequent input, which makes it able to capture long-term temporal dependencies. Long Short-Term Memory (LSTM) [48] and Gated Recurrent Unit (GRU) [49] are two popular variants of RNN, which have been shown to achieve state-of-the-art results in applications with time series.
• N-BEATS. N-BEATS is a deep neural architecture based on backward and forward residual links and a very deep stack of fully-connected layers [50]. The residual connection enables the stacks to focus on predicting errors from the previous stack, which implements an automatic time series decomposition.
We first apply these forecasting models to synthesized physical activity data to assess the feasibility of using predictability as a proxy for regularity. To reduce computational overhead, the data is first reduced from minute-level to hour-level by summing the per-minute values within each hour. In this way, the total length of the 30-day synthesized samples is reduced to 720. We historically forecast the entire sample starting from the first 1/3 of the time steps, so that the forecast horizon is long enough and sufficient historical data is available for the initial forecast. The raw series of each user are first normalized before being input to the forecasting models. Then, the forecast values are denormalized and used to calculate prediction errors against the raw data. The mean absolute error (MAE) is used as the metric to evaluate the forecasting performance of the models. Since the impact of disturbance on intensity cancels out after the summation process, we validate the performance of forecasting models on hourly synthesized samples in the presence of the other three kinds of disturbance. The results, as shown in Fig. 4, demonstrate that the MAE of all models increases as the disturbance level increases. The trend of the prediction error is consistent with our expectation of regularity, which indicates that the prediction error can serve as a reliable proxy for regularity. At the same time, the ranking of the forecasting models varies under different disturbances. This indicates that the regularity captured by different forecasting models is different, and that it is necessary to employ multiple forecasting models to evaluate entropy models.
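The evaluation procedure described above can be sketched in a few lines. The following is a minimal illustration only, not the authors' implementation: it uses simple exponential smoothing with a hypothetical smoothing factor `alpha` as a stand-in for the four forecasting models, normalizes with a z-score (an assumption, since the normalization scheme is not specified here), and starts forecasting after the first third of the series. The function names `hourly_sums` and `walk_forward_mae` are introduced for illustration.

```python
from statistics import mean, pstdev


def hourly_sums(minute_values):
    """Sum consecutive blocks of 60 minute-level values into hourly values."""
    return [sum(minute_values[i:i + 60]) for i in range(0, len(minute_values), 60)]


def walk_forward_mae(series, alpha=0.3):
    """One-step-ahead forecasts from the first third onward, scored by MAE.

    Simple exponential smoothing is an illustrative stand-in for the models
    used in the paper; alpha is a hypothetical smoothing factor.
    """
    mu = mean(series)
    sigma = pstdev(series) or 1.0  # guard against a constant series
    z = [(v - mu) / sigma for v in series]  # z-score normalization (assumed)

    start = max(1, len(z) // 3)  # the first third serves as initial history
    level = z[0]
    for t in range(1, start):
        level = alpha * z[t] + (1 - alpha) * level

    errors = []
    for t in range(start, len(z)):
        forecast = level * sigma + mu  # denormalize before scoring
        errors.append(abs(forecast - series[t]))
        level = alpha * z[t] + (1 - alpha) * level  # update with the observed value
    return mean(errors)
```

With 30 days of minute-level data (43,200 values), `hourly_sums` yields the 720 hourly values mentioned above, and `walk_forward_mae` returns one error value per sample that can then be compared against entropy.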
We then present three real-world physical activity datasets. (1) SJTU dataset. This dataset contains minute-level step count data of 686 users over 30 consecutive days, collected through a smartphone application called SJTU Health with the consent of users. It includes 205 male users and 481 female users, and the median age of all users is 42. (2) Fitbit dataset. This dataset is generated from thirty eligible Fitbit users who consented to submit approximately one month of personal tracker data, including minute-level step counts, calorie expenditures, and exercise intensities. The values of intensity range from 0 to 3, representing sedentary, lightly active, moderately active, and very active, respectively. The intensity classification is determined by proprietary algorithms from Fitbit. (3) Lifesnaps Fitbit dataset. This dataset contains hourly calorie expenditures collected from 71 participants over 4 months, including 42 male users and 29 female users. We only use the data of the first month in order to be consistent in data length with the other datasets. In Table 2, we present the mean and standard deviation of all three datasets at different time periods (morning: 6:00-10:00, noon: 10:00-14:00, afternoon: 14:00-18:00, evening: 18:00-22:00, night: 22:00-6:00). Such statistical information shows the distribution of user activities and helps us understand the subsequent prediction results.
Fig. 4 Performance of forecast models on hourly synthesized physical activity data. MAE of all models increases with the increase of the disturbance level, and this indicates that the prediction error can serve as a reliable proxy for regularity
We historically forecast these real-world samples from different users after scaling them to hourly data, just as we forecast the synthesized samples. Table 3 presents the mean and standard deviation (Std) of the MAE of each forecasting model on the real-world datasets. Among all models, RNN exhibits the lowest average prediction error on all datasets. The second-best performance is achieved by N-BEATS. These deep learning-based models perform better than traditional models such as ES and Prophet.
Before calculating entropy rate, approximate entropy, and sample entropy for these real-world samples, the noise filter r needs to be assigned. The value of r should be chosen carefully, as a small r can result in a high sensitivity to noise, leading to the masking of real regularities by tiny fluctuations. On the other hand, if r is set too large, it will reduce the ability to distinguish noise and, in the extreme scenario where r approaches infinity, result in all entropy values being the same. Here, we did not make r proportional to the standard deviation of each series as suggested in [43], since this leads to different noise filters for different individuals, and the entropy values of different individuals would lose comparability. Therefore, we choose a unified noise filter r for all users, which is equal to the standard deviation of all data points. We also test different values of r in the subsequent analysis. For approximate entropy and sample entropy, the template length m is 2, as suggested in [43].
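To make the roles of the noise filter r and the template length m concrete, the following sketch implements the standard textbook definitions of approximate entropy and sample entropy. It is an illustration under assumptions, not the authors' code: templates are compared with the Chebyshev distance and a match is counted when the distance is at most r (some formulations use a strict inequality), and the helper name `_match_counts` is hypothetical.

```python
import math


def _match_counts(x, m, n_templates, r):
    """For each length-m template, count templates within Chebyshev distance r.

    Each template matches itself, so every count is at least 1.
    """
    templates = [x[i:i + m] for i in range(n_templates)]
    return [
        sum(1 for b in templates
            if max(abs(u - v) for u, v in zip(a, b)) <= r)
        for a in templates
    ]


def approximate_entropy(x, m=2, r=0.2):
    """ApEn = phi(m) - phi(m + 1); self-matches are included."""
    n = len(x)

    def phi(k):
        counts = _match_counts(x, k, n - k + 1, r)
        return sum(math.log(c / len(counts)) for c in counts) / len(counts)

    return phi(m) - phi(m + 1)


def sample_entropy(x, m=2, r=0.2):
    """SampEn = -ln(A / B); self-matches are excluded."""
    n_templates = len(x) - m
    # Subtract n_templates to remove the self-match of every template.
    b = sum(_match_counts(x, m, n_templates, r)) - n_templates
    a = sum(_match_counts(x, m + 1, n_templates, r)) - n_templates
    if a == 0 or b == 0:
        return math.inf  # undefined when no matches are found
    return -math.log(a / b)
```

Under the unified-filter choice described above, r would be computed once from the pooled data points of all users and passed unchanged to every call, so that entropy values remain comparable across individuals.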
After obtaining prediction errors and the three kinds of entropy for all real-world samples, we calculate the Spearman correlation coefficient between prediction error and entropy to determine the best regularity measurement. The Spearman correlation coefficient is a statistical measure of the strength of a monotonic relationship between paired data. Since regular data can be predicted better, samples with smaller prediction errors should have smaller entropy. Among the three kinds of entropy, the one with the highest Spearman correlation coefficient is considered the best regularity measurement. We also test some rhythm-based and stability-based metrics mentioned in related work; the detailed results are presented in the Supplementary Information (Additional file 1).
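As a brief illustration of this comparison step, the Spearman coefficient is the Pearson correlation computed on ranks, with tied values receiving their average rank. The sketch below is a dependency-free version written for illustration; the function names are hypothetical, and in practice a statistics library would normally be used instead.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def spearman(errors, entropies):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(errors), average_ranks(entropies))
```

Here `errors` would hold one MAE per user for a given forecasting model and `entropies` the corresponding entropy values. A coefficient close to 1 means that users who are harder to predict also have higher entropy.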
In Fig. 5, we display the scatter plots of prediction error versus the three kinds of entropy for the four forecasting models in the SJTU dataset. Each subplot is titled with the Spearman correlation coefficient between prediction error and entropy. The results show that for all forecasting models, the correlation between entropy rate and prediction error is stronger than that of approximate entropy and sample entropy. For all kinds of entropy, the correlation coefficient between entropy and the prediction error of the best-performing RNN model is greater than that of the other models. The correlation coefficient between entropy rate and the error of RNN is the highest, at 0.8951. Figure 6 presents the correlation coefficients between entropy and prediction error on the Fitbit dataset and the Lifesnaps Fitbit dataset. It is observed that the correlation coefficients between sample entropy and prediction error are notably lower than those for entropy rate and approximate entropy. The performance of approximate entropy is found to be comparable to that of entropy rate on Fitbit (calorie) and Fitbit (intensity). On Fitbit (step counts) and Lifesnaps Fitbit (calorie), the mean values of the correlation coefficients between prediction errors and entropy rates for different forecasting models are slightly higher than those of approximate entropy, and the majority of their correlation coefficients are above 0.9.
Fig. 5 Scatter plot of MAE and entropy values on SJTU dataset. Each subplot is titled with the Spearman correlation coefficient between prediction error and entropy values. For all forecasting models, the correlation between entropy rate and prediction error is stronger compared to that of approximate entropy and sample entropy
We also perform a sensitivity analysis to investigate the impact of different noise filters r on the performance of the three entropy models. Figure 7 presents the mean of the correlation coefficients between prediction errors from different models and entropy under varying r in the SJTU dataset. The results reveal that the average correlation coefficients of all three entropy types initially increase and then decrease as r increases. This phenomenon is consistent with our previous analysis that entropy values calculated with an r that is either too small or too large cannot reflect the real regularity. Additionally, entropy rate displays a higher correlation coefficient than approximate entropy and sample entropy across a wide range of parameters. The advantage of entropy rate is especially obvious when r is small. Moreover, as r increases, the correlation coefficient of entropy rate does not decline significantly. These results indicate that the entropy rate is more robust to the choice of parameter r.
Fig. 6 Correlation coefficients between prediction error and entropy on Fitbit dataset and Lifesnaps Fitbit dataset. The correlation coefficients of entropy rate and approximate entropy are significantly higher than those of sample entropy. Meanwhile, entropy rate is slightly better than approximate entropy
In conclusion, entropy rate is more relevant to the prediction error of real-world physical activity samples than approximate entropy and sample entropy. Combining their performance on synthesized data, entropy rate is more suitable than approximate entropy and sample entropy to measure the regularity of human physical activity.

Interpersonal variation of physical activity regularity
Based on entropy rate, we quantify the regularity of physical activity for different individuals. As shown in Fig. 8, we present the distribution of the entropy rate of minute-level data for all users in the SJTU dataset. The average entropy rate of all users is 0.066 bits. In this context, entropy rate refers to the average amount of newly generated information for each update of the user's physical activity state. An entropy rate of 0.066 bits can be interpreted as meaning that the user's physical activity state in the next minute could be found, on average, in any of 2^0.066 ≈ 1.046 states, which also means that the user's physical activity state is determinable most of the time. These results can be explained by the fact that sedentary and restful activities typically occupy a significant portion of the day for most individuals. The SJTU dataset reveals that users on average have 1280 min per day with a step count of 0. Additionally, the ordered structure of the data, such as prolonged periods of a single state or regular alternations between states, further reduces the number of potential states for the next moment. Figure 9 visualizes step count data for two specific users with different entropy rates over the month-long observation period. User 1 displays a clear exercise pattern, characterized by consistent daily walks at set times in the morning, noon, and evening. Conversely, the data for user 2 exhibits a more unpredictable and chaotic pattern, with various fluctuating activities that occur at irregular times. As a result, the entropy rate for user 2 (0.1015 bits) is significantly higher than that for user 1 (0.0411 bits).
Fig. 7 Average correlation coefficient between prediction errors and entropy under varying noise filters in the SJTU dataset. Entropy rate displays a higher correlation coefficient than approximate entropy and sample entropy across a wide range of parameters. This indicates that the entropy rate is more robust to the choice of parameter r
Fig. 8 Distribution of entropy rate across users in the SJTU dataset
Fig. 9 Case study of two specific users with different entropy rates over the 30 days. User 1 displays a clear exercise pattern, and is more regular than user 2, who exhibits a more unpredictable and chaotic pattern
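For readers who want to experiment with entropy rate on their own sequences, the sketch below shows a Lempel-Ziv style estimator of the kind commonly used in the human mobility literature. It is an assumption that this resembles the estimator used in this study, since the estimation procedure is not described in this section. In particular, treating two values as equal when they differ by at most r is only one plausible way to apply a noise filter, and the function name `lz_entropy_rate` is hypothetical.

```python
import math


def lz_entropy_rate(seq, r=0.0):
    """Lempel-Ziv style entropy rate estimate in bits per symbol.

    For each position i, find the length of the shortest subsequence starting
    at i that has not appeared in seq[:i]. The estimate is
    n * log2(n) / sum of those lengths. Two values are treated as equal when
    they differ by at most r (an illustrative use of the noise filter).
    """
    n = len(seq)
    total = 0
    for i in range(n):
        longest = 0
        for j in range(i):
            k = 0
            # Extend the match while it stays inside the history seq[:i].
            while i + k < n and j + k < i and abs(seq[j + k] - seq[i + k]) <= r:
                k += 1
            longest = max(longest, k)
        total += longest + 1  # shortest unseen subsequence is one step longer
    return n * math.log2(n) / total
```

An estimate h in bits can be read as in the interpretation above: the next state lies, on average, among about `2 ** h` equally likely candidates. The nested loops make this sketch cubic in the worst case, so it is suitable only for short sequences.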
We also explore factors that may contribute to variations in regularity among individuals. To this end, we derive two features from the original physical activity data for all users. The first feature is the daily step count, which serves as a measure of the total amount of physical activity. The second feature is the daily duration of trivial activities, which can be used as a proxy for the composition of physical activity. A trivial activity is determined based on duration and intensity. Physical activities that last less than 10 minutes or comprise fewer than 1000 steps are classified as trivial activities, because activities with more than 1000 steps in 10 minutes are generally regarded as effective exercise [51]. Figure 10 displays the relationship between entropy rate and these two features for all users in the SJTU dataset. When the daily step count is low, the entropy rate tends to be low as well, as there is limited physical activity. However, as the daily step count increases, the entropy rate spans a wider range. In general, the daily step count is weakly correlated with entropy rate, with a correlation coefficient of 0.4403. On the other hand, the average duration spent on trivial activities per day shows a strong positive correlation with entropy rate, with a correlation coefficient of 0.7290. Trivial activities tend to exhibit more randomness in both occurrence time and intensity compared to exercise, which leads to a decrease in regularity as the time spent on trivial activities increases. Overall, the results suggest that the regularity of human physical activity is not determined by the amount of activity, but rather by the composition of activities.
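The trivial-activity feature can be illustrated with a short sketch. How activity bouts are segmented is not specified here, so the code below assumes that a bout is a maximal run of consecutive minutes with a nonzero step count; the thresholds follow the 10-minute and 1000-step rule stated above, and the function name `trivial_minutes` is hypothetical.

```python
def trivial_minutes(steps_per_minute, min_duration=10, min_steps=1000):
    """Total minutes spent in trivial activity bouts.

    A bout is assumed to be a maximal run of consecutive minutes with a
    nonzero step count. It is trivial if it lasts fewer than min_duration
    minutes or contains fewer than min_steps steps.
    """
    trivial = 0
    bout_len = 0
    bout_steps = 0
    # A trailing zero acts as a sentinel that closes the final bout.
    for s in list(steps_per_minute) + [0]:
        if s > 0:
            bout_len += 1
            bout_steps += s
        elif bout_len:
            if bout_len < min_duration or bout_steps < min_steps:
                trivial += bout_len
            bout_len = 0
            bout_steps = 0
    return trivial
```

Applying this function to each day of a user's minute-level series and averaging the results gives a daily duration of trivial activities that can be correlated with the entropy rate.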

Intrapersonal variation of physical activity regularity
Intrapersonal variation of regularity describes the temporal variability of the same individual's physical activity habits. For each user in the SJTU dataset, we slide a 14-day window through the entire sequence (e.g., day 1-day 14, day 2-day 15, ..., day 17-day 30). In this way, we divide each user's original 30-day time series into 17 two-week time periods. The entropy rates of consecutive time periods form an entropy rate sequence, which indicates how an individual's regularity changes over time. Figure 11 illustrates entropy rate sequences from three users with different degrees of variability. We utilize [...] in the last several time periods, and it has a CV value of 0.10. For user 3, the entropy rate increases significantly over time in the second half. Therefore, it has the highest CV value of 0.20.
Table 1 An example illustrating the estimation of entropy rate
We also calculate the coefficient of variation (CV) of the entropy rate sequence for all users, and plot its cumulative distribution function (CDF) in Fig. 12. We find that about 40% of users have CV values less than 0.05, and more than 80% of users have CV values less than 0.1. This indicates that the majority of people have stable physical activity habits.
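The sliding-window analysis can be sketched as follows. This is a minimal illustration, not the authors' code: the coefficient of variation is taken as the population standard deviation divided by the mean (whether the population or the sample standard deviation was used is an assumption), and `entropy_fn` is a placeholder for any entropy rate estimator.

```python
from statistics import mean, pstdev


def sliding_windows(days, width=14):
    """Return all windows of `width` consecutive days, shifted by one day."""
    return [days[i:i + width] for i in range(len(days) - width + 1)]


def coefficient_of_variation(values):
    """CV = standard deviation / mean (population standard deviation assumed)."""
    return pstdev(values) / mean(values)


def entropy_rate_cv(days, entropy_fn, width=14):
    """CV of the entropy rate sequence over sliding windows.

    `days` is a list of per-day value lists; `entropy_fn` is a placeholder
    for an entropy rate estimator applied to the flattened window.
    """
    rates = [
        entropy_fn([v for day in window for v in day])
        for window in sliding_windows(days, width)
    ]
    return coefficient_of_variation(rates)
```

For a 30-day series, `sliding_windows` produces the 17 two-week periods described above.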

Discussions
With fine-grained data collected over an extended period of time, we are offered the opportunity to study the regularity of human activities at a granularity much finer than that in conventional studies, such as the ones in [6,33-35]. Entropy models, and in particular entropy rate, with the ability to explore short and previously unknown subsequences, are thus natural tools to consider. As our results with the synthesized and real-world data reveal, entropy models can provide a unique picture of regularity, in both interpersonal and intrapersonal situations. The simplicity of this regularity measure also allows for different uses. For example, it may be used as a reliable component in a complex user activity profile, or it can be an easy-to-use indicator in population-level health interventions. Our study enriches the applications of entropy models and thus complements existing studies in which entropy models are used in human mobility [11], physiology [17], medicine [18], and climatology [23].
The complexity of human physical activities, as demonstrated in the fine-grained data, makes it very difficult to interpret the highly abstract entropy models, and thus the effectiveness of these models is difficult to evaluate. The synthesized data, and the experiments based on them, serve as a first crucial link between the tools and the physical world. The behavior of the entropy models can be well observed by controlling the easy-to-interpret randomness in the synthesized physical activities. Our study shows that entropy rate can identify not only the magnitude and the amount of noise, but also macroscopic variations of physical activities, such as differences in duration and occurrence time, which cannot be recognized by approximate entropy and sample entropy. To further validate the effectiveness of these models, they must be tested on real-world data. However, with no prior knowledge of the "regularity" of the real-world dataset, anything calculated on the data samples is out of context. By correlating the entropy values of the real-world samples with their respective predictive performance, which, in practice, is considered as one proxy of regularity [45], the second crucial link is established. On several real-world datasets, entropy rate also exhibits a stronger correlation with prediction errors than approximate entropy and sample entropy. We then conclude that entropy rate is a reliable measurement of human physical activity regularity.
Our study on the interpersonal and intrapersonal variations of regularity with entropy rate demonstrates some interesting facts. Despite the seemingly complex data, their entropy values are generally low. This agrees well with the fact that sedentary and restful activities typically occupy a significant portion of the day for most individuals. An average entropy rate of 0.066 bits may also suggest that, in theory, the number of bits needed to record the physical activity of an ordinary person may be small.
One limitation of this study, however, is that the data recorded by wearable or other smart devices have their inherent limits. Some activities may fail to be recorded when the device is not carried by its owner. Furthermore, the data recorded by wearable sensors may also deviate from the actual situation [52]. Although entropy rate can tolerate a portion of the recording deviation through the noise filter r, a deviation that exceeds r leads to an inconsistency between the computed regularity and the true regularity. It is also worth noting that in our experiments, we used a fixed r for all persons to ensure a fair comparison of regularity across them. In a scenario where the association between physical activity regularity and individual health outcomes is investigated, a personalized noise filter would be more appropriate.

Conclusions
In this study, we explore the feasibility of using entropy models to measure the regularity of human physical activities. Through experiments on both synthesized and real-world datasets, we found that entropy rate can be regarded as a more reliable measurement for the regularity of human physical activities than approximate entropy and sample entropy. On synthesized physical activity data with controlled randomness, entropy rate exhibits the ability to identify not only the magnitude and amount of noise but also macroscopic variations of physical activities. On real-world physical activity datasets, entropy rate is closely tied to the predictability of samples. The strong correlations between entropy rate and prediction errors from various forecasting models demonstrate its applicability in measuring human physical activity regularity. In future work, it would be of interest to develop a multifaceted approach to assessing the regularity of human activities, leveraging diverse data sources such as activity data, weather data, and external incentives. Such an integrated approach could provide a more comprehensive and nuanced understanding of the factors shaping physical activity patterns and inform the development of targeted interventions to promote regular physical activity engagement.

Fig. 2 Visualization of synthesized physical activity samples with four types of disturbance. a Normally distributed disturbance with a standard deviation of 60 min was added to the occurrence time of exercise. b Normally distributed disturbance with a standard deviation of 60 min was added to the duration of exercise. c Normally distributed disturbance with a standard deviation of 50 steps/minute was added to the intensity of exercise. d Trivial activities were added to daily exercise
Figure 8 also indicates that distinct variations exist among individuals' regularity. Such interpersonal variability in regularity reflects the different lifestyle preferences of people.

Fig. 10 Scatter plot of regularity and characteristics of physical activities across all users in SJTU dataset

Fig. 11 Illustration of entropy rate sequences from three users. The dotted line represents the mean value of the sequence

Table 2 Statistical information (mean ± standard deviation) of all datasets at different time periods

Table 3 MAE (mean ± Std) of forecasting models on real-world datasets