
Detecting high indoor crowd density with Wi-Fi localization: a statistical mechanics approach


We address the problem of detecting critically high crowd density in situations such as indoor dance events. We propose a new method for estimating crowd density by anonymous, non-participatory, indoor Wi-Fi localization of smart phones. Using a probabilistic model inspired by statistical mechanics, and relying only on big data analytics, we tackle three challenges: (1) the ambiguity of Wi-Fi based indoor positioning, which appears regardless of whether the positioning is performed with machine learning or with optimization, (2) MAC address randomization when a device is not connected, and (3) the volatility of packet interarrival times. The main result is that our estimation becomes more (rather than less) accurate when the crowd size increases. This property is crucial for detecting dangerous crowd density.


Crowd disasters have taken many human lives. The Love Parade disaster in Duisburg in 2010, the Ellis Park Stadium disaster in Johannesburg in 2001, and the PhilSports Stadium stampede in Manila in 2006 are just a few examples. One of the major factors contributing to crowd disasters is the presence of critically dense spots [1,2,3], which are difficult to detect due to the lack of a macroscopic overview of the crowd [1]. In this paper we address the problem of estimating the crowd density distribution in situations such as indoor dance events, to enable prevention of crowd disasters.

A lot of research on estimating crowd density concerns processing video recordings from security cameras [4, 5]. However, this approach does not suffice to detect critically high crowd density. Firstly, as mentioned before, it is difficult to obtain a macroscopic overview of the crowd. Secondly, the lighting conditions at a concert might not be sufficient for video-based crowd analysis. Finally, the error in counting people grows with the actual crowd density [6] due to so-called occlusion effects. Another way to monitor the crowd density is by using RFID technology [3]. Each participant is asked to wear a tag, and RFID readers are distributed across the venue. This approach, however, requires participation from the crowd and incurs deployment costs. Similar requirements exist for other wireless tracking technologies, such as Bluetooth or GPS. In addition, GPS is poorly suited to indoor localization. (For more information on crowd monitoring services, we refer the reader to [3].)

In our approach, which can complement video-based analysis, we exploit the ubiquity of smart phones, as has been done in [7,8,9,10,11,12,13]. More concretely, our approach is non-participatory, that is, it does not require participation from the crowd, and it uses the Wi-Fi network that already exists at the venue.

Despite the recent success in using wireless technologies for indoor positioning and crowd counting, several problems remain open. Firstly, there is the problem of ambiguity in wireless indoor localization [14,15,16,17], which is a major source of localization errors. Secondly, when a phone is not ‘connected’, its Media Access Control (MAC) address may change (be “randomized”) over time [18], in compliance with privacy policies, which makes it impossible to track the user over time. Finally, since we do not rely on crowd participation, the signals from the phones are quite irregular in time [19], meaning that real-time tracking of a device is also challenging.

In this paper we address the aforementioned three problems as follows. To address the ambiguity problem, we apply concepts from statistical mechanics: rather than estimating the most likely position of a visitor in real time, we create an evolving probability distribution over all possible positions of the visitor. We rely on the fact that we have a lot of data (or visitors), to estimate the crowd density by aggregating the individual distributions. We use the abundance of data again and the fact that the structure of the MAC address reveals whether it has been randomized, to account for the fact that a portion of the devices are not trackable. Finally, we deploy a time-out based memory model for dealing with the volatile signal rates.

Applying primarily the law of large numbers leads to our main result; namely, that our estimation becomes more (rather than less) precise when the crowd size increases, even without requiring crowd participation. This property is crucial for being able to detect critically raised crowd density.

The rest of the paper is organized as follows. The “Background” section briefly explains our data collection process and the Wi-Fi localization methods that we use. The “Problem statement” section introduces in more detail the problems related to crowd density estimation. The “Method” section proposes a new method for crowd density estimation that addresses these problems. In the “Results and discussion” section the performance of our method is analyzed and discussed, including a comparison with related work. The “Concluding remarks” section ends with conclusions and directions for future work.


Background

Data collection and privacy protection

The in-house data and videos used in this paper were collected during the sensation 2015 dance event in the Amsterdam ArenA (today Johan Cruijff ArenA) football stadium. More than 30,000 visitors were present, and 28,847 MAC addresses were detected in the range of the Wi-Fi access points (APs). We used 30 APs distributed in the east corner and on the west side of the stadium. (The white dress code of this particular dance event in 2015 made it suitable for evaluation with video data.) We processed the Wi-Fi signals and estimated the coordinates of the Wi-Fi enabled devices using a method similar to trilateration, which we explain in the next subsection.

Using smart phones to identify users’ locations inevitably raises privacy concerns. The system that we use has been designed from the ground up with privacy in mind: no privacy-sensitive data is ever stored. Only a minimal set of data is collected (timestamp, access point, signal strength and identifier). The unique MAC identifiers of the phones are hashed (anonymized) on reception. The data is not stored on site, but streamed to a trusted third party, which maps the hashed identifiers once again. The final identifier is stored in an environment accessible only to the data scientists participating in this project. As a result, none of the involved parties has sufficient information to recover the original MAC address.

Following the European Union General Data Protection Regulation (GDPR) and the Dutch law for handling personal data, laid down in the Personal Data Protection Act (Wbp) (Section 12.2) [20], we do not publish any part of the data, and only reveal statistical and aggregated results about the crowd. The data analytics Jupyter Notebook scripts together with the output (aggregated results) that led to the insights and research results presented in this paper can be found in [21].

Localization of smart phones using Wi-Fi sensors

Smart phones transmit Wi-Fi signals which are captured at the Wi-Fi access points. The captured signals contain information about the measured received signal strength (RSS). Widely used methods for positioning using RSS values are multilateration and fingerprinting [22]. In what follows we give a brief overview of the two methods.

Localization by multilateration

Using the Friis equation, which relates an RSS value to the distance between a transmitter and a receiver, the distance between a smart phone and an AP can be estimated. When we have the exact distances from a smart phone to at least three APs, the position of the smart phone can be uniquely determined as the intersection of three circles (Fig. 1a) [23].

Fig. 1

Estimating position with trilateration: a precise, b rough

However, in practice the measured RSS values contain unpredictable variation due to noise and interference such as absorption and reflection by obstacles (e.g. human bodies). As a result, the circles do not intersect at a unique point (see e.g. Fig. 1b). In this case, an optimization procedure is undertaken for positioning the smart phone using more than three APs and, in our case, all received RSS values within 500 ms. We use the least-squares optimization method, which in our case has the form of a Chi-square data fit. An exact description of the method is beyond the scope of this paper, and we refer the reader to [24, 25] for details. We note, however, that the statistical estimation of the position provides us also with the standard deviation, which we will use in “Method” section.
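To make the two steps above concrete (RSS to distance, then a least-squares position fit), here is a minimal sketch. It is not the chi-square fit of [24, 25]: it inverts the free-space Friis relation with unit antenna gains and then solves a linearized least-squares problem. The transmit power, the 2.4 GHz carrier frequency and the function names `friis_distance` and `multilaterate` are assumptions made only for this illustration.

```python
import numpy as np

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def friis_distance(rss_dbm, tx_power_dbm=15.0, freq_hz=2.4e9):
    """Invert the free-space Friis equation (unit antenna gains assumed)."""
    wavelength = SPEED_OF_LIGHT / freq_hz
    path_loss_db = tx_power_dbm - np.asarray(rss_dbm, dtype=float)
    return wavelength / (4.0 * np.pi) * 10.0 ** (path_loss_db / 20.0)


def multilaterate(ap_xy, distances):
    """Linearized least-squares position from >= 3 AP positions and ranges."""
    ap_xy = np.asarray(ap_xy, dtype=float)
    d = np.asarray(distances, dtype=float)
    # Subtracting the first circle equation from the others removes the
    # quadratic terms and leaves a linear system A @ p = b in p = (x, y).
    A = 2.0 * (ap_xy[1:] - ap_xy[0])
    b = (d[0] ** 2 - d[1:] ** 2
         + np.sum(ap_xy[1:] ** 2, axis=1) - np.sum(ap_xy[0] ** 2))
    position, *_ = np.linalg.lstsq(A, b, rcond=None)
    return position
```

With noise-free distances this recovers the transmitter position exactly. With noisy RSS values the circles no longer meet in one point, and the least-squares solution is only a best fit, which is the situation sketched in Fig. 1b.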

Localization by fingerprinting

Traditional fingerprinting is also RSS-based. The localization has an ‘offline’ and an ‘online’ phase. In the ‘offline’ phase, for a moving client device, the signal strengths from several access points in range are continuously recorded and stored in a database (the fingerprint), along with the known coordinates of the device [26, 27]. During the ‘online’ tracking phase, the current RSS vector of a device at an unknown location is compared to the vectors stored in the fingerprint, and the closest match is returned as an estimate of the location. The closest match is usually determined through probabilistic methods (e.g. expectation maximization, KL-divergence), or through machine learning techniques (e.g. k-nearest neighbors, Support Vector Machines, neural networks).
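The ‘online’ phase can be sketched with the simplest of the techniques listed above, k-nearest neighbors in signal space. The function name `knn_fingerprint_locate`, the array layout and the choice of an unweighted average are assumptions for this example, not details taken from [26, 27].

```python
import numpy as np


def knn_fingerprint_locate(fingerprint_rss, fingerprint_xy, query_rss, k=3):
    """Average the coordinates of the k stored RSS vectors closest to the query.

    fingerprint_rss: (n_samples, n_aps) RSS vectors recorded offline
    fingerprint_xy:  (n_samples, 2) known coordinates of those samples
    query_rss:       (n_aps,) RSS vector observed online
    """
    fingerprint_rss = np.asarray(fingerprint_rss, dtype=float)
    fingerprint_xy = np.asarray(fingerprint_xy, dtype=float)
    query_rss = np.asarray(query_rss, dtype=float)
    # Euclidean distance between the query and every stored vector.
    gaps = np.linalg.norm(fingerprint_rss - query_rss, axis=1)
    nearest = np.argsort(gaps)[:k]
    return fingerprint_xy[nearest].mean(axis=0)
```

If two distant locations happen to store nearly identical RSS vectors, the nearest-neighbor search has no way to tell them apart.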

Fig. 2

a Estimation of the x-coordinate in meters of a static MAC device through time; b RSS-based positioning can lead to multiple local optima

Problem statement

Under ideal circumstances, the positioning itself would suffice to estimate the spatial crowd density distribution: every second we would only need to count the number of detected devices per square meter. However, Wi-Fi based localization comes with the following challenges, which are not related to the mathematical methodology behind the ‘positioning’ step, and which prevent us from applying direct counting.

  1. Issue 1: Ambiguity of the localization procedures. It has been argued that one of the biggest sources of error when using RSS values for localization is the absence of a single global optimum [15]. In the fingerprinting approaches the phenomenon is called “fingerprinting twins” [14, 15], while in the multilateration approaches it is known as “flip ambiguity” [16, 17, 28]. We also randomly sampled 20 MAC addresses from the sensation data, under various crowd conditions, and plotted the estimates of their coordinates through time, keeping only the estimates with a relatively small (conditional) uncertainty. We observed persistent bimodal distributions of the estimates through time (as if the device were constantly being teleported), an example of which can be seen in Fig. 2a. This figure shows the estimated x-coordinates (in meters) through time of a static MAC device that was present for 24 h (most probably an AP). The “twins” phenomenon originates from the fact that the environmental settings and the temporal variations in the RSS create opportunities for multiple local optima [17] in the case of trilateration, or for geographically distant positions to share the same RSS vectors [15] in the case of fingerprinting. (Note: we call them ‘twins’ because the empirical evidence so far suggests that there are no ‘triplets’; theoretically, however, the latter are not excluded.) To understand the problem, consider Fig. 2b. At the center of every ring there is an access point with an estimated signal strength to a particular MAC device, with a certain error range, represented by the width of the ring. There are two possible regions where all three rings overlap (A and B), and they are equally good candidates for positioning the MAC device.
Note that this problem can arise regardless of the width of the rings and the number of APs, and that it cannot be alleviated without using additional information or assumption about the visitors or the environment. For example, current solutions require crowd participation [15], or assume crowd mobility [14].

  2. Issue 2: Volatility of packet rates. When a MAC device is connected to a Wi-Fi network, it sends packets at a relatively stable and frequent rate. However, when the device is not connected, it is in a “probing” mode, i.e. searching for a network, and in this case the packet rate is quite volatile, with intervals between packets ranging from a few seconds to a few minutes [19]. This makes it hard to estimate the total number of devices present at a given moment.

  3. Issue 3: MAC address randomization. Due to ever-increasing privacy concerns and possibly other business reasons, Apple introduced, starting in 2014, ‘randomization’ of the MAC addresses of its phones when they are in probing mode (not connected to a Wi-Fi network) [19]. This means that devices that are not connected continuously change their MAC addresses and cannot be followed over time. (It is worth noting, however, that the format of a MAC address itself reveals whether the address is authentic or randomized.)


Method

In this section, as a main contribution of the paper, we propose solutions to the issues stated in the previous section.

Estimation under localization ambiguity

Creating statistical ensembles In order to deal with the localization ambiguity, let us start with the following observation: we are not interested in the individual locations of the MAC devices, but rather in the density of the crowd. In addition, for the prevention of crowd disasters, it is very important to have a precise estimate when the number of people in a stadium is large and dense spots are likely [1, 2]; precision at a low crowd density is of lower priority.

To this end, we propose a probabilistic model for crowd density estimation. To explain our idea, we use the scenario depicted in Fig. 2b. We can say that the particular MAC device is located in region A with a probability of 0.5 and in region B with a probability of 0.5 (we assign the probabilities in a trivial way for the purpose of illustration). Although this approach does not provide us with very useful information about the location of the MAC device, if we apply the same reasoning for all MAC devices, and we add together the spatial probability distributions of all MAC devices, we end up with a spatial distribution of the crowd density. If we assume that the locations of all MAC devices are mutually independent and identically distributed, we can apply directly the standard law of large numbers and conclude that, for a large crowd, the error of estimation of the density per square meter will vanish. (The diffusion animation in [29] provides a nice visual demonstration of the concept.) However, we cannot make those assumptions, because e.g. people tend to go to concerts in groups, that is, their locations are correlated [30, 31]. In this case, the variance of the estimation in the limiting case is equal to the average covariance between the locations. In “Results and discussion” section we will show that the average covariance still tends to zero as the crowd size increases, because for a regular concert crowd, the group size is relatively small compared to the entire crowd.

Computing individual probability distributions Next, for an arbitrary MAC device m and region R, we proceed with defining \(Prob(m \in R)\), that is, the probability that m is located in R, in order to be able to evaluate the estimated crowd density per region.

The localization (“fitting”) provides us with a series of estimated positions for a mobile device. We estimate the spatial probability distribution for the device along a moving time window. Our first step at time t is to select from the data the N estimated positions whose time stamps fall within a specified time window \([t-\Delta t,t]\). A natural way to construct a two-dimensional probability distribution is to create a histogram, by binning the N positions and normalizing by N. When the positions have been estimated with multilateration, the optimization procedure also provides a Gaussian error (\(\sigma _{xi}\), \(\sigma _{yi}\)) for every estimate \((x_i,y_i)\) (see “Localization of smart phones using Wi-Fi sensors” section). Therefore, we first “smooth” each of the \((x_i,y_i)\) positions into a bivariate distribution, using a Gaussian kernel with standard deviation (\(\sigma _{xi}\), \(\sigma _{yi}\)). Then, for a MAC device m we generate a two-dimensional probability density function (pdf) by adding up the separate ‘bumps’ and normalizing by N (see Fig. 3 for an example of smoothing a histogram).

Fig. 3

Smoothing a histogram with Gaussian kernels. a Original histogram. b Smoothed histogram

Formally, the implementation of our method is similar to that of kernel density estimation [32, 33]. In our case the amount of smoothing is determined by the uncertainty values \(\sigma _{x}\) and \(\sigma _{y}\). The pdf for a MAC device m at a location \((x,y)\) is defined by

$$\begin{aligned} {\hat{f}}_{m}(x,y)=\frac{1}{N}\sum _{i=1}^{N}K\left( (x-x_{i}),\sigma _{xi}\right) K\left( (y-y_{i}),\sigma _{yi}\right) \end{aligned}$$

where the kernel function K is given by

$$\begin{aligned} K(u,\sigma )=\frac{1}{\sigma \sqrt{2\pi }}\exp \left( -\frac{u^2}{2\sigma ^2}\right) \end{aligned}$$

and \((x_i, y_i)\) is the result of positioning at step \(i \in \{1, \ldots , N\}\).

In order to evaluate \(Prob(m \in R)\), we need to integrate \({\hat{f}}_{m}(x,y)\) for \((x,y) \in R\). The final crowd density estimation at point \((x,y)\) is given by

$$\begin{aligned} {\hat{f}}_{T}(x,y)=\sum _{m}{\hat{f}}_{m}(x,y). \end{aligned}$$

Finally, in order to estimate the number of people in region R, we need to integrate \({\hat{f}}_{T}(x,y)\) over the region R.
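The definitions above translate directly into code. The sketch below assumes that R is an axis-aligned rectangle, in which case the integral of each product of Gaussian kernels has a closed form in terms of the normal distribution function. The function names and the representation of a fit as a tuple \((x_i, y_i, \sigma _{xi}, \sigma _{yi})\) are choices made for this example, and no rescaling to the venue outline is applied.

```python
import math


def kernel(u, sigma):
    """Gaussian kernel K(u, sigma)."""
    return math.exp(-u * u / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(u, sigma):
    """Integral of K(., sigma) from minus infinity to u."""
    return 0.5 * (1.0 + math.erf(u / (sigma * math.sqrt(2.0))))


def device_pdf(x, y, fits):
    """f_m(x, y) for one device; fits is a list of (x_i, y_i, sigma_xi, sigma_yi)."""
    return sum(kernel(x - xi, sx) * kernel(y - yi, sy)
               for xi, yi, sx, sy in fits) / len(fits)


def prob_in_rect(fits, x_lo, x_hi, y_lo, y_hi):
    """Prob(m in R) for a rectangle R: the exact integral of f_m over R."""
    total = 0.0
    for xi, yi, sx, sy in fits:
        px = normal_cdf(x_hi - xi, sx) - normal_cdf(x_lo - xi, sx)
        py = normal_cdf(y_hi - yi, sy) - normal_cdf(y_lo - yi, sy)
        total += px * py
    return total / len(fits)


def expected_count(devices, x_lo, x_hi, y_lo, y_hi):
    """Integral of f_T over R, i.e. the sum of Prob(m in R) over all devices."""
    return sum(prob_in_rect(fits, x_lo, x_hi, y_lo, y_hi)
               for fits in devices.values())
```

For a device whose fits alternate between two modes 20 m apart, a rectangle around one mode receives a probability close to 0.5, matching the Fig. 2b illustration.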

Remark 1

In our implementation we rescale each individual probability distribution so that it integrates to one inside the region of the stadium (the concert venue). We assume that if a device is detected by the APs, it is inside the stadium, and thus the probability that it is in the stadium should be one. (It is very unlikely that the APs have detected devices that are outside, due to the thick walls of the stadium.) In the future we also plan to include the map of the stadium in the calculations, to incorporate the fact that the probability that a visitor is in an inaccessible region is zero.

Note that so far we assumed that in every time window there is at least one estimate for every MAC device present in the stadium. In what follows we explain how we capture the cases when this assumption does not hold.

“Conservation of mass” under packet rates volatility

To address the second issue raised in “Problem statement” section, Volatility of packet rates, we ensure that we do not forget about the MAC devices that were not observed in the last time window. In fact, for every MAC device that was ever observed, until it is observed again, we maintain the old probability distribution. However, we also apply a time-out, that is, if a MAC device has not been observed in a long enough time interval (called ‘memory’ parameter), it is simply removed from the pool of MAC devices.
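A minimal sketch of this time-out memory is given below. The class name `DeviceMemory`, the use of window indices as the time unit and the dictionary layout are assumptions for the illustration. The stored distribution can be any object, for example the list of fits from which a device's pdf is built.

```python
class DeviceMemory:
    """Keep the latest distribution per device and drop devices that have
    not been observed for more than `memory_windows` windows."""

    def __init__(self, memory_windows):
        self.memory_windows = memory_windows
        self._latest = {}  # device id -> (window index of last observation, distribution)

    def update(self, window_index, observed):
        """observed: dict mapping device id -> distribution estimated in this window."""
        for device, distribution in observed.items():
            self._latest[device] = (window_index, distribution)
        # Time-out: forget devices whose last observation is too old.
        expired = [device for device, (seen, _) in self._latest.items()
                   if window_index - seen > self.memory_windows]
        for device in expired:
            del self._latest[device]

    def active(self):
        """Distributions that currently count towards the crowd density."""
        return {device: dist for device, (_, dist) in self._latest.items()}
```

With `memory_windows=0` only devices observed in the current window are counted, while larger values keep silent devices in the pool for longer.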

Estimation under MAC address randomization

The previous discussion assumes that a MAC device does not change its identifier over time. However, as we noted in the last issue in “Problem statement” section, a device in probing mode might randomize its address during a time window, leading to it being counted more than once. To address this problem, we rely once again on the fact that we have a lot of data, and on the fact that we can derive from the structure of the MAC address whether it has been randomized or not. Figure 4a shows the time series of the numbers of non-randomized and randomized addresses observed per minute from midnight until around 6:00 a.m. during the sensation concert. We observe that their ratio is stable through time (Fig. 4b); the Pearson correlation coefficient between the time series of randomized and non-randomized addresses is 0.88.

Fig. 4

a Number of non-randomized and randomized addresses observed per minute; b linear regression plot of randomized and non-randomized addresses observed per minute

Therefore, when estimating crowd density, we ignore the MAC devices that have randomized addresses and at the end we multiply the crowd density by a factor to account for the discarded MAC devices. This factor is derived from the slope of the linear regression fit of the two time series (Fig. 4b), which in our case turns out to be 0.2, with a standard error of 0.006.
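The two ingredients of this correction can be sketched as follows. Randomized addresses are commonly recognizable by the locally-administered bit (value 0x02 in the first octet) of the MAC address; the sketch assumes that this flag is read from the raw address before it is hashed. The text does not spell out how the multiplication factor is obtained from the slope, so the formula below (one plus the slope of randomized counts regressed on non-randomized counts) is an assumption, as are the function names.

```python
import numpy as np


def is_locally_administered(mac):
    """True if the locally-administered bit (0x02 of the first octet) is set,
    which is how randomized MAC addresses are commonly recognized."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)


def randomization_factor(non_randomized_per_min, randomized_per_min):
    """Assumed correction factor: 1 + slope, where the slope comes from a
    linear fit of randomized counts against non-randomized counts."""
    slope, _intercept = np.polyfit(non_randomized_per_min, randomized_per_min, 1)
    return 1.0 + slope
```

Under this reading, a slope of 0.2 would mean that a density estimated from non-randomized devices alone is multiplied by 1.2.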

Note also that this proportion should be re-computed periodically, to account for changing conditions in the smart phone market. In fact, when the crowd is large, as in our sensation scenario, the randomization factor can be updated in real time, during the concert hours, by using all data that arrived in e.g. the last hour.

We envision, however, that in the future more people will be connected to the Wi-Fi (and thus the proportion of randomized addresses will become smaller) due to the fact that an increasing number of stadiums across the world offer “smart” services, but also due to the increasing usage of social media to post photos and videos of an event in real-time.

Some studies [19, 34] suggest that it is still possible to follow devices despite MAC randomization, and it would be interesting in the future to see if we can improve our methodology taking those studies into account.

Results and discussion

In this section we analyze our method for crowd density estimation in various manners: analytically, with simulations, and using two real-life datasets. (Note: “Theoretical analysis under correlated groups” section offers a formal validation that readers uninterested in technical details can skip without loss.)

Theoretical analysis under correlated groups

We show formally that the relative error of the crowd density estimation converges to zero when the crowd size increases, despite having correlated groups of visitors (e.g. friends).

Let \(\{mac_1, mac_2,\ldots mac_n\}\) be all n MAC devices detected in the stadium at time t. Due to the results in “Estimation under MAC address randomization” section, where we show how we can safely discard the randomized addresses from the analysis, we can assume here that all n MAC addresses are fixed. In what follows we omit t from the notation for clarity. Let R be an arbitrary region of the stadium. Denote by \(mac_i \in R\) the statement “the device \(mac_i\) is in region R”. Let \(X_i\) be a random variable defined by

$$\begin{aligned} X_i = \left\{ \begin{array}{ll} 1 &{} \text{ if } mac_i \in R \\ 0 &{} \text{ if } mac_i \notin R \end{array} \right. \end{aligned}$$

Denote by X the total number of devices in R detected at time t. Clearly, \(X=\sum _{i=1}^{n} X_i\). Then E(X), the expected value of X, is

$$\begin{aligned} \begin{aligned} E\left( X\right)&= E\left( \sum _{i=1}^{n} X_i\right) = \sum _{i=1}^{n}E\left( X_i\right) \\&= \sum _{i=1}^{n}\left( 1\cdot Prob\left( mac_i \in R\right) + 0 \cdot Prob\left( mac_i \notin R\right) \right) = \sum _{i=1}^{n} Prob(mac_i \in R) , \end{aligned} \end{aligned}$$

where by \(Prob(mac_i \in R)\) we denote the probability that \(mac_i\) is in the region R at time t. We will show that the variance of X/n, that is, the variance of the proportion of devices detected in region R out of all detected devices, vanishes when n becomes large (the variance of X itself is not of interest in the limiting case, because E(X) is then also potentially infinite). This suffices to show that our method for estimation of crowd density is theoretically sound, given the probabilities \(Prob(mac_i \in R)\). We have

$$\begin{aligned} Var\left( \frac{X}{n} \right) = \frac{1}{n^2} \cdot Var\left( \sum _{i=1}^{n} X_i\right) = \frac{1}{n^2} \cdot \left( \sum _{i=1}^{n} Var\left( X_i\right) + \sum _{i \not = j}Cov\left( X_i,X_j\right) \right) \end{aligned}$$

Let \(\gamma\) be an upper limit on the number of people going to a concert together (i.e. whose locations are correlated). Note that, because the random variables \(\{X_i\}_{i=1}^n\) take values in \(\{0,1\}\), the covariances \(Cov(X_i,X_j)\) take values in \([-1,1]\). Thus, the covariances are upper-bounded (by 1). Denote by \(\kappa \le 1\) the maximal covariance between any \(X_i\) and \(X_j\) and let us write \(i \sim j\) if and only if the owners of the MAC devices \(mac_i\) and \(mac_j\) are in the same group of friends. Then,

$$\begin{aligned} \begin{aligned} \sum _{i \not = j}Cov(X_i,X_j)&= 2 \sum _{1\le i < j \le n }Cov(X_i,X_j) = 2 \sum _{i < j,\; i \sim j }Cov(X_i,X_j) \\&\le 2 n \cdot \frac{\gamma (\gamma - 1)}{2} \cdot \kappa = n \kappa \gamma (\gamma - 1). \end{aligned} \end{aligned}$$

Here the second equality holds because the covariance is zero for devices in different groups, and the inequality holds because the maximal number of groups is n and the maximal number of pairs \((i,j)\) in a group is \(\gamma (\gamma -1)/2\). Let us denote by \(\nu = \frac{1}{n}\sum _{i=1}^{n} Var(X_i)\) the average variance of \(\{X_1, X_2, \ldots , X_n\}\) (note that \(\nu \le 1\) from the definition of \(\{X_i\}_{i=1}^n\)). Combining the decomposition of \(Var(X/n)\) with this bound on the sum of covariances, we have

$$\begin{aligned} Var\left( \frac{X}{n} \right) \le \frac{1}{n^2}(n \nu + n \kappa \gamma (\gamma - 1)) = \frac{1}{n}(\nu + \kappa \gamma (\gamma - 1)) {\le \frac{1}{n}(1 + \gamma (\gamma - 1))}, \end{aligned}$$

which tends to 0 when \(n \rightarrow \infty\). Note that the bound on the sum of covariances greatly overestimates it, which means that in practice the variance converges to 0 much faster than this bound suggests.
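The bound \(\frac{1}{n}(1 + \gamma (\gamma - 1))\) can be checked numerically in a deliberately extreme toy case: every group of \(\gamma\) devices is either entirely inside R (with probability p) or entirely outside, so that locations within a group are perfectly correlated. This toy model, the value of p and the function names are assumptions for the illustration only; in this model the exact variance of X/n is \(\gamma p(1-p)/n\).

```python
import numpy as np


def variance_bound(n, gamma):
    """Upper bound (1 + gamma * (gamma - 1)) / n on Var(X / n)."""
    return (1.0 + gamma * (gamma - 1)) / n


def simulated_variance(n, gamma, p, trials=2000, seed=0):
    """Empirical Var(X / n) when each group of size gamma is entirely inside
    R with probability p and entirely outside otherwise (toy model)."""
    rng = np.random.default_rng(seed)
    n_groups = n // gamma
    groups_in_region = rng.random((trials, n_groups)) < p
    # X counts gamma devices for every group that falls inside R.
    proportion = groups_in_region.sum(axis=1) * gamma / (n_groups * gamma)
    return proportion.var()


if __name__ == "__main__":
    for n in (400, 4000):
        print(n, simulated_variance(n, gamma=4, p=0.3), variance_bound(n, gamma=4))
```

Even with perfect correlation inside groups, the empirical variance stays well below the bound and shrinks in proportion to 1/n.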

With the above, we have proven a version of the law of large numbers that is generally applicable. We formalize our results in the following proposition:


Proposition 1

Let \(\{X_1, X_2, \ldots , X_n\}\) be random variables that always take values in a bounded real interval. Suppose that the set \(\{X_1, X_2, \ldots , X_n\}\) can be partitioned into subsets of maximal size \(\gamma\) (a fixed constant independent of n), such that if \(X_i\) and \(X_j\) belong to different subsets, then \(Cov(X_i,X_j) = 0\). Let \(S_n = \frac{1}{n} \sum _{i=1}^{n} X_i\). Then

$$\begin{aligned} \lim _{n\rightarrow \infty } {Var(S_n)} = 0. \end{aligned}$$

Remark 2

In our proof we have assumed that correlation happens only within groups of friends. Note that this is a sufficient, but not a necessary, condition for the convergence of the variance. A necessary condition is that the average correlation in the crowd tends to zero as the crowd size increases. This allows for (positive) correlations outside groups of friends. Moreover, in a crowded situation negative correlation is more likely to occur, as people move away from the crowd, looking for empty spots [35], which reduces the variance further. It is worth noting, however, that there is one singular scenario, a “crowd crush”, in which the crowd is so dense (> 6 persons per \(m^2\)) [2] that people cannot move freely anymore and the entire crowd becomes a “group” (as in the Love Parade disaster), implying that all visitors’ locations are (positively) correlated. In this work we aim to detect high density well before this saturation happens, to be able to react preventively; otherwise, it is too late.

Remark 3

In our proof we assumed that all MAC addresses are fixed, i.e. that there are no randomized addresses. However, as discussed in “Estimation under MAC address randomization” section, in reality we omit the randomized addresses from the analysis and at the end we multiply the estimate by a so-called randomization factor. Note that, again due to the law of large numbers, the larger the crowd, the more precise the estimate of this factor is when it is re-estimated in real time. This suggests that the above convergence is not affected by the randomization factor; in the following subsection we also confirm this experimentally.

Analysis with simulations

“Theoretical analysis under correlated groups” section gives a theoretical validation of the method. We proceed with analyzing the method experimentally, i.e. quantitatively. We are especially interested in how our method performs at dangerously high densities, i.e. more than 4 persons per square meter. Such densities are, however, difficult to obtain from real-life scenarios; moreover, video data, if used as ground truth, is inaccurate at high densities [6]. On the other hand, controlled experiments with thousands of participants and high induced crowd density are beyond the scope of a research paper because of security risks. To validate the method at high densities, we therefore use data-driven stochastic simulations [36].

Simulations setup

For a linearly increasing crowd size in a football stadium (playfield) we simulate the localization data and apply the proposed method to it. We use the sensation dataset to derive the probability distributions of the uncertainties of the localization procedure that were discussed in “Problem statement” section. Concretely, our analysis of the dataset shows that (i) the median packet inter-arrival time for non-randomized MAC devices is 33 s at the most overloaded AP, and the time is exponentially distributed (Fig. 5); (ii) the mean distance (obtained by random sub-sampling) between the modes of the two Gaussians from “Problem statement” section for static devices (APs) is circa 20 m (see also Fig. 2a), and the standard deviations of the Gaussians are on average around 3 m, regardless of the crowd size; (iii) the average Gaussian error of the positioning process is also 3 m and the distribution of all errors is exponential.

Fig. 5

Distribution of packet inter-arrival times for a non-randomized address

The crowd simulation is performed as follows. Every person moves in a zigzag motion in a random direction, because it has been argued [35] that under high density visitors try to escape in a sort of zigzag motion, without any preferred direction, simply taking any free space nearby. The velocity is limited by the current crowd density via Weidmann’s equation [7, 37]. We also introduce correlated positions: for a crowd of size n, the number of groups is n/4, and every person is randomly assigned to a group (thus the average group size is 4).

After recording the original simulated positions for every second, we create the synthetic localization data. The “twins” or “teleportation through time” effect is introduced by displacing the positions randomly, following the distributions in (ii) above. The positioning errors are sampled following (iii). The packets (and thus location fits) are sub-sampled randomly with inter-arrival time drawn from an exponential distribution with a median 33 s, following (i) above. Each MAC address is randomized with a probability of 0.15, following the results in “Estimation under MAC address randomization” section. The randomized addresses change their MAC values in every sent packet.
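The packet sub-sampling and the “twin” displacement described above can be sketched as follows. Only the median of 33 s, the mode distance of circa 20 m and the 3 m standard deviation come from the text; the equal probability of the two modes, the random direction of the displacement and the function names are simplifying assumptions for this example.

```python
import numpy as np

MEDIAN_INTERARRIVAL_S = 33.0  # from (i)
TWIN_DISTANCE_M = 20.0        # from (ii)
SIGMA_M = 3.0                 # from (ii)


def packet_times(duration_s, rng):
    """Packet timestamps with exponential inter-arrival times of median 33 s."""
    # An exponential distribution with median m has mean m / ln(2).
    scale = MEDIAN_INTERARRIVAL_S / np.log(2.0)
    times = []
    t = rng.exponential(scale)
    while t < duration_s:
        times.append(t)
        t += rng.exponential(scale)
    return np.array(times)


def noisy_fix(true_xy, rng):
    """Turn a true position into a synthetic location fit."""
    xy = np.asarray(true_xy, dtype=float)
    # Assumption: the displaced "twin" mode is hit half of the time,
    # in a uniformly random direction.
    if rng.random() < 0.5:
        angle = rng.uniform(0.0, 2.0 * np.pi)
        xy = xy + TWIN_DISTANCE_M * np.array([np.cos(angle), np.sin(angle)])
    return xy + rng.normal(0.0, SIGMA_M, size=2)
```

A fixed seed, e.g. `rng = np.random.default_rng(0)`, makes such a synthetic dataset reproducible.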

Remark 4

We opted to implement our own crowd simulation instead of using available crowd simulators or models, because the latter are built for different purposes. Specifically, in crowd simulators it is challenging to implement high crowd density [35] and correlated groups of friends [38]. Moreover, we are not interested in the exact pattern of movement of the visitors, which is the focus of crowd simulators: any benefit from simulating visitors’ movement on a fine scale is washed out by the uncertainties of circa 20 m and the signal sparseness introduced by the Wi-Fi data.

Deriving the optimal parameters values

Our method requires two parameters: the length of the time window, in which the location estimates for a device are counted towards its probability distribution, and the time-out, in terms of number of windows, for keeping the old distributions (‘memory’).

We argue that the value of the time window \(\Delta t\) should be a compromise between two requirements: (1) the window should be long enough that we can expect to localize each present device with a non-randomized MAC address at least once in it (which should suffice, since the memory parameter also makes sure that we do not miss devices), and (2) the window should not be too long, because of the crowd mobility. To estimate \(\Delta t\), we combine (1) the median packet inter-arrival time per non-randomized MAC address at the most overloaded access point, which in our data is 33 s, and (2) the results from [19], where the author concludes that smart phones in probing mode send on average 55 probe requests per hour, that is, roughly one every 65 s. We choose roughly 40 s as an intuitively good value for \(\Delta t\). For high crowd density this value is still small enough not to cause outdated positioning. For example, if the crowd density is 4 p/m\(^2\), the maximum distance that a person can travel in 40 s according to Weidmann’s equation is 8 m [7], which is acceptable given the localization uncertainties. If the crowd density is 5 p/m\(^2\), the maximum distance that a person can travel in 40 s is 2 m.

Having chosen a window size of 40 s, we need to decide how often to update the windows. In practice this would depend on the available computing infrastructure and how often one would want to update the probability distributions. For our experiments we chose to have slightly overlapping windows. The overlap is 10 s and thus the ‘timestep’ or ‘stride’ is in this case \(40-10=30\) s. Thus, probability distributions are updated every 30 s.
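As a small illustration of the windowing described above, the following sketch enumerates overlapping windows of 40 s with a 10 s overlap, so that consecutive windows start 30 s apart. The function name and its arguments are hypothetical and not taken from the software in [21].

```python
def window_bounds(t_start, t_end, window_s=40.0, overlap_s=10.0):
    """Return (start, end) pairs of overlapping windows inside [t_start, t_end]."""
    stride = window_s - overlap_s  # 40 - 10 = 30 s between window starts
    if stride <= 0:
        raise ValueError("overlap must be smaller than the window")
    bounds = []
    start = t_start
    while start + window_s <= t_end:
        bounds.append((start, start + window_s))
        start += stride
    return bounds


windows = window_bounds(0.0, 130.0)
# -> [(0.0, 40.0), (30.0, 70.0), (60.0, 100.0), (90.0, 130.0)]
```

A fit recorded at, say, 35 s falls into both the first and the second window, which is the intended effect of the overlap.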

Considering the memory parameter, our experiments show that it depends on the average crowd density, ranging from 0 windows backwards for an expected average density < 1 p/m\(^2\) to 5 windows for densities > 4 p/m\(^2\). Intuitively, a crowd of low density can move faster, so old distributions soon become outdated, while a dense crowd moves slowly, and a larger memory then helps to avoid missing visitors, which matters for detecting highly raised crowd density. On the other hand, it can be checked that the probability that a device sends two signals within 5 windows is greater than 99\(\%\), so it is not necessary to keep old estimates for longer. Figure 6 gives examples of the performance of our method w.r.t. the ground truth and the fitted data for various values of the memory parameter (we analyze the case memory \(=5\) in the “Results from simulations analysis” section). The measurements are taken after 9 windows, to allow enough time for the system to ‘warm up’ [36]. For the fitted locations, where the fits are sparse, the crowd density is estimated by taking a snapshot of all devices that have been detected in the last 40 s. Note that this estimation already partly incorporates our method, namely the window size; we nevertheless present it in the figure because it serves as a reference point.
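The role of the memory parameter can be illustrated with a short sketch. It assumes, hypothetically, that for every device we store the index of the window in which its latest distribution was computed; the data layout and the function name are illustrative only.

```python
def active_distributions(last_seen, window_index, memory):
    """Select the distributions that have not yet timed out.

    last_seen maps a device id to a pair (k, distribution), where k is the
    index of the window in which the distribution was last computed.
    A distribution is kept for at most `memory` windows after window k.
    """
    return {
        device: dist
        for device, (k, dist) in last_seen.items()
        if 0 <= window_index - k <= memory
    }


# Hypothetical state at window 10: device "c" was last localized in window 4.
last_seen = {"a": (10, "dist_a"), "b": (7, "dist_b"), "c": (4, "dist_c")}
low_density = active_distributions(last_seen, 10, memory=0)   # only "a"
high_density = active_distributions(last_seen, 10, memory=5)  # "a" and "b"
```

With memory 0 only devices localized in the current window contribute, whereas with memory 5 a slowly moving device that has been silent for a few windows is still counted.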

Fig. 6 Performances for full simulation, varying the ‘memory’ parameter

Results from simulations analysis

To see the advantage of applying our method instead of directly counting the fitted positions, we first perform the following experiment. We simulate a ‘static’ crowd with regular packet rates and thus regular fits (every second), and no MAC randomization, where by ‘static’ we mean that nobody moves. In other words, we simulate only the ‘teleportation’ effect introduced by the bi-modal distributions of localization through time. Our goal is to check the effect of creating spatial probability distributions in the stadium. We test the performance of our method w.r.t. simple counting of fitted positions in a 4 m \(\times\) 4 m corner of the stadium. Figure 7 shows the performance of our method versus simple counting for 10 independent simulations with increasing total crowd size, ranging between 5000 and 50,000. Our method performs well w.r.t. simple counting of the last fitted locations. This is because our method ensures that all detected devices are in the stadium, by re-normalizing the corresponding probability distributions.

Next, as discussed in the “Deriving the optimal parameters values” section, we apply our method with a window size of 40 s, a time step of 30 s and a memory of 5 windows to independent full simulations as explained in the “Simulations setup” section (with correlated crowd movement, packet delays and MAC randomization), increasing the total crowd size linearly in every simulation, from 2000 to 64,000, which is the capacity of modern stadiums. The results computed in a 4 m \(\times\) 4 m square in the middle of the playfield are shown in Figs. 8 and 9. To contrast the performance for high crowd densities with the performance for low crowd density, we include an example of the latter in Fig. 10, which shows that the method can be expected to under-perform for low crowd density. (Note that this estimation is under a worst-case scenario, where everybody moves all the time at normal walking speed.) However, we focus on precise estimation under high densities. Figure 11 shows the performance through time of our method for a fairly dense crowd in the same 4 m \(\times\) 4 m region of the playfield. Again, for comparison we also include the plot of the fitted data in the last 40 s, noting that this calculation partially implements our method.

Fig. 7 Performances for the ‘static’ crowd scenario

Fig. 8 Full simulation, memory \(=5\)

Fig. 9 Scatter plot of the case in Fig. 8

Fig. 10 Performance of our method for very low crowd density

Fig. 11 An overview of performance through time for one simulation

Fig. 12 Performance of our method under high percentage of randomized MAC addresses

Note that so far we were assuming that the percentage of randomized addresses is \(15\%\), as observed from our real data from the sensation concert. For completeness, we also check the crowd density estimation by our method in the same 4 m \(\times\) 4 m region when the percentage of randomized addresses is high. The results, after performing again independent full simulations with total crowd size between 2000 and 64,000, are shown in Fig. 12. We can see that, although the variance of the estimation is higher when the randomization is higher, there is still good agreement between the estimation and the true value.

Analysis under a university campus scenario

With simulations we explored worst-case scenarios and showed the effectiveness of the proposed method. In this subsection we explore how the method behaves in normal conditions, with relatively low expected crowd density. We use the publicly available UJIIndoorLoc dataset [39], which has been designed for benchmarking fingerprinting methods for Wi-Fi positioning. The ground truth of the dataset is given by the GPS (Global Positioning System) locations of 25 devices worn by more than 20 participants. The participants moved across the Jaume University campus (Fig. 13). To generate positioning (“fitted”) data on which to apply our method, we use the software for Wi-Fi positioning based on RSS fingerprinting provided by [40, 41]. More concretely, we apply the affinity propagation method for clustering RSS fingerprints and the positioning method provided by [42] to generate the fits. To have more test data, and because we are not interested in obtaining high-precision fits but rather in seeing the effect of our method applied to them, we use the small test dataset in [39] for training and the bigger train dataset for testing. (This is also in line with the reasoning in the design of the dataset in [41], where the training dataset is smaller than the testing dataset.) We obtain fitted locations with an average error of 14 m. Note that this error is not of Gaussian type (it could also be due to the “twins” effect), and therefore we do not apply the kernel smoothing part of our method to the probability distributions. The memory is 0 in this low density case, and the MAC randomization factor does not apply because this was a controlled experiment.

Figure 14 shows the results of applying our method to the ‘fitted’ data. We calculated 433 windows with a time step of 30 s and a window of 40 s. The first window starts at 1,804,000 s after the first recorded timestamp in the dataset (in this period the number of detected devices was the highest). For counting people based on detected phones sending GPS locations, we also apply a window of 40 s, that is, we take a snapshot of the last positions of the devices detected in the last 40 s (because the GPS locations are also not recorded regularly).
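The snapshot counting used for the GPS ground truth can be sketched as follows. The record layout `(device_id, timestamp, x, y)` and the rectangular region are assumptions made for this illustration; they are not the format of the UJIIndoorLoc dataset.

```python
def snapshot_count(records, t_now, region, horizon_s=40.0):
    """Count devices whose latest position in the last horizon_s seconds is in region.

    records: iterable of (device_id, timestamp, x, y) tuples.
    region:  (x_min, y_min, x_max, y_max) rectangle, upper bounds exclusive.
    """
    latest = {}
    for device, t, x, y in records:
        if t_now - horizon_s < t <= t_now:
            # Keep only the most recent position of each device.
            if device not in latest or t > latest[device][0]:
                latest[device] = (t, x, y)
    x_min, y_min, x_max, y_max = region
    return sum(
        1
        for _, x, y in latest.values()
        if x_min <= x < x_max and y_min <= y < y_max
    )
```

Each device contributes at most once, through its most recent position, so a device that has just left the region is no longer counted even if it was seen there earlier in the same 40 s.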

Fig. 13 The location in the Jaume University in which measurements were taken (bounded by a red rectangle)

Fig. 14 Performance of our method w.r.t. ground truth provided by the GPS locations of phones

Analysis under a concert crowd scenario

To check how our method performs on the sensation dataset, we used video data posted online by visitors (security cameras are focused on detecting fire and are thus very sensitive to red light but not good enough for counting people). We have video frames available from the most interesting moment, towards the end of the concert, when the crowd size drops significantly in a short period of 15 min due to people leaving. From this material we extracted four time points from which counting could be done (see Fig. 15 for an overview).

Fig. 15 Overview of video frames used for manually counting people (all time points are in UTC \(+\) 2:00, on July 5, 2015). Videos and timestamps are courtesy of Jessey Helslijnen

Based on the frames selected from the video material, people are counted by manually clicking on their heads, using simple computer scripts to keep track of the number of mouse clicks. Using multiple spatial reference points, the people’s locations are projected from the perspective image to the 2D coordinate system of the Wi-Fi measurements. We compare the manual people counts to our Wi-Fi based estimates on a 112.5 m\(^2\) region R that is the intersection of the region covered by Wi-Fi and the region covered by video (the upper triangle of the 15 m \(\times\) 15 m square in Fig. 16). The region R is located on the football field. The DJ stage was positioned at the center of the field. We used a time window of 40 s and memory 0, as discussed in the “Deriving the optimal parameters values” section. The time step is 30 s. The results of the estimation via our method and the video counts through time are given in Fig. 17.
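One common way to project clicked image positions onto a flat field is a planar homography estimated from reference points. The text does not state which transformation was used, so the sketch below is an assumption: it fits a homography from exactly four point correspondences, and all names and coordinates in it are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("reference points are degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_homography(image_pts, field_pts):
    """Fit the 9 homography coefficients (last one fixed to 1) from 4 point pairs."""
    if len(image_pts) != 4 or len(field_pts) != 4:
        raise ValueError("this sketch needs exactly four reference points")
    a, b = [], []
    for (x, y), (u, v) in zip(image_pts, field_pts):
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b) + [1.0]


def project(h, x, y):
    """Map an image point (x, y) to field coordinates with homography h."""
    w = h[6] * x + h[7] * y + h[8]
    return ((h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w)


# Hypothetical reference points: pixel positions and their field positions in m.
image_pts = [(120.0, 640.0), (980.0, 655.0), (860.0, 310.0), (210.0, 300.0)]
field_pts = [(0.0, 0.0), (15.0, 0.0), (15.0, 15.0), (0.0, 15.0)]
h = fit_homography(image_pts, field_pts)
head_on_field = project(h, 500.0, 450.0)  # field position of one clicked head
```

With more than four reference points one would typically solve the same equations in a least-squares sense; the four-point version is kept here only to stay self-contained.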

Fig. 16 Visualization of (a) manual people counts and (b) the Wi-Fi-based crowd density estimate, both within the comparison region (upper triangle), at time point 05:32:04 (UTC \(+\) 2:00) (see also Fig. 15)

Fig. 17 Performances of our method w.r.t. video based counting in a 112.5 m\(^2\) region near the main stadium exit, last 15 min of the concert

Time complexity

An important practical issue for an algorithm is its time complexity with respect to the input size. Our algorithm has linear time complexity w.r.t. the crowd size, because we never consider people in relation to other people. More concretely, let us first recap the steps needed to compute the crowd density distribution: (1) compute the randomization factor; (2) discard the randomized addresses using the available flags; (3) for each of the m remaining MAC devices, compute the individual probability distribution; (4) aggregate all m probability distributions and multiply the result by the randomization factor to derive the crowd density distribution. To compute the randomization factor, one needs to keep track of the percentage of randomized addresses detected in, e.g., every minute of the last hour, i.e. to update a counter every time a new MAC address is detected in the last minute. Since the number of signals that arrive per minute is linear in the crowd size, this step has linear time complexity. To compute the individual probability distributions, one needs to keep track of the individual fits in the last time window. The computation of an individual distribution does not depend on the other MAC devices, so this step runs in linear time as well, which also holds trivially for the last step.
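Steps (2) to (4) can be sketched as a single pass over the fits of one time window, which makes the linear running time visible. This is a simplified illustration and not the implementation in [21]: it uses a plain histogram on a hypothetical 4 m grid without kernel smoothing, drops devices that have no fit inside the venue, and takes the randomization factor as an input rather than computing it.

```python
from collections import Counter, defaultdict

CELL_M = 4.0  # grid cell size in metres (hypothetical choice)


def cell_of(x, y):
    """Index of the grid cell that contains the point (x, y)."""
    return (int(x // CELL_M), int(y // CELL_M))


def density_map(fits, randomization_factor, inside):
    """Expected number of people per grid cell for one time window.

    fits:   iterable of (mac, is_randomized, x, y) location fits.
    inside: predicate telling whether a grid cell belongs to the venue.
    """
    per_device = defaultdict(Counter)
    for mac, is_randomized, x, y in fits:  # one pass over all fits
        if is_randomized:                  # step (2): discard randomized MACs
            continue
        cell = cell_of(x, y)
        if inside(cell):                   # fits outside the venue are ignored
            per_device[mac][cell] += 1
    total = Counter()
    for counts in per_device.values():     # steps (3) and (4), device by device
        mass = sum(counts.values())
        for cell, c in counts.items():
            total[cell] += c / mass        # re-normalized: each device sums to 1
    return {cell: randomization_factor * m for cell, m in total.items()}


# Hypothetical usage: 15% of devices randomized, scaled by 1 / (1 - 0.15).
fits = [("a", False, 1.0, 1.0), ("a", False, 5.0, 1.0), ("b", False, 1.0, 2.0),
        ("r", True, 1.0, 1.0)]
in_venue = lambda cell: 0 <= cell[0] < 10 and 0 <= cell[1] < 10
density = density_map(fits, 1.0 / (1.0 - 0.15), in_venue)
```

No step compares one device with another: the work is proportional to the number of fits plus the number of occupied cells per device, which is the reason for the linear complexity claimed above.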

Comparison to previous work

In “Introduction” we mentioned the benefits of using smart phones data over video based approaches for density estimation of indoor concert crowd; in fact, it is our opinion that the two approaches complement each other and in the future we plan to integrate both techniques in real time. Thus, in this section we contrast our method to previous work that estimates crowd density using wireless technologies.

A number of approaches [43,44,45] estimate crowd density based on variations of the received signal strength indicator (RSSI) values of a Wi-Fi network. Under controlled experiments they observe that the more people (obstacles) are present, the greater the RSSI variations. Fadhlullah and Ismail [43] apply analysis of variance, Yuan et al. [44] apply k-means clustering to obtain clusters of similar crowd density based on similar RSSI variations, while Yoshida et al. [45] apply linear and support vector machines regression. Our work also deals with the fact that greater crowd leads to greater RSSI variation (represented by the width of the rings in Fig. 2b). However, with our method the relative error of estimation of crowd density decreases as the (indoor) crowd increases, whereas the relative error of estimation of crowd density in the above approaches increases as the crowd increases. Having a decreasing error is essential for being able to detect dangerous crowd density.

Other approaches rely on an assumption that pedestrians move from point A to point B in order to estimate crowd density. More concretely, in Wirz et al. [7] the authors follow a participatory sensing approach in which pedestrians share their GPS locations on a voluntary basis. Since only a fraction of all pedestrians share location information, they infer the crowd density from the walking speed, based on the assumption that the maximal walking speed of pedestrians depends on the crowd density (and thus they assume that the crowd tends to reach a certain destination). A participatory sensing approach is also followed by Anzengruber et al. [8], who use time series to predict mobility patterns in crowds of spectators and relate them to the event agenda over time. Schauer et al. [9] focus their attention on airport pedestrians, without requiring crowd participation, but exploiting the predetermined direction of movement of passengers. They count unique MAC devices detected with strong signals by two sensors (nodes) at both sides (public and security) of a security check inside a major airport, to estimate pedestrian densities and pedestrian flow. Versichele et al. [10] use Bluetooth scanners at strategic locations during 10-day festivities, to analyze spatio-temporal dynamics of pedestrians. Their methodology is proximity-based, which is suitable under a mobility assumption. Delafontaine et al. [11] apply sequence alignment methods for the extraction of behavioural patterns within Bluetooth tracking data. Higuchi et al. [46] use a participation-based approach to estimate the number of people and take advantage of the fact that people move in groups to correct the estimations of individual traces.

Other approaches use additional devices distributed in the crowd to improve the density estimation. Weppner et al. [12, 47] estimate crowd densities by distributing volunteers in the crowd, who are carrying smart phones scanning for Bluetooth devices. The authors then use statistics to combine the different measurements in space and time. A recent non-participatory approach based on Wi-Fi fingerprinting is given by Tang et al. [13]. This approach proposes an online training phase (‘dynamic fingerprinting’) in addition to the offline training phase, which allows for the localization to use the current environmental settings to update the fingerprints database. This requires distributing Wi-Fi devices among the crowd to capture the latest RSS values, which however does not apply to our case of high density crowd in the playfield of a football stadium.

Other participation-based approaches include the following. Li et al. [48] use neural networks to learn the relationship between RSS and the number of people (similarly to using WiFi fingerprinting for estimating individual locations). Martella et al. [49] deploy a participation-based visitors positioning system in a museum. Some visitors wear sensor bracelets, but there are also sensors distributed across the museum. For every visitor a (discrete) probability distribution over all possible locations is maintained, after determining the most likely floor. The distribution is a weighted average of the neighboring sensors distributions, the weights corresponding to the signal strengths. Then the visitor is localized by computing the average of its spatial probability distribution. Note that, unlike the present paper, this approach implicitly assumes uni-modal spatial probability distribution. On another related note, the localization in [49] is more precise when there are more visitors wearing sensors. In our case data is collected from a fixed number of sensors; that is, our increase of precision with increased number of people is not related to increased number of sensors.
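The remark on uni-modality can be made concrete with a toy example. The sketch below is only an illustration with hypothetical coordinates: for a bi-modal distribution, such as the one produced by the “twins” effect, the probability-weighted average lands between the two modes, at a place where the device is unlikely to be.

```python
def mean_location(dist):
    """Probability-weighted average of a discrete distribution {(x, y): p}."""
    total = sum(dist.values())
    mean_x = sum(x * p for (x, _), p in dist.items()) / total
    mean_y = sum(y * p for (_, y), p in dist.items()) / total
    return (mean_x, mean_y)


# Two equally likely candidate locations, 80 m apart (hypothetical values).
bimodal = {(10.0, 20.0): 0.5, (90.0, 20.0): 0.5}
center = mean_location(bimodal)  # (50.0, 20.0): a point with no probability mass
```

Keeping the whole distribution, as done in this paper, avoids committing to such an intermediate point.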

We summarize the added values of our method over previous work for estimating crowd density using wireless technologies as follows.

  1. Our method does not require participation from the crowd;

  2. It does not assume that visitors are moving in any particular direction;

  3. It does not require extra hardware in addition to the available access points;

  4. It does not assume existence of a global optimum in the localization procedure;

  5. To the best of our knowledge, our method addresses for the first time the ambiguity problem in Wi-Fi localization in the context of crowd density estimation;

  6. To the best of our knowledge, we address for the first time the MAC address randomization while estimating crowd size based on Wi-Fi technologies;

  7. The accuracy of the crowd density estimation with our approach increases with the crowd size (without relying on participation from the crowd), which is an essential property for being able to detect critically raised crowd density. To the best of our knowledge, this property has not been addressed or shown before.

Concluding remarks

We proposed a new method for estimating indoor crowd density based on wireless technologies. Our method does not rely on participation or mobility of the crowd but rather uses a big data analytics approach. We addressed three known challenges: (1) the ambiguity of the localization procedure due to noisy RSS values, (2) the MAC address randomization when a device is in probing mode, and (3) the irregularity of the packet inter-arrival times. We used probabilistic models to address (1) and (2) and a memory-based model to address (3). We showed formally that the error of our estimation tends to zero as the crowd size increases (which is essential for enabling disaster prevention), even when the locations of visitors are correlated, as in groups of friends. We used data-driven stochastic simulations to evaluate quantitatively the effectiveness of our method for highly dense crowds, and we used two datasets to evaluate the performance on real scenarios. Finally, we positioned our work in the context of previous related work and showed the added values of our approach. To the best of our knowledge, the approach presented here is the first to address the ambiguity and MAC randomization problems in a non-participatory setting without mobility assumptions, and also the first that has effectiveness at high crowd density as one of its design principles and results.

While we propose solutions to several issues related to estimating crowd density based on wireless technologies, it is important to emphasize that methods involving human behaviour cannot be all-encompassing. This is because of the complex and unpredictable nature of the human behaviour itself [50]. For example, in “Simulations setup” section we used the Weidmann’s equation to model the velocity of the people; however, it is known that this equation varies across cultures [7, 50]. In addition, the usage of Wi-Fi at public events is likely to change over time, affecting the method parameters. Thus, it is important to note that anytime a monitoring system is deployed at a different venue, a separate calibration process is necessary in principle [7]. Furthermore, one can also include the map of the venue in the calculations, so that the individual probability distributions can be re-normalized to cover only accessible regions.

In this paper we discuss estimating crowd density to be able to detect critical density and prevent crowd disasters. We note that planning optimal evacuation and navigation of the crowd are separate research challenges and we refer the reader to [3] for a recent overview. Concretely in our case the crowd can be navigated using the large TV screens already present at the stadium, or apps that use the built-in compasses of the smartphones.

Other possibilities for future work include: integrating Wi-Fi based crowd analysis with video-based analysis, and investigating the performances under edge crowd scenarios, like a crowd crush, and under different scenarios, like a crowd of pedestrians instead of concert visitors. A real-time implementation of our prototype model is also important future work.



Abbreviations

RFID: radio-frequency identification

MAC: Media Access Control

GDPR: General Data Protection Regulation

RSS: received signal strength

AP: access point

PDF: probability density function

GPS: Global Positioning System


References

  1. Helbing D, Mukerji P. Crowd disasters as systemic failures: analysis of the love parade disaster. EPJ Data Sci. 2012;1(1):1–40.

  2. Oberhagemann D. Static and dynamic crowd densities at major public events. In: Technical report TB 13-01, Vereinigung zur Förderung des deutschen Brandschutzes. 2012.

  3. Ibrahim AM, Venkat I, Subramanian KG, Khader AT, Wilde PD. Intelligent evacuation management systems: a review. ACM Trans Intell Syst Technol. 2016;7(3):36:1–36:27.

  4. Li T, Chang H, Wang M, Ni B, Hong R, Yan S. Crowded scene analysis: a survey. IEEE Trans Circ Syst Video Technol. 2015;25(3):367–86.

  5. Krausz B, Bauckhage C. Loveparade 2010: automatic video analysis of a crowd disaster. Comput Vision Image Understand. 2012;116(3):307–19.

  6. Karpagavalli P, Ramprasad AV. Estimating the density of the people and counting the number of people in a crowd environment for human safety. In: Proceedings 2013 IEEE conference on communication and signal processing. 2013. p. 663–7.

  7. Wirz M, Franke T, Roggen D, Mitleton-Kelly E, Lukowicz P, Tröster G. Probing crowd density through smartphones in city-scale mass gatherings. EPJ Data Sci. 2013;2:5.

  8. Anzengruber B, Pianini D, Nieminen J, Ferscha A. Predicting social density in mass events to prevent crowd disasters. In: Proceedings 5th conference on social informatics (SocInfo 2013). 2013. p. 206–15.

  9. Schauer L, Werner M, Marcus P. Estimating crowd densities and pedestrian flows using wi-fi and bluetooth. In: Proceedings 11th conference on mobile and ubiquitous systems: computing, networking and services (MOBIQUITOUS 2014). 2014.

  10. Versichele M, Neutens T, Delafontaine M, de Weghe NV. The use of bluetooth for analysing spatiotemporal dynamics of human movement at mass events: a case study of the ghent festivities. Appl Geogr. 2012;32:208–20.

  11. Delafontaine M, Versichele M, Neutens T, de Weghe NV. Analysing spatiotemporal sequences in bluetooth tracking data. Appl Geogr. 2012;34:659–68.

  12. Weppner J, Lukowicz P, Blanke U, Tröster G. Participatory bluetooth scans serving as urban crowd probes. IEEE Sens J. 2014;14(12):4196–206.

  13. Tang X, Xiao B, Li K. Indoor crowd density estimation through mobile smartphone wi-fi probes. In: IEEE transactions on systems, man, and cybernetics: systems. 2018. p. 1–12.

  14. Sun W, Liu J, Wu C, Yang Z, Zhang X, Liu Y. Moloc: on distinguishing fingerprint twins. In: 2013 IEEE 33rd international conference on distributed computing systems. 2013. p. 226–35.

  15. Liu H, Gan Y, Yang J, Sidhom S, Wang Y, Chen Y, Ye F. Push the limit of wifi based localization for smartphones. In: Proceedings 18th conference on mobile computing and networking (Mobicom ’12). 2012. p. 305–16.

  16. Kannan AA, Fidan B, Mao G. Analysis of flip ambiguities for robust sensor network localization. IEEE Trans Vehicular Technol. 2010;59(4):2057–70.

  17. Akcan H, Evrendilek C. Reducing the number of flips in trilateration with noisy range measurements. In: Proceeding the 12th ACM workshop on data engineering for wireless and mobile access (MobiDE ’13). 2013. p. 20–7.

  18. Martin J, Mayberry T, Donahue C, Foppe L, Brown L, Riggins C, Rye EC, Brown D. A study of MAC address randomization in mobile devices and when it fails. Proc Privacy Enhancing Technol. 2017;4:365–83.

  19. Freudiger J. How talkative is your mobile device?: an experimental study of wi-fi probe requests. In: Proceeding 8th ACM conference on security & privacy in wireless and mobile networks (WiSec ’15). 2015. p. 8:1–8:6.

  20. Translation HJL. Personal Data Protection Act. Accessed 12 Dec 2018.

  21. Georgievska S, Rutten P. ArenA-Crowds. The Netherlands eScience Center 2018.

  22. Contributors W. Wi-fi positioning system. Wikipedia. 2018. Accessed 4 Dec 2018.

  23. Kushki A, Plataniotis KN, Venetsanoploulos AN. WLAN positioning systems: principles and applications in location-based services. New York: Cambridge University Press; 2012.

  24. Bevington PR, Robinson DK. Data reduction and error analysis for the physical sciences. 3rd ed. Boston: McGraw-Hill; 2003.

  25. Amoraal J. Thou customers, where are thou: KPMG’s Indoor. From simulation to implementation. TopQuants: Wi-Fi Tracking. 2014.

  26. Bahl P, Padmanabhan VN. RADAR: an in-building RF-based user location and tracking system. In: Proceeding IEEE conference on computer communications (INFOCOM 2000). 2000. p. 775–84.

  27. He S, Chan S-G. Wi-fi fingerprint-based indoor positioning: recent advances and comparisons. IEEE Commun Surv Tutorials. 2016;18(1):466–90.

  28. Yang Z, Liu Y, Li X. Beyond trilateration: on the localizability of wireless ad hoc networks. IEEE/ACM Trans Netw. 2010;18(6):1806–14.

  29. Contributors W. The law of large numbers. Wikipedia. 2018. Accessed 4 Dec 2018.

  30. Moussaïd M, Perozo N, Garnier S, Helbing D, Theraulaz G. The walking behaviour of pedestrian social groups and its impact on crowd dynamics. PLoS ONE. 2010;5(4):e10047.

  31. Mallah JE, Carrino F, Khaled OA, Mugellini E. Crowd monitoring. In: Proceeding third conference on distributed, ambient, and pervasive interactions (DAPI 2015). 2015. p. 496–505.

  32. Scott DW. Multivariate density estimation: theory, practice, and visualization. 2nd ed. New York: Wiley Series in Probability and Statistics; 2015.

  33. Silverman BW. Density estimation for statistics and data analysis. London: Chapman and Hall, London; 1986.

  34. Vanhoef M, Matte C, Cunche M, Cardoso LS, Piessens F. Why mac address randomization is not enough: an analysis of wi-fi network discovery mechanisms. In: Proceedings the 11th ACM on Asia conference on computer and communications security (ASIA CCS ’16). 2016. p. 413–24.

  35. Feliciani C, Nishinari K. An improved cellular automata model to simulate the behavior of high density crowd and validation by experimental data. Physica A. 2016;451:135–48.

  36. Banks J, Carson JS, Nelson BL, Nicol DM. Discrete-event system simulation, 5th edn. 2009.

  37. Weidmann U. Transporttechnik der Fussgänger: transporttechnische Eigenschaften des Fussgängerverkehrs (Literaturauswertung). Zürich: IVT Zürich; 1992.

  38. Huang L, Gong J, Li W, Xu T, Shen S, Liang J, Feng Q, Zhang D, Sun J. Social force model-based group behavior simulation in virtual geographic environments. ISPRS Int J Geo-Inform. 2018;7:2.

  39. Torres-Sospedra J, Montoliu R, Martínez-Usó A, Avariento JP, Arnau TJ, Benedito-Bordonau M, Huerta J. Ujiindoorloc: A new multi-building and multi-floor database for wlan fingerprint-based indoor localization problems. In: 2014 international conference on indoor positioning and indoor navigation (IPIN). 2014. p. 261–70.

  40. Lohan ES, Torres-Sospedra J, Richter P, Leppäkoski H, Huerta J, Cramariuc A. Crowdsourced wifi fingerprinting database and benchmark software for indoor positioning.

  41. Lohan ES, Torres-Sospedra J, Leppäkoski H, Richter P, Peng Z, Huerta J. Wi-fi crowdsourced fingerprinting dataset for indoor positioning. Data. 2017;2:4.

  42. Cramariuc A, Huttunen H, Lohan ES. Clustering benefits in mobile-centric wifi positioning in multi-floor buildings. In: Proceeding 2016 conference on Localization and GNSS (ICL-GNSS). 2016. p. 1–6.

  43. Fadhlullah SY, Ismail W. A statistical approach in designing an RF-based human crowd density estimation system. Int J Distribut Sens Netw. 2016;12(2):8351017.

  44. Yuan Y, Qiu C, Xi W, Zhao J. Crowd density estimation using wireless sensor networks. In: 2011 seventh international conference on mobile Ad-hoc and sensor networks. 2011. p. 138–45.

  45. Yoshida T, Taniguchi Y. Estimating the number of people using existing wifi access point based on support vector regression. Information. 2016;19(7):2661–8.

  46. Higuchi T, Yamaguchi H, Higashino T. Context-supported local crowd mapping via collaborative sensing with mobile phones. Pervasive Mobile Comput. 2014;13:26–51.

  47. Weppner J, Lukowicz P. Bluetooth based collaborative crowd density estimation with mobile phones. In: Proceeding IEEE conference on pervasive computing and communications (PerCom). 2013. p. 193–200.

  48. Li H, Chan ECL, Guo X, Xiao J, Wu K, Ni LM. Wi-counter: smartphone-based people counter using crowdsourced wi-fi signal data. IEEE Trans Hum Mach Syst. 2015;45(4):442–52.

  49. Martella C, Cattani M, van Steen M. Exploiting density to track human behavior in crowded environments. IEEE Commun Magaz. 2017;55:48–54.

  50. Helbing D, Johansson A. Pedestrian, crowd and evacuation dynamics. In: Encyclopedia of complexity and systems science. 2009.

Authors’ contributions

The authors contributed to the research and manuscript in the order in which they appear, the last author also being the project principal investigator. All authors discussed the final results and improved the final manuscript. In particular, SG drafted the main ideas of the proposed solution, and wrote/performed the work presented in the “Introduction”, “Localization by fingerprinting”, “Problem statement”, “Estimation under localization ambiguity” (creating statistical ensembles), ““Conservation of mass” under packet rates volatility”, “Estimation under MAC address randomization”, “Theoretical analysis under correlated groups”, “Analysis with simulations”, “Analysis under a university campus scenario”, “Concluding remarks” and “Abbreviations” sections. PR implemented the proposed method and wrote the “Localization by multilateration” and “Estimation under localization ambiguity” (computing individual probability distributions) sections and, jointly with SG, “Analysis under a concert crowd scenario”. JA wrote the “Data collection and privacy protection” section. RB and BdV performed experiments and discussed the simulation strategy jointly with SG. All authors actively participated in discussions, also in the research phase. All authors read and approved the final manuscript.


Acknowledgements

We thank the Amsterdam ArenA (presently Johan Cruijff) Innovation Center for their support. We are very thankful to Gijs van den Oord for reviewing the manuscript and for his insights, which led to its improvement. We are also very thankful to Jessey Helslijnen for his dedicated help with the video data and his unconditional enthusiasm for contributing to science.

Competing interests

The authors declare that they have no competing interests.

Availability of data and materials

Under the Dutch privacy protection law [20], we are not allowed to reveal data from Sensation about individual devices (even if they are anonymized), but we are allowed to reveal aggregated results about the crowd. The dataset used in the “Analysis under a university campus scenario” section is available at All software (in Python) related to the research results presented in this paper can be found in [21] ( Latest version of the software is available at:

Consent for publication

Not applicable.

Ethics approval and consent to participate

Not applicable.


Funding

The project has been funded by the Netherlands eScience Center.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author information

Corresponding author

Correspondence to Sonja Georgievska.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Georgievska, S., Rutten, P., Amoraal, J. et al. Detecting high indoor crowd density with Wi-Fi localization: a statistical mechanics approach. J Big Data 6, 31 (2019).
