Like other application areas of big data analytics and context-aware computing, two advanced strands of ICT of the new wave of computing, smart sustainable cities require these two related digital ecosystems and their components to be put in place across different spatial scales. They serve as the enabling technologies necessary for designing, developing, deploying, and implementing the diverse applications that support, and ideally integrate, the dimensions of urban sustainability. As scientific and technological areas, these two strands involve low-level data collection, intermediate-level information processing, and high-level application action and service delivery (e.g. [19]). It is worth noting that, as a result of the ongoing effort to realize and deploy smart sustainable cities, which are evolving due to the advance and prevalence of the enabling technologies of ICT of the new wave of computing, all three areas are under vigorous investigation in the creation of urban environments that merge the informational and physical landscapes of such cities for advancing sustainability.
There are many permutations of the core enabling technologies underlying big data analytics and context-aware computing. However, they all pertain to ICT of the new wave of computing, an integration of UbiComp, AmI, the IoT, and SenComp, which will in the near future be the dominant mode of monitoring, understanding, analyzing, and planning smart sustainable cities to improve sustainability [5, 6]. It is worth reiterating that big data analytics and context-aware computing share the same core enabling technologies because both are an integral part of ICT of the new wave of computing, as we will elucidate below. As such, they involve unobtrusive and ubiquitous sensing technologies and networks, sophisticated data management and analysis approaches, data processing platforms, cloud computing and middleware infrastructures, and advanced wireless communication technologies. These are intended to provide solutions in the form of useful, context-related knowledge for achieving the required level of sustainability in smart sustainable cities. Moreover, for solutions based on the core enabling technologies to be effective and successful, a number of design and development priorities must be selected in a planned manner prior to any deployment and implementation. For example, it is essential to consider flexible design, quick deployment, extensible implementation, more comprehensive interconnections, and more intelligence (e.g. [73]). However, while most of the core enabling technologies are general and apply to many application domains, others remain specific to the urban application domain, in particular to the special requirements and objectives of smart sustainable cities.
Pervasive sensing for urban sustainability
Collecting and measuring urban big data
In the emerging field of smart sustainable urban planning (e.g. [5, 6]), many scholars in different disciplines and practitioners in different professional domains advocate particularly the inclusion of ubiquitous sensing. Sensor ubiquity is a core feature of smart sustainable cities of the future, which rely on the fulfillment of the prevalent ICT visions of pervasive computing. Within the next 15 years or so, most of the data that will be used to monitor, understand, analyze, and plan the systems of smart sustainable cities will come from digital sensing of observations, transactions, and movements associated with the operating and organizing processes of urban life, which can provide readings on many environmental, social, economic, and physical phenomena. These data will be available in various forms, with temporal tags and geotags, coupled with a variety of data mining methods and data visualization techniques for displaying and presenting patterns and correlations. A large number of methods for collecting and capturing urban big data from new varieties of digital access are being fashioned and deployed across urban environments. Examples of digital access include the satellite-enabled GPS in vehicles and on citizens, traces left from online transactions processing and related demand-supply situations, online interactions (e.g. social media sites), numerous kinds of web sites, and online interactive data systems pertaining to crowd-sourcing. Satellite remote-sensing data are also becoming widely deployed, in addition to a variety of scanning technologies associated with the IoT. 
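To make the notion of data carrying temporal tags and geotags more concrete, the following minimal Python sketch shows one possible way of representing such observations and grouping them into space-time buckets. The `Observation` class, the `space_time_key` function, the grid cell size, and the sample coordinates are all illustrative assumptions; they are not drawn from the cited works.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
import math


@dataclass(frozen=True)
class Observation:
    source: str          # illustrative label, e.g. "gps" or "transaction"
    timestamp: datetime  # temporal tag
    lat: float           # geotag: latitude in degrees
    lon: float           # geotag: longitude in degrees


def space_time_key(obs, cell_deg=0.01):
    """Map an observation to a (grid cell, hour) bucket.

    cell_deg is an assumed grid resolution in degrees.
    """
    cell = (math.floor(obs.lat / cell_deg), math.floor(obs.lon / cell_deg))
    hour = obs.timestamp.replace(minute=0, second=0, microsecond=0)
    return cell, hour


# Invented sample observations from two kinds of digital access.
observations = [
    Observation("gps", datetime(2024, 5, 1, 8, 5), 59.3293, 18.0686),
    Observation("gps", datetime(2024, 5, 1, 8, 40), 59.3297, 18.0689),
    Observation("transaction", datetime(2024, 5, 1, 9, 15), 59.3293, 18.0686),
]

# Count how many observations fall into each space-time bucket.
counts = Counter(space_time_key(o) for o in observations)
for (cell, hour), n in sorted(counts.items()):
    print(cell, hour.isoformat(), n)
```

Bucketing of this kind is only one of many possible first steps before data mining or visualization; finer or coarser grids would suit different spatial scales.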
The convergence of these phenomena is increasingly paving the way for big data analytics (and context-aware computing) to become the dominant mode of urban analytics in relation to urban operational functioning and planning. It also paves the way for exploiting and extending a variety of data mining and machine learning techniques, through which the generation of models will be essential in a wide range of engineering solutions for advancing urban sustainability, i.e. improving the contribution of smart sustainable cities to the goals of sustainable development. Such cities are to be monitored, understood, analyzed, and planned across several spatial levels, mostly on the basis of data routinely and automatically collected by sensors. With the flourishing smart sustainable urban planning approach (e.g. [2, 5]), pervasive sensing is gaining momentum and prevalence as a new way of measuring and collecting data on urban functioning and change, from the ground up, by means of powerful sensing technologies (motion, behavior, orientation, location, etc.). At present, for instance, sensing urban change from the ground up occurs ‘through new sensing technologies that depend on hand-held and remote devices through to assembling transactional data from online transactions processing which measure how individuals and groups expend energy, use information, and interact’ ([2], p. 492) with respect to resources. Linking and meshing data from various types of sophisticated measuring devices (RFID, NFC, GPS, laser scanners, etc.) with the automation of standard secondary sources of data and unconventional data provides a rich nexus of possibilities for new and open sources of data necessary for monitoring and understanding how smart sustainable cities will function in a more effective and efficient way.
At present, the urban environment is pervaded by huge quantities of active devices of diverse kinds and forms, particularly to automate routine decisions. The fabric of smart sustainable cities is expected to be, arguably, enveloped with an electronic skin, which can be sewn together and entrenched with even more advanced embedded measuring devices, information processing systems, and communication networks. These include countless intelligent sensing and computing devices and related sophisticated and dedicated techniques and algorithms, as well as the widespread diffusion of wireless ad hoc and mobile network infrastructures and related protocols. The primary aim is to build an entirely new holistic system which supports the following:
- The acquisition and coordination of data from multiple distributed sources.
- The management and organization of data streams.
- The integration of heterogeneous data into coherent databases and their warehousing.
- The preprocessing and transformation of data.
- The management and seamless composition of extracted models and patterns.
- The evaluation of the quality of the extracted models and patterns.
- The visualization and exploration of behavioral patterns and models.
- The simulation of the mined patterns and models.
- The deployment of the obtained results for decision support and efficient service provision.
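Several of the functions listed above can be read as stages of a processing pipeline. The following is a minimal sketch in Python of four of them (acquisition, preprocessing, pattern extraction, and evaluation), using only the standard library. The function names, the record layout, and the sample energy readings are hypothetical and serve illustration only; a real system would rely on the data processing platforms discussed later in this section.

```python
from statistics import mean

# Hypothetical raw readings from two distributed sources (values are invented).
SOURCE_A = [
    {"sensor": "a1", "hour": 8, "kwh": 1.2},
    {"sensor": "a1", "hour": 9, "kwh": None},  # a missing measurement
]
SOURCE_B = [
    {"sensor": "b7", "hour": 8, "kwh": 0.8},
    {"sensor": "b7", "hour": 9, "kwh": 1.0},
]


def acquire(*sources):
    """Acquisition and coordination: merge records from several sources."""
    return [record for source in sources for record in source]


def preprocess(records):
    """Preprocessing: drop records with missing measurements."""
    return [r for r in records if r["kwh"] is not None]


def extract_pattern(records):
    """Pattern extraction: mean consumption per hour of day."""
    by_hour = {}
    for r in records:
        by_hour.setdefault(r["hour"], []).append(r["kwh"])
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


def evaluate(pattern, records):
    """Evaluation: mean absolute error of the hourly pattern against the data."""
    return mean(abs(r["kwh"] - pattern[r["hour"]]) for r in records)


records = preprocess(acquire(SOURCE_A, SOURCE_B))
pattern = extract_pattern(records)
error = evaluate(pattern, records)
print(pattern, round(error, 3))
```

Visualization, simulation, and deployment are left out of the sketch; each would consume the `pattern` produced here.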
Regardless of their scales, new sensing and computing devices are projected to be equipped with quantum-based processing capacity, unlimited memory size, and high performance communication capabilities, all linked by mammoth bandwidth and wireless (internet) connectivity as well as middleware architectures connecting several kinds of distributed, heterogeneous hardware systems and software applications [19]. All of the above is to be directed toward advancing the contribution of smart sustainable cities to the goals of sustainable development. Explicitly, future urban ICT driven by the new wave of computing will result in a blend of advanced applications, services, and computational (data) analytics enabled by constellations of instruments across several spatial scales linked via multiple networks, which can provide a fertile environment conducive to monitoring, understanding, analyzing, evaluating, and planning the sustainability of future cities.
Recent advances in sensor technology have given rise to a new class of miniaturized devices characterized by advanced signal processing methods, high performance, multi-fusion techniques, and high-speed electronic circuits. The trends toward ICT of the new wave of computing, coupled with the evolving concept of smart sustainable cities, are driving research into ever-smaller sensors capable of powerfully sensing complex and varied aspects of urban life and environment at very low cost. The production of sensing devices with a low cost-to-performance ratio is further driven by the rapid development of sensor manufacturing technologies (e.g. [19]). The increasing miniaturization of computer technology is making it possible to develop miniature on-body and remote sensors that register various human and urban parameters without disturbing citizens or interfering with urban activities, thereby enabling the commonsensical infiltration of sensors into daily urban life and environment. This is instrumental in enhancing the computational understanding and data processing of human mobility, urban dynamic processes, and urban operational functioning, a process that entails analysis, interpretation, modeling, and evaluation of big data for enhanced decision-making and deep insights. The new wave of urban computing is about the omnipresence of invisible technology in urban environments and thus in citizens’ everyday life. Countless tiny, distributed, networked sensor devices will be invisibly embedded in cities for data collection. Research in the area of micro- and nano-engineering [74] is expected to yield major shifts in ICT performance and in the way mechatronic components and devices are manufactured, designed, modeled, and implemented, thereby radically changing the nature and structure of sensing devices and thus the way cities will be monitored, understood, analyzed, probed, and planned in the near future.
Sensor-based urban sustainability mining
As part of urban reality mining (e.g. [2, 75]), urban sustainability mining, which pertains to sensing complex environmental and socio-economic systems by means of ubiquitous sensors embedded throughout urban environments, is a key determinant of how cities developing and responding to the challenge of sustainability are becoming smarter. Mining of urban sustainability depends on dedicated, powerful software applications to log urban infrastructures, spatial organizations and interactions, and mobility and travel behavior as well as ecosystem and public services. The analysis of derived large datasets helps to extract computationally complex activity, behavior, process, and environment models to identify and gain predictive insights into new forms, structures, systems, and processes as to how smart sustainable cities can increase their contribution to sustainability through enhancing urban intelligence functions for decision-making in this regard. Therefore, sensor-based big data have enormous potential to gain new insights into and drive decisions about how sustainability can be better translated into the built, infrastructural, operational, and functional forms of smart sustainable cities across several spatial scales. Further studies in this direction are most likely to enhance mobility, transport engineering, energy engineering, planning, spatial and physical structures, and data-driven characterization of urban functioning in the context of sustainability.
Sensor technologies in context-aware computing
Sensor types and sensing areas in context-aware applications
As with big data analytics, context-aware computing involves a wide variety of sensors. A sensor can be described as a device that detects or measures a physical property or some type of input from the physical environment, and then indicates or reacts to it in a particular way (e.g. [19]). The output is a signal in the form of a human-readable display at the sensor location or recorded data that can be transmitted over a network for further processing. Commonly, sensors can be classified according to the type of energy they detect as signals, and include, but are not limited to, the following types:
- Location sensors (e.g. GPS, active badges).
- Optical/vision sensors (e.g. photo-diode, color sensor, IR and UV sensor).
- Light sensors (e.g. photocells, photodiodes).
- Image sensors (e.g. stereo-type camera, infrared).
- Sound sensors (e.g. microphones).
- Temperature sensors (e.g. thermometers).
- Heat sensors (e.g. bolometer).
- Electrical sensors (e.g. galvanometer).
- Pressure sensors (e.g. barometer, pressure gauges).
- Motion sensors (e.g. radar gun, speedometer, mercury switches, tachometer).
- Orientation sensors (e.g. gyroscope).
- Physical movement sensors (e.g. accelerometers).
- Biosensors (e.g. pulse, galvanic skin response measure).
- Vital sign processing devices (e.g. heart rate, temperature).
- Wearable sensors (e.g. accelerometers, gyroscopes, magnetometers).
- Identification and traceability sensors (e.g. RFID, NFC).
While there are different ways of sensing that could be utilized for detecting various features of context, in the realm of smart sustainable cities not all of the above are of use to context-aware applications in terms of optimization, control, management, operation, and service delivery associated with sustainability dimensions. How many and what types of sensors can be used in a given context-aware application is determined by the way in which context is operationalized (defined so that it can be technically measured and thus conceptualized), that is, by the number of context entities to be incorporated in the system based on the application domain, and by whether and how these entities can be combined to generate a high-level abstraction of context (e.g. the physical, situational, behavioral, and social dimensions of context). Very often, in relation to both citizens and urban systems, various kinds of sensors are used to detect context.
Acquisition of sensor data about citizens and urban systems (energy, traffic, transport, mobility, etc.) and their behavior and functioning is an important factor, in addition to the domain knowledge used by data processing units to analyze such data. In relation to context-aware applications pertaining to citizens, data can be generated from multiple sources, including software equivalents of sensors on citizens’ devices, such as smartphones, computers, laptops, and other everyday objects. In other words, data are collected and captured from a variety of digital sensors as well as online interactive applications. Observed information about the states or situations of citizens and urban systems, in conjunction with dynamic models of their relevant processes, serves as input for the process of computational understanding. This entails the analysis and estimation of what is going on in the surrounding environment in the context of smart sustainable cities. Accordingly, for a context-aware application or system to be able to infer a high-level context abstraction based on the interpretation of and reasoning on context information, it is first necessary to acquire low-level data from physical sensors (and other sources). Researchers from different application domains within the field of context-aware computing have investigated context recognition for the past two decades or so by developing a diversity of sensing devices (in addition to methods and techniques for signal and data processing, pattern recognition, modeling, and reasoning tasks). Thus, numerous types of sensors are currently being used to detect various attributes of context.
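The step from low-level sensor data to a high-level context abstraction can be illustrated with a deliberately simple rule-based sketch in Python. The reading names (`speed_kmh`, `noise_db`), the thresholds, and the context labels are assumptions chosen for illustration; practical context recognition typically relies on the pattern recognition, modeling, and reasoning techniques mentioned above rather than on hand-written rules.

```python
def infer_context(readings):
    """Derive a high-level context label from low-level sensor readings.

    readings: dict of low-level values, e.g. from a GPS-derived speed
    estimate and a microphone. Keys and thresholds are hypothetical.
    """
    speed = readings.get("speed_kmh", 0.0)
    noise = readings.get("noise_db", 0.0)

    # Movement-related context is checked first, then the acoustic setting.
    if speed > 25:
        return "in_vehicle"
    if speed > 2:
        return "walking"
    if noise > 70:
        return "stationary_noisy"
    return "stationary_quiet"


print(infer_context({"speed_kmh": 40.0, "noise_db": 65.0}))
print(infer_context({"speed_kmh": 0.5, "noise_db": 80.0}))
```

Each rule combines atomic context values into one higher-level label, which is the operationalization step described in the preceding paragraphs in its simplest form.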
Multi-sensor data fusion and its application in context-aware applications and systems
In context-aware computing, underlying the multi-sensor fusion methodology is the idea that an abstraction of context as an amalgam of different, interrelated contextual elements can be generated or inferred on the basis of information detected from multiple, heterogeneous data sources, which provide different, yet related, sensor information. Thus, sensors should be integrated to yield optimal context recognition results, i.e. provide robust estimation of context. A given dimension of the context, a higher level of the context, can be deduced by using a number of external or internal contexts as an atomic level of the context. Figure 1 illustrates multisensor fusion for context awareness.
The use of a multi-sensor fusion approach in context-aware applications and systems gives simultaneous access to the varied information necessary for accurate estimation or inference of context. Multi-sensor fusion systems have the potential to enhance the information gain while keeping the overall bandwidth low [19].
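As a worked illustration of how combining sensors can yield a more robust estimate, the sketch below applies inverse-variance weighting, a standard statistical technique for fusing independent measurements of the same quantity. It is offered as one possible instance of multi-sensor fusion under the assumption of independent, unbiased sensors with known variances, and is not the specific method of [19]. The function name `fuse` and the sample values are illustrative.

```python
def fuse(estimates):
    """Fuse independent sensor estimates by inverse-variance weighting.

    estimates: list of (value, variance) pairs, one per sensor.
    Returns the fused value and its variance.
    """
    weights = [1.0 / variance for _, variance in estimates]
    total = sum(weights)
    value = sum(w * x for w, (x, _) in zip(weights, estimates)) / total
    return value, 1.0 / total


# Two hypothetical temperature sensors: a noisy one and a more precise one.
fused_value, fused_variance = fuse([(20.0, 4.0), (22.0, 1.0)])
print(fused_value, fused_variance)
```

The fused variance is never larger than the smallest input variance, which is one way of expressing the gain in robustness that fusion is meant to provide.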
Wireless communication network technologies and smart network infrastructures
In the context of smart sustainable cities, wireless solutions are set to proliferate in ways that are hard to imagine, as ICT continues to be rapidly embedded and interwoven into the very fabric of current smart and sustainable cities in terms of their systems and processes in an increasingly computerized urban society. This is a future world of pervasive computing infrastructures and communication networks. Countless sensors will use various wireless ad hoc and mobile networks to provide cities with all kinds of data necessary for a wide variety of applications and services. In particular, the widespread diffusion of wireless network technologies will, as a by-product of their normal operations, make it possible to sense, collect, and coordinate massive repositories of spatiotemporal data pertaining to urban systems, which represent city-wide proxies for all kinds of activities and operating and organizing processes.
Also, smart networks are necessary for big data applications in terms of connecting the components and entities of smart sustainable cities, including diverse citizens’ everyday objects (computers, smart phones, cars, house devices, etc.) and city infrastructures and facilities as well as urban departments, authorities, and enterprises. Such networks are intended to provide efficient means for transferring the collected data from heterogeneous and distributed sources to data warehouses where big data are to be stored, coalesced, organized, and integrated for processing and analysis in connection with intelligent decision support systems. This involves transferring responses back to the different citizens’ devices and urban entities’ systems for the purpose of improving different aspects of sustainability.
In relation to ICT of the new wave of computing, networking is a core enabling technology, in addition to cheap, low-power sensing and computing devices. In this context, the role of networking lies in tying hardware and software systems together for the functioning of ubiquitous applications and services in urban areas, to draw on Bibri [19]. Accordingly, many heterogeneous components and devices across dispersed infrastructures and disparate networks need to interconnect as part of vast architectures enabling big data analytics, context-aware computing, intelligence functions, and service provisioning on a hard-to-imagine scale [19]. To put it differently, wireless network technologies are a prerequisite for coordinating data as well as for linking up many diverse distributed sensing devices and computing components and enabling them to interact in the midst of a variety of hardware and software systems necessary for realizing smart urban environments for advancing sustainability. Wireless technologies, especially satellite-enabled GPS, Wi-Fi, and mobile phone networks, make it possible to sense, collect, and coordinate massive environmental and socio-economic data representing enormous proxies for the operations, functions, and services of smart sustainable cities, and thus act as powerful physical-environmental and socio-behavioral microscopes (e.g. [6]). By means of big data analytics (data mining and database integration capabilities), which offers the prospect of adding value through massive data analysis and integration, this may facilitate the discovery of hidden patterns, correlations, and models. These characterize, on the one hand, human mobility and movement as part of the daily trajectories and activities of citizens and, on the other hand, physical structures and spatial organizations, and they can be instrumental in strategic decision-making associated with urban sustainability planning (see [6]).
In all, while pervasive sensing and computing infrastructures allow for monitoring, understanding, and analyzing urban life in terms of infrastructure, built form, administration, and ecosystem and human services, pervasive networking infrastructures allow for collecting and coordinating extensive data in terms of how these data are stored, made accessible, and utilized.
In the context of smart sustainable cities, advanced digital networks are crucial to urban operational functioning and planning due to the interrelationships between urban components and domains that are too many to catalogue (transport, mobility, communication, building, energy, environment, water, waste, land use, healthcare, etc.). These domains are planned to be further heavily networked, while the activities relating to them are to be linked up. The key domains ‘which currently are being heavily networked involve: transport systems of all modes in terms of operation, coordination, timetabling, utilities networks which are being enabled using smart metering, local weather, pollution levels and waste disposal, land and planning applications, building technologies in terms of energy and materials, health information systems in terms of access to facilities by patients the list is endless. The point is that we urgently need a map of this terrain so that we can connect up these diverse activities’ ([2], p. 493). In particular, the evolving techno-urban contexts are opening spaces for smart sustainable initiatives in domain networking at a time of tension, as alternative trajectories are actively being sought due to the challenge of sustainability, which entails creating innovative solutions that further facilitate collaboration among urban domains and hence integrate urban systems.
In parallel, the aim of emerging technological platforms such as UbiComp, AmI, the IoT, and SenComp is to orchestrate and coordinate the various computational entities in the informational landscape of smart sustainable cities and to merge it with their physical landscape into an open system that helps diverse urban entities cope with and plan their activities in relation to improving sustainability. Besides, the growing depth, scale, and complexity of urban networks in terms of both domains and technological infrastructures call for developing and coordinating such networks and enhancing their digital capabilities in ways that increase and sustain the contribution of smart sustainable cities to the goals of sustainable development. Advanced wireless technologies are extremely well placed to initiate this development and coordination. Moreover, with their ever-growing volume, variety, velocity, and timeliness, data on the state of urban networks as built artifacts, as well as on their use as part of urban activities and processes, provide enormous potential to improve urban operational functioning and planning (see, e.g. [1, 2]) in terms of sustainability, efficiency, and the quality of life by exploiting the analytical power of big data for deep insights and enhanced decision-making. Using these data effectively when implementing big data applications in smart sustainable cities requires that they be supported by advanced wireless technologies, especially in relation to real-time applications. The rationale is that such applications require the data from distributed sources to be aggregated and fused prior to being transferred in real time to cloud computing infrastructures or data processing platforms for stream processing and decision-making.
It is important to note that the aggregation and fusion should be carried out in ways that keep the data reliable, accurate, and correct, so that they yield more effective results and thus beneficial knowledge for decision-making processes. This is in turn of critical importance for maintaining the quality and performance of real-time big data applications [76].
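The idea of aggregating readings from distributed sources before transfer, while guarding data quality, can be sketched as follows in Python. The function `aggregate_window`, the assumed validity range for temperature readings, and the sample values are hypothetical; they illustrate the principle only and do not reproduce the approach of [76].

```python
from statistics import median


def aggregate_window(readings, valid_range=(-40.0, 60.0)):
    """Summarize one time window of readings before transmission.

    Readings outside valid_range (an assumed plausibility interval) are
    treated as faulty and dropped, so they do not distort the summary.
    Returns None when no valid reading remains.
    """
    low, high = valid_range
    valid = [r for r in readings if low <= r <= high]
    if not valid:
        return None
    return {
        "count": len(valid),
        "dropped": len(readings) - len(valid),
        "median": median(valid),
        "min": min(valid),
        "max": max(valid),
    }


# Four hypothetical temperature readings from nearby nodes; one is faulty.
summary = aggregate_window([21.0, 21.4, 999.0, 20.8])
print(summary)
```

Sending the compact summary instead of every raw reading reduces the volume transferred to the processing platform, while the `dropped` count keeps a trace of how much data was discarded.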
Data processing platforms for big data analytics
There is a variety of available data processing platforms for big data analytics, which provide the stream processing required by real-time big data applications in relation to various urban domains. Therefore, data processing platforms are a key component of the ICT infrastructure of smart sustainable cities of the future with respect to big data applications. The leading platforms for big data storage, processing, and management include Hadoop MapReduce, IBM Infosphere Streams, Stratosphere, Spark, and NoSQL database management systems (e.g. [1, 28, 53, 60, 62, 63]). These platforms work well on cluster systems to meet the requirements of big data applications for smart sustainable cities; entail scalable, evolvable, optimizable, and reliable software and hardware components; and provide high performance computational and analytical capabilities (namely selection, preprocessing, transformation, mining, evaluation, interpretation, and visualization), in addition to storage, coordination, and management of large datasets across distributed environments. As ecosystems, they perform big data analytics related to a wide variety of large-scale applications intended for different uses associated with the process of sustainable urban development, such as management, control, optimization, assessment, and improvement, thereby spanning a variety of urban domains and subdomains. In all, they are a prerequisite for data-centric applications for smart sustainable cities of the future. The focus on Hadoop MapReduce is justified by the suitability of its functionalities for handling urban data as well as by its advantages in load balancing, cost effectiveness, flexibility, and processing power compared to other data processing platforms. Hadoop MapReduce has become the primary big data storage and processing system given its simplicity, scalability, and fine-grain fault tolerance [59].
For example, it is capable of handling all data types collected from multiple sources to derive actionable insights. However, it does pose issues regarding processing efficiency, rigid data flow, and low-level abstraction. NoSQL (e.g. MongoDB and Cassandra) is also fast becoming a choice for storing and sorting structured and unstructured data and clustering them with greater efficiency and scalability.
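The programming model behind MapReduce can be illustrated in a few lines of plain Python. The sketch below mimics the map, shuffle, and reduce phases in a single process to total energy consumption per district; it does not use the Hadoop API, and the function names, district labels, and values are illustrative assumptions. On a real cluster, the map and reduce functions would run in parallel across many nodes and the framework would perform the shuffle.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (key, value) pair for each input record."""
    district, kwh = record
    yield district, kwh


def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


# Hypothetical smart-meter records: (district, energy in kWh).
records = [("north", 3.0), ("south", 1.5), ("north", 2.0)]

pairs = [pair for record in records for pair in map_phase(record)]
totals = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(totals)
```

The fixed map-shuffle-reduce sequence also hints at the rigid data flow and low-level abstraction noted above: any analysis has to be expressed in these two functions.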
Cloud computing for big data analytics: characteristic features and benefits
Big data analytics can also be performed in the cloud. This involves both big data platform as a service (PaaS) and infrastructure as a service (IaaS) (e.g. [77]). Having attracted attention and gained popularity worldwide, cloud computing is becoming increasingly a key part of the ICT infrastructure of both smart cities and sustainable cities (e.g. [1, 5, 7, 53, 66, 67, 71]) as an extension of distributed and grid computing due to the prevalence of sensor technologies, storage facilities, pervasive computing infrastructures, and wireless communication networks. Especially, most of these technologies have become technically mature and financially affordable by cloud providers. By commoditizing services, low cost open source software, and geographic distribution, cloud computing is becoming increasingly an attractive option [78].
Big data analytics is associated with cloud computing (e.g. [1, 77]; [79], an Internet-based computing model that is increasingly seen as the most suitable solution for highly resource intensive and collaborative applications as an on-demand network access to a shared pool of computing resources (memory capacity, energy, computational power, network bandwidth, interactivity, etc.) [1, 7, 80]. This entails that computer-processing resources, which reside in the cloud, are virtualized and dynamic, which implies that only display devices for information and services need to be physically present in relation to urban domains where diverse stakeholders (administrators, planners, landscape architects, sustainability strategists, authorities, citizens, etc.) can make use of software applications and services to improve sustainability. Such stakeholders can access cloud-based software applications through a web browser and a lean client (a computer program that depends on its server to fulfill its computational roles) or mobile devices while software tools and urban data of all kinds are stored on servers at a remote location. Indeed, cloud computing model is based on hosted services in the sense of application service provisioning running client server software locally. In this respect, smart sustainable city applications pertaining to transport, traffic, mobility, energy, public health, civil security, education, and so on reside ‘in the cloud’ and can be accessible per demand. Moreover, the software development platform can be offered in a public, private, or hybrid network, where the cloud provider manages the platform that runs the applications and relieves the cloud clients from the burden of securing dedicated platforms, which would otherwise be very demanding and costly in terms of resources and time. The cloud clients can accordingly benefit from tested, scalable, reliable, and maintainable platforms offered by the cloud provider. 
Another advantage involves service process optimization through advanced functionalities of software development platforms, namely flexibility, interoperability, reusability, scalability, and cooperation. There is also a great opportunity to slash or minimize energy consumption associated with the operation of ICT infrastructure, especially when it comes to large-scale deployments like in the case of smart sustainable cities as to different departments and service agencies. Beloglazov et al. [81] develop policies and algorithms that aim at increasing energy efficiency in cloud computing. Energy consumption is way too lower than if all urban entities have their own software development platforms. These are indeed shared by multiple users as well as dynamically reallocated per demand. This approach maximizes the use of computational power and reduces energy usage and thus mitigate GHG emissions associated otherwise with powering a variety of functions as well as data centers dispersed throughout the departments and service agencies of smart sustainable cities. Whether public or private, the cloud provider includes the cloud environment’s servers, storage, networking, and data center operations. This implies that the cloud provider has the actual energy-consuming computational resources; users or clients can simply log on to the network without installing anything, thereby curbing energy usage and making the best of the available computational power. Energy efficiency in cloud computing can result from energy-aware scheduling and server consolidation [82]. Mastelic et al. [83] provides a survey on energy efficiency in cloud computing. Also, cloud computing is seen as a form of green computing, especially if it is based on renewable energy like solar panels. 
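Server consolidation can be viewed as a bin-packing problem: workloads are placed on as few servers as possible so that idle servers can be switched off or put to sleep. The sketch below uses the generic first-fit decreasing heuristic as an illustration of this idea; it is not the set of policies and algorithms developed in [81] or surveyed in [82, 83], and the function name `consolidate`, the normalized capacity, and the sample loads are assumptions.

```python
def consolidate(loads, capacity=1.0):
    """Pack workloads onto servers with the first-fit decreasing heuristic.

    loads: workload sizes as fractions of one server's capacity.
    Returns a list of servers, each a list of the loads placed on it.
    """
    servers = []
    for load in sorted(loads, reverse=True):
        for server in servers:
            # Small tolerance guards against floating-point rounding.
            if sum(server) + load <= capacity + 1e-9:
                server.append(load)
                break
        else:
            # No existing server has room, so a new one is switched on.
            servers.append([load])
    return servers


# Five hypothetical departmental workloads that might otherwise run on
# five separate, lightly used machines.
placement = consolidate([0.5, 0.2, 0.4, 0.3, 0.6])
print(len(placement), placement)
```

First-fit decreasing does not guarantee the minimum number of servers in general, and real energy-aware schedulers must also weigh migration costs and service-level constraints, which the sketch ignores.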
Cloud computing has other intuitive benefits because it relies on sharing resources and maximizing the effectiveness of the shared resources, thereby reducing the costs otherwise incurred by ICT operations in terms of human, technical, and organizational resources. In cloud computing, supercomputers in large data centers, operating as a distributed system of many servers, are used to deliver services in a scalable manner as well as to enable the storage and processing of vast quantities of data. Cloud computing offers great opportunities for streamlining data processing [84]. In all, cloud computing constitutes an efficient and elegant solution for meeting the huge demand for computing resources associated with big data analytics for decision-making processes in relation to the operational functioning and planning of cities in terms of sustainability. Through the use of cloud computing, smart sustainable cities can accordingly perform more effectively and efficiently thanks to the advanced technological features underlying the functioning of the cloud computing model.
In addition, cloud computing performs service-oriented computing. In this regard, it can rapidly process large and complex data produced from urban activities and simultaneously serve citizens in relation to healthcare, education, housing, utility, and so on, providing a kind of integrated and specialized center for information services to both the general public and urban departments across various urban domains. In light of this, with reference to smart sustainable cities, cloud computing has the ability to run smart applications on many connected computers and smartphones at the same time for different purposes associated with increasing sustainability performance.
In sum, the key advantages provided by cloud computing technology include cost reduction, location and device independence, virtualization (sharing of servers and storage devices), multi-tenancy (sharing of costs across a large pool of the cloud provider’s clients), scalability, performance, reliability, and maintenance. Therefore, opting for cloud computing to perform big data analytics in the realm of smart sustainable cities remains thus far the most suitable option for the operation of infrastructures, applications, and services whose functioning is contingent upon how urban domains interrelate and collaborate, how efficient they are, and to what extent they are scalable with respect to achieving and maintaining the required level of sustainability.
Middleware infrastructure for context-aware computing: characteristics and functions
Middleware infrastructure is associated with pervasive computing environments and distributed applications. These encompass UbiComp, AmI, and SenComp environments and applications. Middleware infrastructure (e.g. [44, 47, 48]) plays a key role in the functionalities of complex distributed applications, including context-aware applications. Thus, context-aware computing, which is associated with UbiComp, AmI, and SenComp, requires middleware infrastructure to operate. This infrastructure can also run on cloud computing [platform as a service (PaaS) and infrastructure as a service (IaaS)]—i.e. cloud middleware.
Middleware infrastructure represents the logic glue in a distributed computing system, as it connects and coordinates many components constituting distributed applications. This occurs, more specifically, ‘in the midst of a variety of heterogeneous hardware systems and software applications needed for realizing smart environments and their proper functioning. To put it differently, in order for the massively embedded, distributed, networked devices and systems, which are invisibly integrated into the environment, to coordinate require middleware components, architectures, and services. Middleware allows multiple processes running on various sensors, devices, computers, and networks to link up and interact to support (and maintain the operation of context-aware applications needed by citizens and urban entities to cope with and perform their) activities wherever and whenever needed.’ ([19], p. 50). Indeed, it is the ability of multiple, heterogeneous hardware and software systems to cooperate, interconnect, and communicate seamlessly across disparate networks that creates smart environments, rather than just their ubiquitous presence and massive use. In the context of smart sustainable cities, such systems in their various forms (e.g. sensors, smartphones, computers, databases, data warehouses, application integration methods, application servers, web servers, context management systems, and messaging systems) are highly distributed, interoperable, and dynamic, involving a myriad of embedded devices and information processing units ‘whose numbers are set to increase by orders of magnitude and which are to be exploited in their full range to transparently provide services on a hard-to-imagine scale, regardless of time and place’ [19]. This in turn allows for the functioning of context-aware applications across the diverse domains of smart sustainable cities.
There are different approaches to conceptualizing middleware. According to Schmidt [46], middleware consists of the following four distinct layers based on their intended functionality:
(1) Host-infrastructure middleware
(2) Distribution middleware
(3) Common middleware services
(4) Domain-specific middleware services.
Another conceptualization of middleware entails a common multilayer architecture in which each layer provides particular functionalities and constitutes the basis for upper layers of greater abstraction. It includes the following components:
- Infrastructure and communications (messaging services) pertaining to entities of the upper layer
- Services and agents related to semantic descriptions
- Middleware services concerned with the software environment
- Intelligence associated with the coordination of application actions and involving a number of devices in the environment.
As regards some of its characteristic features compared to cloud computing, middleware-based architectures entail reusable software infrastructure that resides between the application programs (in this case context-aware applications) and the underlying hardware and operating systems. That is to say, middleware sits between the kernel and applications. Incidentally, the functionality of network protocol stacks (TCP/IP) was previously provided separately by middleware, but nowadays is integrated in every operating system. Moreover, middleware simplifies and supports the development of complex distributed applications, using such tools as web servers, application servers, messaging systems, and content management systems. These applications collaborate with, or leverage services from, other disparate applications that are systematically tied together using methods of application integration. In addition to handling the distribution and heterogeneity of computing resources associated with the logic of context-aware applications in this context, middleware is intended to bridge the gap between the applications and the underlying lower-level hardware and software infrastructure to ensure and boost coordination, cooperation, interconnection, dynamicity (e.g. sensors join and leave AmI infrastructure in a dynamic fashion), and interoperability of the different components of distributed applications (e.g. [19, 45, 85]). These functionalities are in fact necessary for supporting scalable systems as well as highly heterogeneous and distributed components, such as agents and services. In relation to this, middleware supports and deploys data-centric distributed systems (e.g. network-monitoring systems, sensor networks, and dynamic web) whose ubiquity creates large application networks spreading over large geographical areas [19]. AmI and UbiComp infrastructures in particular are highly dynamic and involve a high degree of heterogeneity (e.g. [86]).
As to interoperability, for instance, context-aware applications run on different operating systems; hence the role of middleware in enabling interoperability between applications by supplying services for exchanging data in a standard way. Indeed, in the realm of context-aware computing, which entails distributed processing in the sense of multiple applications being connected to create larger applications over a network, middleware provides services beyond those available from the operating systems on which these applications run, enabling the various elements of the underlying distributed system to communicate and manage data and thereby serving as a kind of software glue. Therefore, distributed processing is empowered by middleware for transferring signals from various sources and for realizing information fusion from multiple perceptive components [44].
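The role of middleware as software glue that exchanges data in a standard way can be illustrated with a toy publish/subscribe broker. This is only a sketch under stated assumptions: the `Broker` class, the topic name, the sensor identifier, and the alert threshold are hypothetical, JSON stands in for whatever standard exchange format a real deployment would adopt, and an actual middleware layer would deliver messages across a network rather than inside one process.

```python
import json
from collections import defaultdict
from typing import Callable, DefaultDict, List


class Broker:
    """Toy in-process message broker standing in for a middleware layer."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> int:
        # Serialise to JSON so that producers and consumers share one wire
        # format, whatever platform each of them runs on.
        wire = json.dumps(payload)
        for handler in self._subscribers[topic]:
            handler(json.loads(wire))
        return len(self._subscribers[topic])


broker = Broker()
received = []
# Two independent consumers of the same sensor stream (both hypothetical).
broker.subscribe("air-quality", received.append)
broker.subscribe("air-quality", lambda msg: print("alert" if msg["pm25"] > 35 else "ok"))
delivered = broker.publish("air-quality", {"sensor": "s-17", "pm25": 41.5})
```

The publisher never refers to its consumers directly, which is what lets components join and leave without the others having to change.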
Middleware and cloud computing infrastructures differ in their technical details as to how they provide application services and which kind of services they are concerned with, as well as in the characteristic features of their operation and complexity. Yet, they denote computing models where machines in large data centers across distributed environments can be used to deliver a variety of services and meet the needs of different urban constituents in terms of the use of big data and context-aware applications for improving sustainability. Hence, both are prerequisites for the operation of smart sustainable cities. This is anchored in the underlying assumption that big data and context-aware applications are an integral part of ICT of the new wave of computing, and smart sustainable cities typically rely on the fulfillment of its underlying visions.
Big data management
Given the volume, variety, and velocity characterizing big data, effective and suitable big data management tools are extremely important to ensure a useful utilization of big data in terms of analytics and the related results and inferences. Accordingly, as smart sustainable cities involve the generation of large, varied, and time-based data pertaining to such urban domains as transport, traffic, mobility, energy, environment, land use, healthcare, education, and so on, extensive data management capabilities are necessary to make sense of these data. The field of urban sustainability in particular requires these domains to be interrelated and coordinated so that they collaborate and inform one another. In this respect, the urban data are generated on a regular basis in the form of massive repositories, i.e. huge amounts of data on environmental and socio-economic aspects of urban areas, which provide a powerful microscope of, and a real-time view of what is happening in, the city as to sustainability performance across several spatial scales and over multiple temporal scales. A successful utilization of these valuable data in smart sustainable cities requires advanced big data management tools and methods. This entails the development and implementation of scalable and powerful architectures, best practices, and dedicated computational processes for properly managing the data lifecycle throughout the various phases of data use, particularly in terms of addressing the issue of variety and velocity, i.e. recognizing the different formats and sources of data as well as organizing, cataloguing, classifying, and controlling all classes and structures of data. In addition, for smart sustainable city applications, big data management should provide tools for scalable handling of massive data to serve real-time applications and support offline applications (see [1]). For the interested reader, there are several studies that have addressed the topic of big data management (e.g. [84, 85, 89]) in terms of concepts, approaches, techniques, and challenges.
Advanced big data analytics techniques and algorithms
In smart sustainable cities, big data analytics should involve highly sophisticated and dedicated techniques and algorithms (data mining, machine learning, statistics, database query, etc.) that can perform complex computational processing of data for timely and accurate decision-making purposes. Traditional techniques and algorithms are inadequate for handling big data associated with smart sustainable city applications due to the high volume, high variety, and high velocity of these data. Urban big data necessitate high-speed processing power and high performance to obtain the useful results necessary to enhance decision-making pertaining to the operational functioning and planning of smart sustainable cities. Therefore, existing techniques and algorithms need to be improved in ways that can handle the extreme volume of data, the wide variety of data types, and the time constraints on data processing. In particular, data mining algorithms and techniques in their conventional form are ill-suited to handling big data because they are designed to deal with limited and well-defined datasets (e.g. [90]). In the context of smart sustainable cities, such techniques and algorithms need to be exploited, enhanced, and extended in order to yield the desired outcomes in terms of extracting the useful knowledge (patterns and correlations) necessary for improving sustainability performance (see, e.g. [2, 5, 6]). Alternative or novel solutions in this regard need to be designed with more scalability and flexibility to handle dynamic and real-time aspects of big data applications for smart sustainable cities, among other things. Moreover, they are to operate as an integral part of cloud computing (PaaS) and thus collaborate across diverse networks for aggregating, fusing, processing, analyzing, and visualizing data collected from countless sensing devices from multiple sources, stored in massive repositories, and coordinated through smart networks.
In other words, they need to work effectively across disparate networks, dispersed infrastructures, distributed geographical locations, and heterogeneous computing environments, as well as to be capable of operating in highly scalable and dynamic settings. New approaches to storing, managing, coordinating, and analyzing big data, in particular in relation to smart sustainable city applications, should rely on advanced artificial intelligence programs and machine learning techniques. This is in contrast to loading big data into traditional relational databases for analysis, a process that relies on a data schema and is time consuming and computationally expensive. For a detailed account of big data analytics techniques and algorithms from a general perspective, the interested reader might want to read Provost and Fawcett [57]. For a relevant account of data mining techniques and algorithms, the interested reader is directed to Barbi [55]. Also, Chen et al. [58] provide a thorough survey on data mining techniques and algorithms.
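As one small illustration of what handling velocity under time constraints can mean in practice, the sketch below computes the mean and variance of a stream in a single pass with Welford's online algorithm, so that each reading is processed once and never stored. It is offered only as an example of the general idea, not as a technique proposed in the works cited above; the class name and the sample speed readings are hypothetical.

```python
class RunningStats:
    """Single-pass mean and sample variance (Welford's online algorithm)."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the deviation from the old and the new mean, which keeps the
        # update numerically stable.
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


# Hypothetical vehicle speeds (km/h) arriving one by one from a road sensor.
stats = RunningStats()
for speed in [42.0, 38.5, 40.0, 12.0, 41.5]:
    stats.update(speed)
print(round(stats.mean, 2), round(stats.variance, 2))
```

Memory use stays constant however long the stream runs, which is the property that batch-oriented techniques designed for limited datasets lack.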
Privacy mechanisms and security measures
It is highly important to ensure that all technological components associated with big data and context-aware applications for smart sustainable cities are supported by security measures and privacy mechanisms. It is essential to control big data [91] and context data [92]. Massive repositories of urban data are at stake, and failure to protect these data will pose risks and threats to the functioning of smart sustainable cities as well as to the safety and well-being of their citizens on several scales. Therefore, security measures and privacy mechanisms should be at the core of urban policy and governance practice associated with the design, development, deployment, and implementation of big data and context-aware applications within smart sustainable cities. Any attempt at unauthorized access, malicious attack, or abuse of information on citizens, infrastructures, networks, and facilities can compromise the integrity of such applications and related services. Smart sustainable cities generate colossal amounts of data on virtually every urban process, which are to be stored, processed, and shared. Urban environments ‘are now being continually forged and re-forged in (sensorial), informational, and communicative processes. It is a world where…cities think of us, where the environment reflexively monitors our behavior’ ([93], p. 1), including whether and the extent to which we behave in a sustainable way through the activities we perform in cities.
However, it is a commonly held view that the more cities think of and know about us, and the more technologies monitor urban environments and collect information, ‘the larger becomes the privacy threats, and the larger…the networks, the higher the security risks’ ([92], p. 218). When sensing, computing, and networks become ubiquitous, ‘when everything is embedded with intelligence and connected to everything else via the internet and other networks, the threats and vulnerabilities will become even greater than they are nowadays’ ([92], p. 218). There is a need for technological safeguards as a response to the risks posed by the emerging urban trends of big data analytics and context-aware computing. Clear guidelines, recommendations, and requirements must be identified and put in place in relation to big data and context-aware applications for smart sustainable cities. The privacy mechanisms proposed thus far include ‘anonymity, pseudonymity, unlinkability, and unobservability’, yet they need to be ‘fully developed, evaluated, and instantiated in their operating environment to test their performance—how well they work’ [92]. Big data and context-aware applications for smart sustainable cities require the development of more robust, if not unconventional, privacy-protecting safeguards by considering the most likely ways through which the information from different urban domains can be leaked and breached. As regards security, the scientific challenges ‘include methods supporting the evaluation of risk exposure…, security design principles to enable control of the risk exposure, methods for…security analysis, security of big (and context) data…, secure cloud of physical and smart things, cyber physical systems security, lightweight security solutions, authentication and access control…, identification and biometrics…, cyber-attacks detection and prevention, and so on’ ([92], p. 223–4).
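Of the privacy mechanisms listed above, pseudonymity is perhaps the simplest to illustrate. The sketch below replaces a citizen identifier with a keyed hash (HMAC-SHA256) before a record is stored or shared. It is a minimal example under stated assumptions rather than a mechanism taken from [92]: the function name, the key, and the identifier format are hypothetical, and a real deployment would need proper key management.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return a stable pseudonym that cannot be reversed without the key."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key; in practice it would come from a secrets store, not source code.
transport_key = b"demo-key-not-for-production"

record = {"citizen_id": "SE-19840312-1234", "trip_km": 7.4}
shared = {"pseudonym": pseudonymize(record["citizen_id"], transport_key),
          "trip_km": record["trip_km"]}
print(shared["pseudonym"][:12], shared["trip_km"])
```

Within one key the same person always maps to the same pseudonym, so records remain linkable for analysis. Pseudonymity of this kind therefore does not by itself deliver unlinkability or anonymity; using a separate key for each urban domain is one way to prevent pseudonyms from being joined across domains.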
While information security risks are of a diverse nature, including ‘modification, destruction, theft, or lack of availability of computer assets such as hardware, software, data, and services’ ([94], p. 442), integrity and confidentiality (i.e. protection of information from modification and unauthorized use) should be more of a focus as categories of security threats in the networks of ICT of the new wave of computing than in traditional networks [92]. This is due to the fact that there are ‘possible conflict of interests between communicating entities; network convergence; large number of ad-hoc communications; small size and autonomous mode of operation of devices; and resource constraints of mobile devices’ ([95], p. 50). Of critical importance, nevertheless, is to develop a new security paradigm which supports advanced features of context-aware technology, as conventional password entry schemes using traditional input devices have proven to be vulnerable to attacks. To address these issues, new research endeavors are focusing on such new techniques as authenticating with minds; pointing and selection using gaze and keyboard; and gaze-based user authentication [92]. In relation to this, Wright et al. [95] suggest some research directions, including ‘improving access control methods by multimodal fusion, context-aware authentication and unobtrusive biometric modalities’, and ‘increasing security by detection of unusual patterns.’
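The last research direction, detection of unusual patterns, can be sketched with a simple robust outlier test. The example below flags observations whose modified z-score, based on the median and the median absolute deviation, exceeds a threshold. It is an illustrative stand-in and not the approach of [95]; the function name and the login counts are hypothetical, and the threshold of 3.5 is a commonly used rule of thumb rather than a tuned value.

```python
from statistics import median
from typing import List, Sequence


def unusual_indices(values: Sequence[float], threshold: float = 3.5) -> List[int]:
    """Return positions whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All deviations are (almost) identical; the score is undefined.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical hourly counts of failed logins on a city service portal.
failed_logins = [3, 4, 2, 5, 3, 4, 60, 3, 2, 4]
print(unusual_indices(failed_logins))
```

Median-based statistics are used here because a single extreme value would inflate an ordinary mean and standard deviation and could thereby mask itself.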
Standards and open standards
It is important to follow standards when it comes to data integration to make sense of data proliferation as well as to ensure data quality. Standard rules are also needed for evaluating the accuracy and correctness of data and for dealing with such issues as uncertainty and incompleteness of data, especially in relation to real-time big data (and context-aware) applications which, in case of missing and inconsistent data, require the data to be described using advanced models of the very urban systems that the data are associated with. It is of equal importance to set and comply with standard rules with respect to new applications for advancing urban sustainability to achieve seamless integration between the available urban systems (in terms of infrastructural, physical, operational, and functional forms) and the introduced big data (and context-aware) applications across diverse urban domains. In this regard, the way forward is to carry out a thorough investigation of the different urban entities and actors as well as the infrastructure, built form, administration, and ecosystem and human services as to their operation as urban systems to strategically assess the benefits of new solutions and the readiness of urban stakeholders to join any smart movement associated with improving urban sustainability. In light of such investigation, new practices, regulations, and standard models of design and rules can be developed for big data and context-aware applications for smart sustainable cities.
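What standard rules for evaluating the correctness and completeness of data might look like can be sketched as a small, shared rule table against which every incoming record is checked. The field names, value ranges, and function name below are hypothetical and serve only to illustrate the idea; they are not drawn from any existing standard.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical rule set: field name -> (required, lowest allowed, highest allowed).
RULES: Dict[str, Tuple[bool, float, float]] = {
    "temperature_c": (True, -50.0, 60.0),
    "humidity_pct": (True, 0.0, 100.0),
    "noise_db": (False, 0.0, 140.0),
}


def check(record: Dict[str, Optional[float]]) -> List[str]:
    """List every rule violation found in one sensor record."""
    problems = []
    for field, (required, low, high) in RULES.items():
        value = record.get(field)
        if value is None:
            # Incomplete data: only required fields count as a problem.
            if required:
                problems.append(f"{field}: missing")
        elif not low <= value <= high:
            # Inconsistent data: the value is outside the agreed range.
            problems.append(f"{field}: {value} outside [{low}, {high}]")
    return problems


print(check({"temperature_c": 21.5, "humidity_pct": 130.0}))
print(check({"humidity_pct": 40.0}))
```

Keeping the rules in one declarative table, rather than scattered through application code, is what would allow different urban departments to apply the same checks to the same kinds of data.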
Concerning other areas related to big data and context-aware applications for smart sustainable cities, it can be advantageous to pursue open standards for designing and implementing solutions with respect to various urban domains, as such applications involve large-scale and heterogeneous data systems. The rationale behind open standards in this respect is to provide some flexibility for scaling up, upgrading, improving, and maintaining applications for smart sustainable cities, as new challenges are most likely to emerge and thus operative solutions may become inadequate to handle potential complexities and difficulties as to translating sustainability into the built, infrastructural, operational, and functional forms of such cities.