
Table 2 Column-oriented databases

From: Modeling temporal aspects of sensor data for MongoDB NoSQL database

| Name | Data model | Scalability | Description | Who uses it |
| --- | --- | --- | --- | --- |
| Cassandra from Facebook, Apache [57] | Multi-dimensional; a column family is a set of rows; partition key (covering one or more rows): identifies a partition; row key: identifies a row in a column family | Partitioning, replication, availability | Multi-master with no single point of failure; in-memory with disk persistence; masterless; query methods: CQL and Thrift; MapReduce; secondary indexing; eventual consistency | CERN, Comcast, GitHub, GoDaddy, Hulu, eBay, Netflix |
| HBase from Apache [72] | Tables have rows and columns; row: a row key and one or more columns; column: consists of a column family | Auto sharding; asynchronous replication; availability | BigTable based; Hadoop Distributed File System (HDFS); MapReduce; consistent reads/writes; failover support; Thrift and RESTful; ZooKeeper | Adobe, Kakao, Facebook, Flurry, LinkedIn, Netflix, Sears |
| Druid from http://druid.io | Columns are one of three types: a timestamp, a dimension, or a measure; nested dimensions | Low latency; replication; sharding | Highly optimized for scans and aggregates; MapReduce; fault-tolerant; ZooKeeper; index structures; not ACID | Alibaba, Cisco, eBay, Netflix, PayPal, Yahoo |
| Accumulo from Apache [73] | Keys and values are both byte arrays, timestamp is a long; adds column visibility as a new key element; sorts keys by element; secondary indexes | Sharding, replication, persistence, fault tolerance | BigTable-based Java technology on top of Hadoop, ZooKeeper and Thrift; MapReduce; ZooKeeper (multi-master) locks for consistency; cell-level access | US National Security Agency (NSA) |
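
To make the Cassandra data model entry more concrete, the following is a minimal in-memory sketch of a column family addressed by a partition key and a row key. It is a toy illustration only, not Cassandra's implementation or API: the class `ColumnFamily`, its methods, and the choice of sensor ID as partition key and ISO timestamp as row key are all assumptions made for this example.

```python
class ColumnFamily:
    """Toy column family: partition key -> row key -> {column: value}.

    Hypothetical illustration; names and layout are assumptions,
    not part of any real database client library.
    """

    def __init__(self):
        # Outer dict is keyed by partition key, inner dict by row key.
        self._partitions = {}

    def put(self, partition_key, row_key, column, value):
        """Write one column value into the row identified by both keys."""
        rows = self._partitions.setdefault(partition_key, {})
        rows.setdefault(row_key, {})[column] = value

    def get_row(self, partition_key, row_key):
        """Return a copy of one row, or an empty dict if it is absent."""
        rows = self._partitions.get(partition_key, {})
        return dict(rows.get(row_key, {}))

    def scan_partition(self, partition_key):
        """Return all rows of one partition, ordered by row key."""
        rows = self._partitions.get(partition_key, {})
        return [(row_key, dict(rows[row_key])) for row_key in sorted(rows)]


# Assumed usage: one partition per sensor, one row per reading timestamp.
readings = ColumnFamily()
readings.put("sensor-1", "2020-01-01T00:05:00", "temperature", 21.7)
readings.put("sensor-1", "2020-01-01T00:00:00", "temperature", 21.4)
readings.put("sensor-1", "2020-01-01T00:00:00", "humidity", 40)
readings.put("sensor-2", "2020-01-01T00:00:00", "temperature", 19.9)
```

In this sketch, grouping rows by partition key keeps all readings of one sensor together, and sorting by an ISO-formatted timestamp row key yields them in chronological order, which is the access pattern a time-ordered scan over sensor data would rely on.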