Not Only SQL (NoSQL)
Databases - Riak

سید احسان مختاری

آذرماه ۹۴

SQL is Good

  • Relational Databases Management Systems (RDMBSs) – mainstay of business
    SQL is good
  • Rich language
  • Easy to use
  • Rich tool set
  • Promise: ACID


  • Atomicity: all included statements in a transaction are either executed or  the whole transaction is aborted without affecting the database.
  • Consistency: a database is in a consistent state before and after a  transaction. 
  • Isolation: transactions can not see uncommitted changes in the database.
  • Durability: changes are written to a disk before a database commits a  transaction so that committed data cannot be lost through a power failure.

SQL Challenges

    Web­-based applications caused spikes.
  • Internet­-scale data size
  • High read­-write rates
  • Large data
    RDBMS were not designed to be distributed.

What is NoSQL

  • Class of non­-relational data storage systems.
  • All NoSQL offerings relax one or more of the ACID properties.
  • Social applications are not banks and they don't need the same level of ACID.

NoSQL History

  • It was first used in 1998 by Carlo Strozzi to name his relational  database that did not expose the standard SQL interface.
  • The term was picked up again in 2009 when a develper, Johan  Oskarsson, wanted to organize an event to discuss open­source  distributed databases.
  • The name attempted to label the emergence of a growing number of  non­relational, distributed data stores that often did not attempt to  provide ACID

Categories of NoSQL Databases

  • Key/Value stores
    • Dynamo, Scalaris, Berkeley DB, ...
  • Column­-oriented databases
    • BigTable, Hbase, Cassandra, ...
  • Document databases
    • MongoDB, Terrastore, Riak, ...



  • Strong consistency
    • Single storage image. Informally, after an update completes,
       any subsequent  access will return the updated value.


  • Eventual consistency
    • The system does not guarantee that subsequent accesses will return the 
      updated  value.
    • Inconsistency window
    • If no new updates are made to the object, eventually all accesses
       will return the last updated value.

Quorum Model

  • Quorum: the minimum number of members of an assembly or society that must be present at any of its meetings to make the proceedings of that meeting valid.
  • N, R and W
    • user can choose some number W-of-N nodes required for a write to succeed
    • user can specify the number of nodes (R-of-N) to be contacted during a read.
  • To enforce strong consistency: R + W > N
  • N is rarely more than 3: keeping that amount of copy of data gets expensive!
    • Basho's Riak (N = 3, R = 2, W = 2 default)
    • Linkedin's Voldemort (N = 2 or 3, R = 1, W = 1 default)
    • Apache's Cassandra (N = 3, R = 1, W = 1 default)

BASE Properties

  • Basically Available: possibilities of faults but not a fault of the whole system.
  • Soft state: copies of a data item may be inconsistent.
  • Eventually consistent: copies becomes consistent at some later time if
     there are no more updates to that data item.

CAP Theorem

  • states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
    • Consistency: All clients always have the same view of the data.
    • Availability: Each client can always read and write.
    • Partition Tolerance: The system works well despite physical network partition.
* A network partition refers to the failure of a network device that causes a network to be split.

CAP Theorem


  • Riak is a distributed, scalable, open source key/value store
  • originally developed by Basho Technologies to run a Salesforce automation business
  • heavily influenced by Dr. Eric Brewer’s CAP Theorem and Amazon’s Dynamo.
  • The Riak APIs
    • Storage operations use HTTP PUTs or POSTs and fetches use HTTP GETs
    • fully-featured Protocol Buffers API: a simple binary protocol based on the library Google’s open source project of the same name.
  • Client Libraries
    • Ruby, Java, Erlang, Python, PHP, and C/C++

Buckets, Keys, and Values

  • the only way to organize data inside of Riak.
  • Data is stored and referenced by bucket/key pairs.
  • Each key is attached to a unique value that can be any data type.


  • Central to any Riak cluster is a 160-bit integer space which is divided into equally-sized partitions.
  • Physical servers, referred to in the cluster as “nodes”, run a certain number of virtual nodes, or “vnodes”.
  • As a rule, each node in the cluster is responsible for 1/(total number of physical nodes) of the ring
  • #vnodes on each node = (number of partitions)/(number of nodes)
  • Nodes can be added and removed from the cluster dynamically and Riak will redistribute the data accordingly.



  • Riak controls how many replicas of a datum are stored via a setting called “N value”
  • All nodes in the same cluster should agree on and use the same N value.
  • Riak uses a technique called ‘hinted handoff’ to compensate for failed nodes in a cluster: self-healing

Reading, Writing, and Updating Data

  • Using the Riak APIs, Riak objects can be fetched directly if the client knows the bucket and key. This is the fastest way to get data out of Riak.
  • R Value: the number of Riak nodes which must return results for a read before the read is considered successful.
  • Read Failure Tolerance
    • R - N will tell you the number of down or laggy nodes a Riak cluster can tolerate before becoming unavailable for reads.
    • For example, an 8 node cluster with an N of 8 and a R of 1 will be able to tolerate up to 7 nodes being down before becoming unavailable for reads.

Reading, Writing, and Updating Data

  • Vector Clocks: Each update to a Riak object is tracked by a vector clock.
    • Vector clocks allow Riak to determine causal ordering and detect conflicts in a distributed system.
  • Conflict Resolution:
    • The last update automatically “win”
    • Return both versions of the object to the client: client itself resolve the conflict

Map Reduce

  • Allows to process data in real-time in parallel
    • MapReduce jobs are described in JSON using a set of nested hashes describing the inputs, phases , and timeout for a job.
      • A job can consist of an arbitrary number of Map and Reduce phases.
      • MapReduce in Riak can be thought of as a real-time “mini-Hadoop”
    • A job is submitted via HTTP and the results are returned in JSON-encoded form.

What NOT Covered

  • Secondary Indexes
  • Links and Link Walking
  • Riak Search
  • Commit Hooks
  • ...


Special thanks to: Amir H.Payberah


  • [Presentation] NoSQL Databases - Amir H.Payberah, April 2012
  • [Web]
  • [Web]
  • [Book] Seven Databases in Seven Weeks - A Guide to Modern Databases and the NoSQL Movement; Eric Redmond and Jim R. Wilson