Redis — NoSQL Data Store

Overview

As mentioned in my previous post “Next Generation Data Storage”, NoSQL is not a silver bullet. It is about choosing the right tool for the task rather than relying on an all-purpose, generic RDBMS. The key is to select the right data-store for your needs and to develop a good understanding of the features it provides. In this post I will explore some of the wonderful features provided by the Redis data store and discuss some use cases where Redis can make your life easier.

Redis is an open-source, in-memory, persistent key/value data-store sponsored by VMware. It is also referred to as a data-structure store, because keys can hold different data-structures as their values. Redis supports master-slave replication, so the data from any Redis server can be replicated to any number of slaves. Redis is written in ANSI C and works on most POSIX systems such as Linux, Mac OS X, FreeBSD and Solaris. There is no official release for Windows, but Windows ports built with MinGW are available.

When to use Redis?

Although it depends a lot on the detailed requirements of your application, there are some scenarios in which you should consider using the Redis data store. Consider the following situations,

  • Your data is small enough to be contained in RAM
  • You need a super high-performance key/value data-store
  • You can afford to lose some recent updates in case of a system crash

If the answer to these is yes, there is a fair chance that Redis can make your life simpler. Because Redis is an in-memory store, it cannot hold more data than fits in your RAM. You can configure Redis to use virtual memory, but all the keys must still fit in RAM, and running Redis with the virtual memory option is not a great idea anyway. In return for this limitation, Redis provides super-fast performance, because the data is served directly from RAM.

Memory is the new disk. Disk is the new tape.
–Jim Gray

Redis provides two different approaches to persisting data, depending upon your use case. One is a semi-persistent durability mode achieved through snapshots, where the data in memory is asynchronously dumped to disk periodically. The other is a safer approach where each write command is appended to a log (the append-only file). So if your data can fit in memory and you can afford asynchronous writes, Redis can be a good choice.

The Redis Advantage

Now the most important part: how to take full advantage of Redis in your application. Before discussing some interesting ways to use Redis for common problems, I want to repeat two important points that I discussed before,

  1. It’s totally fair to use multiple data-stores in the same project, so you can use a NoSQL data-store for one portion of your project and a different data-store or an RDBMS for another portion of the same project.
  2. When designing your application’s data model using a NoSQL data-store, you have to think differently than you would with a conventional RDBMS.

The thing I like best about Redis is that you can offload a lot of responsibility to your data-store. Redis is a data-structure storage system, which is a lot more than just a dumb data-store. You don’t need to follow the conventional way of fetching the data from the database into your application and populating your data-structures before you process the data with your algorithms. In Redis you store the data in the form of different data-structures, and you can perform some interesting operations on that data directly. The following data-structures are supported by Redis,

  • Strings
    Support simple GET, SET, DEL operations.

    SET  key  value     /* Store the specified value for the given key */
    GET  key            /* Retrieve the value for the given key */
    DEL  key            /* Delete the given key */
  • Lists
    Support a variety of list operations using simple commands,

    LPUSH myList 1      /* Prepend value '1' to list named myList */
    LLEN  myList        /* Get the length of list */
    LPOP  myList        /* Remove and get first element of the list */
    RPOP  myList        /* Remove and get last element of the list */
    LTRIM myList 0  1    /* Keep only the elements at index 0 to 1, drop the rest */
    LRANGE myList 0  2   /* Get the values from index 0 to index 2 (inclusive) */
  • Sets
    Support many useful operations for SET manipulation,

    SADD set value /* Adds value to the SET represented by "set" */
    SREM set value /* Remove value from the specified SET */
    SMEMBERS set /* List all the members of specified SET */
    SISMEMBER set value /* Check if set contains the specified value */
    SUNION set1 set2 /* Take union of both sets */
    SINTER set1 set2 /* Take intersection of both sets */
    SDIFF set1 set2 /* Subtract the set2 members from set1 */
  • Sorted Sets
    Support many useful operations on sorted SET,

    ZADD set 10 value    /* Adds value to the SET with score 10 */
    ZRANGE set 0 2    /* Get values in sorted SET between index 0 and 2 */
    ZRANGEBYSCORE set 10 30   /* Get values in sorted SET with score 10 to 30 */ 
    ZINCRBY set 10 value    /* Increment score of value by 10 */

    Most retrieval commands also have a variant that operates on the sorted set in reverse order. These commands start with ZREV… (e.g. ZREVRANGE).

  • Hashes
    Support various functions for hash store manipulation,

    HSET key field value    /* Set value for field in hash store named "key" */
    HGET key field        /* Get value for field in hash store named "key" */
    HMSET key f1 v1 f2 v2 /* Set multiple fields of hash store represented by key */
    HMGET key f1 f2 f3    /* Get multiple fields (f1,f2 and f3) from "key" */
    HKEYS key    /* List all fields in hash store represented by key */
  • Atomic increments and Expiration
    Support useful features like Atomic increments and expirations,

    INCR key    /* Increment integer value of key by 1 atomically */
    DECR key    /* Decrement integer value of key by 1 atomically */
    EXPIRE key 5    /* Expire key automatically after 5 seconds */
    TTL key        /* Get time to expire for specified key */

Let’s consider a Web-2.0 solution with social networking features, like Facebook, and see how Redis compares to a conventional RDBMS.

Display recent messages

To start, we consider a very common scenario where you have to display the 10 most recent messages on a user’s homepage.

    /* Push the message id of each new message to the list */
    LPUSH user:msgs {msg-id}

    /* Get the latest 10 message ids (index 0 to 9, both inclusive) */
    LRANGE user:msgs 0 9

    /* Get the next 10 message ids (index 10 to 19) */
    LRANGE user:msgs 10 19

You can limit the maximum number of messages for a user to 500 by adding an LTRIM after every LPUSH, like,

    LPUSH user:msgs {msg-id}
    LTRIM user:msgs 0 499

This ensures that a user’s inbox holds only the latest 500 messages (LTRIM indexes are inclusive, so 0 to 499 keeps 500 elements).
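Redis list ranges include both end indexes, which makes off-by-one mistakes easy when paging through or capping a list. The following is only a small Python model of that behaviour, not Redis client code. The helper names `lrange` and `push_capped` and the sample message ids are made up for illustration.

```python
def lrange(items, start, stop):
    """Model of LRANGE for non-negative indexes: 'stop' is inclusive."""
    return items[start:stop + 1]


def push_capped(items, value, max_len):
    """Model of LPUSH followed by LTRIM 0 (max_len - 1)."""
    items.insert(0, value)   # LPUSH: the newest element goes to the head
    del items[max_len:]      # LTRIM: keep only the first max_len elements


inbox = []
for msg_id in range(1, 13):  # twelve hypothetical message ids, 1..12
    push_capped(inbox, msg_id, max_len=10)

first_page = lrange(inbox, 0, 4)  # the five newest ids, newest first
```

In this model a range of 0 to 4 yields five elements, in the same way that a Redis range of 0 to 9 yields ten.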

Display Counters

Let’s say you want to display counters on your website, such as votes, page hits or file downloads. You can use atomic increment operations whenever a page is accessed or a file download is requested, like,

    INCR counters:downloads:{file-id}
    INCR counters:downloads:{today-date}:{file-id}
    INCR counters:hits:{page-id}

If your counter already exists, it will be incremented; otherwise a new counter will be created and incremented. Please note that the 2nd statement in the above code maintains a separate file download counter for each day. To display these counters in your application you can just get the value of any counter, like,

    GET counters:downloads:{file-id}
    GET counters:downloads:{today-date}:{file-id}
    GET counters:hits:{page-id}
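The key naming scheme for total and per-day counters can be sketched in plain Python. This is only a model of the keyspace using `collections.Counter`, not Redis client code; the function name `record_download`, the file id and the dates are hypothetical. One difference to keep in mind is that GET on a counter that was never incremented returns nil in Redis, while this model returns 0.

```python
from collections import Counter
from datetime import date

counters = Counter()  # stand-in for the Redis keyspace; missing keys read as 0


def record_download(file_id, day):
    """Model of the two INCR calls issued for one file download."""
    counters[f"counters:downloads:{file_id}"] += 1
    counters[f"counters:downloads:{day.isoformat()}:{file_id}"] += 1


# Two downloads of the same (hypothetical) file on consecutive days.
record_download("report.pdf", date(2011, 5, 1))
record_download("report.pdf", date(2011, 5, 2))

total = counters["counters:downloads:report.pdf"]
on_may_2 = counters["counters:downloads:2011-05-02:report.pdf"]
```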

Connections and Memberships

Let’s see what happens when a user adds a connection (adds a friend) or joins a group in your social network application. You can use SETS to manage this scenario by creating a SET for each user’s connections (e.g: “users:{user-id}:connections”) and a SET for each group’s members (e.g: “groups:{group-id}:members”). Whenever a user adds a connection or joins a group, you add the user-id to the corresponding SET, like,

    SADD users:{user-id}:connections   {user-id-of-connection}
    SADD groups:{group-id}:members  {user-id}

Now when a user visits a group page, you need the following information,

  • List of group members
  • Is user a member of this group?
  • All the connections of user who also joined this group

You can get this information by simple operations like,

    SMEMBERS groups:{group-id}:members
    SISMEMBER groups:{group-id}:members  {user-id}
    SINTER users:{user-id}:connections   groups:{group-id}:members

In the above code example, the 1st line uses the “SMEMBERS” command to list all the members of the SET, and the 2nd line uses the “SISMEMBER” command to check whether the user-id exists in the given SET of group members. The 3rd line takes the intersection of two sets (i.e. the group’s members and the user’s connections) to find the values that exist in both (users who are among the current user’s connections and are also group members).

Now let’s say we import new contacts from an email address book and create a new SET (e.g: users:{user-id}:contacts) for all these contacts. Assuming that we are using user emails as the user-id in the SETS, here are a few more operations.

Display contacts that are not added as connections,

    /* Take Difference of two SETS */
    SDIFF users:{user-id}:contacts   users:{user-id}:connections

Display everyone including email contacts and connections in this social network,

    /* Take Union of two SETS */
    SUNION users:{user-id}:contacts  users:{user-id}:connections
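The set commands used in this section map directly onto ordinary set algebra. As a rough illustration, here is the same logic with Python's built-in `set` type. This is a model, not Redis client code, and the email addresses are sample data invented for the example.

```python
# Each Python set stands in for one Redis SET; the key it models is in the comment.
connections = {"ken@ttl.com", "roy@ttl.com"}        # users:{user-id}:connections
group_members = {"roy@ttl.com", "blanka@ttl.com"}   # groups:{group-id}:members
contacts = {"roy@ttl.com", "guile@ttl.com"}         # users:{user-id}:contacts

is_member = "blanka@ttl.com" in group_members        # SISMEMBER
friends_in_group = connections & group_members       # SINTER
not_yet_connected = contacts - connections           # SDIFF (order matters)
everyone = contacts | connections                    # SUNION
```

As with SDIFF, the difference is not symmetric: `contacts - connections` lists contacts that are not yet connections, while the reverse would list connections that are missing from the address book.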

Currently Nearby connections

Let’s extend the previously created SETS to display the connections or friends that may be nearby (a feature like Google check-ins or Facebook Places). For this purpose we will maintain some new SETS for registered places, and we will display the connections that checked in at a place near the user within the last 5 to 10 minutes. Now consider the following example.

Whenever a user checks in at some place, we add that user to the fresh SET of that place,

    /* Roy checks in to Fight Club */
    SADD checkIn:FightClub:fresh  roy@ttl.com

    /* Ken checks in to Fight Club */
    SADD checkIn:FightClub:fresh  ken@ttl.com

Every 5 minutes we rename the fresh SET to the stale SET to purge old check-ins, and we store the union of the fresh and stale check-ins in a new SET “checkIn:FightClub”. Please note that renaming the fresh SET to the stale SET overwrites the existing contents of the stale SET, and that RENAME returns an error if the fresh SET does not exist (i.e. nobody checked in during the interval), so that case needs to be handled separately.

    RENAME checkIn:FightClub:fresh checkIn:FightClub:stale
    SUNIONSTORE checkIn:FightClub  checkIn:FightClub:fresh checkIn:FightClub:stale

Now when a new user checks in to the same place (i.e: FightClub), that user can check whether some connection or friend is nearby,

    /* Now Blanka enters the Fight Club and checks if someone is nearby */
    SINTER users:blanka@ttl.com:connections  checkIn:FightClub

The above statement returns those of Blanka’s network connections that are also currently checked in at Fight Club. And because we refresh the contents of Fight Club by rotating the fresh and stale sets, a user is automatically removed from Fight Club 5 to 10 minutes after checking in.
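To see why a check-in disappears after one or two rotations, the fresh/stale rotation can be simulated in plain Python. This is a sketch under assumptions, not Redis client code: the `store` dictionary stands in for the keyspace, the function names are made up, and a missing fresh set is simply treated as empty (the real RENAME command reports an error in that case).

```python
store = {}  # key -> set of user ids; a stand-in for the Redis keyspace


def check_in(place, user):
    # SADD checkIn:{place}:fresh {user}
    store.setdefault(f"checkIn:{place}:fresh", set()).add(user)


def rotate(place):
    fresh = f"checkIn:{place}:fresh"
    stale = f"checkIn:{place}:stale"
    # RENAME fresh stale: the previous stale set is overwritten.
    # Assumption: a missing fresh set is treated as empty in this model.
    store[stale] = store.pop(fresh, set())
    # SUNIONSTORE checkIn:{place} fresh stale
    store[f"checkIn:{place}"] = store.get(fresh, set()) | store[stale]


def nearby(place, connections):
    # SINTER users:{user-id}:connections checkIn:{place}
    return connections & store.get(f"checkIn:{place}", set())


blanka_connections = {"roy@ttl.com", "guile@ttl.com"}  # sample data

check_in("FightClub", "roy@ttl.com")
check_in("FightClub", "ken@ttl.com")
rotate("FightClub")   # first 5-minute tick
after_first_tick = nearby("FightClub", blanka_connections)

rotate("FightClub")   # second tick, nobody new checked in
after_second_tick = nearby("FightClub", blanka_connections)
```

In this model Roy shows up for Blanka after the first rotation and is gone after the second one, which matches the 5 to 10 minute window described above.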

Leaderboard Management

Another quick use case is a game application in your social network. When a lot of people are playing the game and registering their scores, you may need to handle hundreds of add and update requests for user scores. For cases like that you can use Sorted Sets in Redis for a high-performance implementation. Consider the following simple operations,

    /* User A got 10 points */
    ZINCRBY scores 10 A
    /* User B got 15 points */
    ZINCRBY scores 15 B
    /* User A got another 10 points */
    ZINCRBY scores 10 A

    /* Display top 10 scores */
    ZREVRANGE scores 0 9 WITHSCORES

    /* Display the rank of User A (0-based, highest score first) */
    ZREVRANK scores A

    /* Display the score of User A */
    ZSCORE scores A
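One detail worth spelling out is the direction of the rank: ZRANK counts from the lowest score, while ZREVRANK counts from the highest, which is what a leaderboard usually wants. Both are 0-based. The following Python model sketches the semantics with a plain dictionary; it is not Redis client code, and the function names simply mirror the commands they imitate.

```python
scores = {}  # member -> score; a stand-in for the "scores" sorted set


def zincrby(member, amount):
    """Model of ZINCRBY: create the member if needed, then add to its score."""
    scores[member] = scores.get(member, 0) + amount
    return scores[member]


def zrevrange_withscores(start, stop):
    """Model of ZREVRANGE ... WITHSCORES: highest score first, inclusive stop."""
    ordered = sorted(scores.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
    return ordered[start:stop + 1]


def zrevrank(member):
    """Model of ZREVRANK: 0-based position counted from the highest score."""
    ordered = [m for m, _ in zrevrange_withscores(0, len(scores) - 1)]
    return ordered.index(member)


zincrby("A", 10)
zincrby("B", 15)
zincrby("A", 10)

top_ten = zrevrange_withscores(0, 9)
rank_of_a = zrevrank("A")
```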

Publish/Subscribe Pattern

Redis also provides a high-performance implementation of the Publish/Subscribe pattern, which you can access using very simple commands like SUBSCRIBE, UNSUBSCRIBE and PUBLISH.

The Publish/Subscribe implementation also provides pattern-matching subscriptions. You can subscribe to a pattern using “PSUBSCRIBE log.*”. You will then receive every message published to a channel whose name matches the “log.*” pattern (e.g: messages published to the “log.Info” or “log.error” channels).
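Pattern subscriptions use glob-style matching on the channel name. As a rough model, Python's `fnmatch.fnmatchcase` can stand in for that matching. This is an approximation, since the two glob dialects are not identical in every detail, and the `inboxes` dictionary, the channel names other than the log ones, and the messages are invented for the example.

```python
from fnmatch import fnmatchcase

# pattern -> messages received by the (hypothetical) subscriber of that pattern
inboxes = {"log.*": [], "news.*": []}


def publish(channel, message):
    """Deliver to every matching pattern and return the number of receivers."""
    receivers = 0
    for pattern, inbox in inboxes.items():
        if fnmatchcase(channel, pattern):
            inbox.append((channel, message))
            receivers += 1
    return receivers


delivered_error = publish("log.error", "disk almost full")
delivered_other = publish("audit.login", "user signed in")
```

A message published to a channel that nobody is subscribed to is simply not delivered to anyone, as with `audit.login` in this model.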

You can do a lot more with Redis. For example, there are blocking variants of the list POP commands that block while the list is empty, so you can use Redis Lists as queues. Similarly, you can use Redis Sorted Sets to implement priority queues.

Footnotes:

  1. Redis: http://redis.io/