Cache Money: Why Utilize Caching? »

Created at: 07.10.2011 01:41, source: Engine Yard Blog, tagged: Technology caching http caching memcached redis

Caching is extremely useful for web applications. While most web applications can benefit from caching, there are times when it is unnecessary and becomes a time sink for developers. When is it a good idea to use caching? When an application is getting a lot of requests and New Relic detects a strain on your instances, it's probably time to look into caching.

There are a few different types of caching and some good resources to help decide which type is best for your application. We will look at memory caches using Memcached or Redis, and HTTP caches such as Varnish and Rack::Cache. If you are using Rails you can easily use its built-in caching. Check out the Rails Guide Caching with Rails for an overview.

Redis

Redis is a key-value store. With Redis, your data is held in memory and will be persisted to disk if necessary. This allows it to be useful for caching purposes. Redis is used by companies such as GitHub, craigslist, and here at Engine Yard.

Redis Slide

Image courtesy of Redis 101 from Peter Cooper

Let us look at how Redis can be useful for a web application.

Most web application requests return a variety of different lists such as posts, comments, followers, etc. The majority of key-value stores store these lists in single units (or a "blob"). As a result, most typical list related operations, such as adding an element, are inefficient. Fortunately, Redis has native list support which allows it to perform operations on lists very efficiently.

Counter caching in Rails allows you to accelerate performance by reducing the number of SQL queries and preventing unnecessary instantiation of objects, but the Rails implementation using generic SQL features does not scale well. Engine Yard customer MUBI uses Redis to replace Rails default counter caching with speedy Redis counters. Redis allows counter caches to be implemented with extremely fast, non-blocking atomic operations.

Let's explore the benefits of the counter caching using the example that Post has_many :comments and Comment belongs_to :post

>> Post.last.comments.size

This results in the following three SQL queries:

Post Load (0.4ms) SELECT * FROM `posts` LIMIT 1
Post Columns (2.9ms) SHOW FIELDS FROM `posts`
SQL (0.3ms) SELECT count(*) AS count_all FROM `comments` WHERE post_id = 1

Caching allows us to instead use a single query by adding the following relationship in comment.rb:

belongs_to :post, :counter_cache => true

We also have to update the Post table with a new attribute:

add_column :posts, :comments_count, :integer

Now there is just one trip to the database to fetch the comment count:

Post Load (0.4ms) SELECT * FROM `posts` ORDER BY posts.id DESC LIMIT 1

Why stop there? We can also utilize Redis for counter caching. When we do a count query in SQL, we can write the result to a Redis key. For example:

SELECT count(*) AS count_all FROM `comments` WHERE post_id = 1

is also written to Redis:

redis = Redis.new
redis.incr "post:1234:comments:count"

When pulling post records from the database we can test for the existence of relevant Redis keys for whatever counts we need. If they exist then we're done. If they do not exist, then they are looked up in MySQL and pushed into Redis, ready for next time. Also, since we are utilizing the redis incr command we avoid having to do a SQL query, except to initialize it, and since it is an atomic operation we can guarantee that the count always represents the exact number of times it was called without any race conditions.
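The lookup-then-seed flow described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the approach MUBI actually uses: `CommentCounter`, `comment_added`, and the `sql_count` callable are hypothetical names, and `FakeRedis` is an in-memory stand-in that mimics only GET, SET and INCR so the example runs without a server.

```ruby
# Hypothetical in-memory stand-in for a Redis client, so the sketch runs
# without a server. It mimics only GET, SET and INCR.
class FakeRedis
  def initialize
    @data = {}
  end

  def get(key)
    @data[key]
  end

  def set(key, value)
    @data[key] = value.to_s
    "OK"
  end

  def incr(key)
    @data[key] = ((@data[key] || "0").to_i + 1).to_s
    @data[key].to_i
  end
end

# Hypothetical helper wrapping the read-through and increment logic.
class CommentCounter
  # `redis` is any client responding to get/set/incr;
  # `sql_count` is a callable standing in for the SELECT count(*) query.
  def initialize(redis, sql_count)
    @redis = redis
    @sql_count = sql_count
  end

  def key_for(post_id)
    "post:#{post_id}:comments:count"
  end

  # Return the cached count, seeding it from SQL on a miss.
  def count(post_id)
    cached = @redis.get(key_for(post_id))
    return cached.to_i if cached

    fresh = @sql_count.call(post_id)
    @redis.set(key_for(post_id), fresh)
    fresh
  end

  # Call after a comment row has been saved.
  # If the key exists we only INCR; otherwise we seed from SQL, whose
  # count already includes the new comment. The existence check and the
  # INCR are two separate commands, so this sketch is not race-free.
  def comment_added(post_id)
    if @redis.get(key_for(post_id))
      @redis.incr(key_for(post_id))
    else
      count(post_id)
    end
  end
end
```

With a real server you would pass `Redis.new` in place of `FakeRedis.new` and give `sql_count` a lambda that runs the count query. INCR on a missing key starts from zero, which is why the sketch seeds the key from SQL before it ever increments.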

Some people have called Redis "Memcached on steroids." However, that does not mean Memcached should be disregarded. There have been a lot of benchmarks comparing Redis and Memcached, and a lot of debate about the accuracy of those benchmarks. What it really comes down to is which one you believe is best suited for your application.

Memcached

One difference to consider is that Memcached does Least-Recently-Used (LRU) eviction of values from the cache. Redis, by default, only removes data when it is explicitly deleted or expires, and it will store as much data as you put into it. As of Redis 2.2 you can configure Redis with the maxmemory option, instead of setting expires, to get an LRU cache, but it is an option that you have to enable and is not the default. Memcached is used by companies like Twitter, Reddit and Zynga. A final thing to consider is that Memcached has been integrated into Rails since Rails 2.1, making it even easier to use.

Memcached keeps its values in RAM, so it is a transitory cache. Keep in mind that it discards the least recently used values when it runs out of room, so you cannot assume that data stored in Memcached will still be there when you need it. As stated earlier, it's very important to make sure it's right for your application, because Memcached is slower than a SELECT on localhost for small sites; you should ensure you can keep up with the requests or it won't help you to use it.
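Because a cached value can vanish at any time, callers usually wrap reads in a "fetch or recompute" helper. The following is a rough sketch of that pattern, assuming a hypothetical `TinyLruStore` that stands in for Memcached (it is not the Memcached client API) and a hypothetical `fetch` helper; it relies on Ruby hashes preserving insertion order (Ruby 1.9 and later).

```ruby
# Hypothetical, tiny LRU store standing in for Memcached so the sketch runs
# on its own. Hash insertion order is used to track recency.
class TinyLruStore
  def initialize(max_entries)
    @max_entries = max_entries
    @data = {}
  end

  def read(key)
    return nil unless @data.key?(key)
    value = @data.delete(key)   # re-insert to mark as most recently used
    @data[key] = value
  end

  def write(key, value)
    @data.delete(key)
    @data[key] = value
    # Evict the least recently used entry once we are over capacity.
    @data.delete(@data.keys.first) if @data.size > @max_entries
    value
  end
end

# Read-through helper: never assume the key is still cached.
# The block is only run on a miss, and its result is written back.
def fetch(store, key)
  cached = store.read(key)
  return cached unless cached.nil?
  store.write(key, yield)
end
```

Rails exposes the same idea as `Rails.cache.fetch(key) { ... }`, so application code rarely needs to hand-roll it; the sketch only makes the eviction behavior visible.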


A good use for Memcached is action caching. Action caching is a lot like page caching, but the flow is slightly different: with action caching, the incoming web request goes from the web server to the Rails stack, so before filters can run before the cached content is served. One issue with page caching, which the Rails Guides cover, is that you cannot use it for pages that need to restrict access in some way.

Example: If you want to only allow authenticated users to edit or create a Product object, but still cache those pages:

class ProductsController < ApplicationController
  before_filter :authenticate, :only => [ :edit, :create ]
  caches_page :list
  caches_action :edit, :expires_in => 1.hours

  def list; end

  def create
    expire_page :action => :list
    expire_action :action => :edit
  end

  def edit; end

end

Also, do not forget to set up your configuration for Memcached. There is a good amount of information in an older post that discusses when Rails 2.1 got better integrated caching.

# config/environments/production.rb

config.cache_store = :mem_cache_store, 'localhost:11211'

memcache_options = {
  :c_threshold => 10_000,
  :compression => false,
  :debug => false,
  :namespace => "app-#{RAILS_ENV}",
  :readonly => false,
  :urlencode => false
}

CACHE = MemCache.new memcache_options

 

HTTP Caching

HTTP caching is another form of caching you can utilize. If you are not familiar with HTTP caching, this blog post offers a nice overview. Two useful projects worth checking out are Varnish and Rack::Cache. Now you might be asking, "Why do people use them?" One reason is to reduce latency: a request satisfied from the cache takes less time, making the Web seem more responsive. For example, if you have dynamically generated content, using an HTTP cache like Varnish will result in better performance than using Memcached, because with an HTTP cache your application server is not accessed at all for cache hits.

Varnish provides a default setup, found in default.vcl, that will work for most applications. However, you can go in and customize it, which is recommended since Varnish assumes things that might not be correct about your application. The only work you have to do is ensure your resources have appropriate HTTP caching parameters (Expires/max-age and ETag/Last-Modified). Do not forget to normalize the hostname to avoid caching the same resource multiple times. Also remember that Varnish was meant to run on 64-bit machines; it will work on 32-bit, but you will definitely run into some issues. Other recommendations I would make are to keep your VCL simple and to tune only when you really need to, using Varnish tips and best practices that others have found useful.
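To make "appropriate HTTP caching parameters" concrete, here is a small sketch of a Rack-style application that sets max-age and an ETag and answers conditional requests. It is only an illustration: the constant name `CACHEABLE_APP`, the body text, and the 300-second lifetime are assumptions, and a real application would usually set these headers through its framework.

```ruby
require 'digest/md5'

# A bare Rack-style app: any object responding to call(env) that returns
# [status, headers, body]. No Rack gem is needed to define it.
CACHEABLE_APP = lambda do |env|
  body = "Hello from the backend"
  # Strong validator derived from the response body.
  etag = %("#{Digest::MD5.hexdigest(body)}")

  cache_headers = {
    "Cache-Control" => "public, max-age=300", # caches may reuse for 5 minutes
    "ETag"          => etag
  }

  # Conditional GET: if the client's validator matches, skip the body.
  if env["HTTP_IF_NONE_MATCH"] == etag
    [304, cache_headers, []]
  else
    [200, cache_headers.merge("Content-Type" => "text/plain"), [body]]
  end
end
```

Placed in a config.ru with `run CACHEABLE_APP`, a response like this gives an HTTP cache in front of the application (Varnish, or Rack::Cache added with `use Rack::Cache`) what it needs to serve repeat requests without touching the backend until the entry goes stale.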

Another useful way to utilize HTTP caching is with Ryan Tomayko's Rack::Cache. A key aspect of Rack::Cache is the middleware piece that sits in the front of each backend process and does not require any infrastructure investment of a separate daemon process like Varnish.

Varnish has plenty of great examples and real world use cases on their site that can help you truly understand the usefulness it provides. Check them out to get a feel for how you can utilize it for your application. Also, if you want to test out Varnish on AppCloud take a look at the Chef recipe we have for it.

Now, it's up to you to decide whether all or some of these types of caching will be beneficial for you. Make sure to utilize them to the fullest potential to ensure you and your customers have the most enjoyable experience possible. If you have any caching experiences or gotchas, please share them in the comments.

Resources:

Redis 101 presentation
A Collection of Redis Use Cases
To Redis or Not To Redis?
Memcached Basics for Rails
Caching, Memcached and Rails
Scaling Rails with Memcached
Things Caches Do
Caching Tutorial
You're Doing It Wrong



Redis-Flex: An ActionScript Library to integrate with Redis »

Created at: 13.06.2011 23:40, source: OnRails.org, tagged: Flex redis

Announcing redis-flex, an ActionScript library for integrating with Redis.

A while back I looked into accessing Redis directly from Flex and found an existing library, as3redis, which however didn’t support the new unified request protocol. So I wrote a minimalist wrapper that lets you send commands to a Redis server.

To access the Redis server from Flex just instantiate a Redis instance:

    <redis:Redis id="server"
                 connected="server_connectedHandler(event)"
                 result="server_resultHandler(event)" />

Then you can send commands:

    server.send("SET A 123");
    server.send("GET A");
    server.send(["rpush", "messages", "message one"]);
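For context on what a client has to put on the wire, the unified request protocol frames every command as an argument count followed by length-prefixed arguments, which is why the array form above can carry an argument containing spaces. The sketch below shows that framing in Ruby rather than ActionScript; `encode_redis_command` is a hypothetical helper for illustration, not part of redis-flex.

```ruby
# Encode a command in the Redis unified request protocol:
#   *<number of arguments>\r\n
#   $<byte length of argument>\r\n<argument>\r\n   (repeated per argument)
def encode_redis_command(*args)
  parts = args.map(&:to_s)
  out = "*#{parts.size}\r\n"
  parts.each { |arg| out << "$#{arg.bytesize}\r\n#{arg}\r\n" }
  out
end

# The three commands from the example above, as they would be framed.
set_frame   = encode_redis_command("SET", "A", 123)
get_frame   = encode_redis_command("GET", "A")
rpush_frame = encode_redis_command("rpush", "messages", "message one")
```

Because each argument carries its own byte length, values may contain spaces or even binary data without any quoting.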

Note that it’s not a good idea to connect a Flex application directly to Redis. Redis is usually used behind an application server that protects access to it, in the same way that Flex doesn’t connect directly to a database. However, there may be cases where this could be useful.

Enjoy!

Daniel



Redis key-value store »

Created at: 18.03.2011 21:55, source: Ruby Rockers, tagged: Technology gem redis ruby store

Redis is a really cool, lightweight key-value store. If you are looking for something in which you can store strings, hashes, lists and sets, Redis is the best.

If you are a Ruby developer, you must try it out with the redis-rb gem. It is very easy to configure and very easy to store things with.

The following shows how to use Redis the Ruby way.

To install

gem install redis

To load

require 'redis'

Before performing any operation, you need to install Redis and start the Redis server.

After that you can do the following:

redis = Redis.new # Automatically connect with the default port.

# If you have changed the port, pass it to the initializer, e.g. Redis.new(:port => 6380)

>> redis.set "foo", "bar"
=> "OK"

>> redis.get "foo"
=> "bar"

Storing objects

>> require 'json'

>> redis.set "foo", [1, 2, 3].to_json
=> "OK"

>> JSON.parse(redis.get("foo"))
=> [1, 2, 3]

There are lots more things you can do with Redis.

Links to help you with Redis

Redis official documentation (http://redis.io)
Github redis repository (https://github.com/antirez/redis)
Github redis rubygem repository (https://github.com/ezmobius/redis-rb)



Pragmatic Polyglot Persistence with Rails »

Created at: 23.08.2010 12:50, source: Engine Yard Blog, tagged: Technology rails redis

This post comes from guest community contributor Kent Fenwick. Kent is the tech co-founder of Viewpointr, a personalized Q&A service that aims to provide an easy way to get and give help. When he isn't programming, he spends time with his family and friends in Toronto. Kent writes here and can be followed on Twitter at @kentf.

It's getting more and more difficult to pick a persistence layer for your web application. When I started in Rails four years ago, there was really only one option: MySQL. Now there are many more, each with its own pros and cons. Some are new and some are old; some are tested, and others not so much. What's clear is that when you are building a business around data, you want to make good decisions. That being said, often only the future will tell if you've made the right ones. I want to share with you my persistence story about how I ended up getting the best of both worlds.

The Problem

There are too many choices and each choice has a loud evangelist of its own. When designing Viewpointr I went back and forth daily between MongoDB, MySQL, PostgreSQL and Cassandra. Viewpointr is essentially Twitter with a focus on helping people. Therefore, we have some common data elements: a user-specific timeline, a user-specific list of people who they are helping, and a user-specific list of people helping them. Because I am ambitious, I would find myself asking questions like:

"Hmm... but will MySQL scale to 1,000,000 records?"

Looking back on these internal conversations I find them funny; programmers always tend to think big. However, these are real concerns that developers and teams think about. While planning I would constantly consult the blogosphere for help, and to see what others were doing. Kirk Haines of Engine Yard wrote a great series of NoSQL posts highlighting and comparing different key-value stores and explaining their pros and cons. Since then, there has been a flurry of articles each week outlining different NoSQL datastores, NoSQL vs. MySQL debates, flamewars, etc.

The Opportunity

Data is not created equal and this is a good thing. Just as we do not use an array for every "list" type problem when programming, because sometimes hashes or linked lists better suit the needs of the problem, we need to start thinking about data the same way. This was the best decision we made at Viewpointr and it allowed us to move forward at a great pace. I looked at our application and broke it down into components. Viewpointr has many typical CRUD features similar to all Rails apps. These are very well suited to MySQL and a relational database. Being able to pull a list of answers based on a given question using simple and optimized SQL that I understand is a big win.

However, there are some things that it doesn't model well. Friendships, for one. The simplest way to model friendship using a relational database is to create a relation that refers to the same table with two different names. Let's say you have a users table and you want to model Twitter-like friendship where User:1 can befriend User:2 without User:2's permission. It's easy enough.

class Friend < ActiveRecord::Base

 belongs_to :user
 belongs_to :contact, :class_name => "User", :foreign_key => "contact_id"

 # user befriends contact
 def self.befriend(user,contact)
    relationship = find_by_user_id_and_contact_id(user.id,contact.id)
    if relationship.nil?
      transaction do
        Friend.create(:user => user, :contact => contact)
      end
    end
 end

end

class User < ActiveRecord::Base

  has_many :friends, :dependent => :destroy
  has_many :contacts, :through => :friends, :order => "created_at DESC", :dependent => :destroy

end

However, I have always felt that it's clumsy. What I really want to say is: "Each user has a list of IDs that represent the people that they are friends with." Sounds like a de-normalized list, right?

The Solution

Enter Redis. Redis is a key-value store similar to memcached but more flexible, since lists, sets, sorted sets and strings can all be used as values. Thanks to its simple API, the problem I described is essentially an atomic operation in Redis. Redis has a great "set" implementation and allows you to do all of the things you would imagine a set to do: addition, subtraction, unique insertion, deletion, union, intersection, etc. The operation will ultimately look like this:

SET = Redis.new
SET.set_add key, value

However, since we are working inside a Rails app, we need to make sure we have the right plumbing set up.

1. Create a redis.rb in your initializers folder.
2. Create a new Redis database for each of your needs.

In our case, we want to have a dataset that keeps track of a User's helpers (other users who are helping them) and a list of a User's friends (other users that the user is helping). Since we are going to be using these Redis objects throughout the codebase, I like to declare them as global constants in the redis.rb initializer file.

HELPERS = Redis.new(:db => 0)
HELPING = Redis.new(:db => 1)

Notice that I pass in the :db key so that HELPERS and HELPING point at two different Redis databases. You can use the redis-namespace gem if you want, but I find the default syntax from the redis-rb gem works well enough for my purposes. Now that we have these global Redis objects at our disposal throughout the application, we can start using them in our Friend.befriend method.

class Friend < ActiveRecord::Base

 belongs_to :user
 belongs_to :contact, :class_name => "User", :foreign_key => "contact_id"

 # user befriends contact
 def self.befriend(user,contact)
    begin
     HELPERS.set_add contact.id, user.id
     HELPING.set_add user.id, contact.id
    rescue
     RedisLogger.info "Redis Exception"
    end
 end

end

class User < ActiveRecord::Base

  has_many :friends, :dependent => :destroy
  has_many :contacts, :through => :friends, :order => "created_at DESC", :dependent => :destroy

end

However, this isn't the best solution right out of the gate. Using a NoSQL datastore has some drawbacks that aren't apparent in development mode but reveal their ugly face in production. If you are not careful, a simple restart of your Redis server can cause you to lose all your data. Managing your Redis data in production deserves its own post (coming soon), but for now, let's create a safer solution that you can gradually roll out as you become more comfortable with storing, backing up and using Redis datafiles.

class Friend < ActiveRecord::Base

 belongs_to :user
 belongs_to :contact, :class_name => "User", :foreign_key => "contact_id"

 # user befriends contact
 def self.befriend(user,contact)
    relationship = find_by_user_id_and_contact_id(user.id,contact.id)
    if relationship.nil?
      transaction do
        Friend.create(:user => user, :contact => contact)
      end
      add_to_denormalized_list(user,contact)
    end
 end

  def self.add_to_denormalized_list(user,contact)
    begin
     HELPERS.set_add contact.id, user.id
     HELPING.set_add user.id, contact.id
    rescue => e
      RedisLogger.info "Redis Exception"
    end
  end

end

class User < ActiveRecord::Base

  has_many :friends, :dependent => :destroy
  has_many :contacts, :through => :friends, :order => "created_at DESC", :dependent => :destroy

end

The strategy is simple: mirror the MySQL data in Redis. By adding a call to add_to_denormalized_list, we mirror the ActiveRecord call using the simple and elegant Redis set syntax discussed above. As you and your team get more practice and become more comfortable using Redis in production, you can start writing more to the denormalized list, eventually moving this part of your application away from ActiveRecord and MySQL to Redis. You could do this manually, or you can use James Golick's recently released gem called Rollout that uses, you guessed it, Redis, to programmatically roll out features to users.

Like anything else you code, testing and benchmarking this process in production is crucial to make sure you are saving time and cycles. It might seem like a waste to duplicate your data in Redis, but you are a pragmatic polyglot persistence developer, right? You want to explore the NoSQL space while making sure that a little mistake or misunderstanding doesn't sink your ship. Give something like this a try; it doesn't get any more pragmatic. When you do try it or come up with something new, let me and everyone else know about it. Thanks for reading.
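
The mirroring strategy also needs a read side: prefer the denormalized list, and fall back to the relational data whenever Redis is empty or unavailable. Here is a rough sketch of both sides outside of Rails. Everything named here is an assumption for illustration: `FriendMirror` and `FakeSetStore` are hypothetical, the stand-in store uses Ruby's Set so the example runs without a server, and it uses `sadd`/`smembers`, the Redis command names, where the post's code uses `set_add` (which appears to be an alias from an older redis-rb release).

```ruby
require 'set'

# Hypothetical in-memory stand-in for one Redis database of sets.
class FakeSetStore
  def initialize
    @sets = Hash.new { |hash, key| hash[key] = Set.new }
  end

  # Returns true when the member was newly added, false if already present.
  def sadd(key, member)
    !@sets[key.to_s].add?(member.to_s).nil?
  end

  def smembers(key)
    @sets[key.to_s].to_a
  end
end

# Hypothetical mirror: SQL stays the source of truth, Redis is best-effort.
class FriendMirror
  def initialize(helpers, helping, logger = nil)
    @helpers = helpers
    @helping = helping
    @logger = logger
  end

  # Call after the Friend row has been saved. Never raises; a Redis
  # failure is logged (if a logger was given) and reported as false.
  def record(user_id, contact_id)
    @helpers.sadd(contact_id, user_id)
    @helping.sadd(user_id, contact_id)
    true
  rescue StandardError => e
    @logger.info("Redis Exception: #{e.message}") if @logger
    false
  end

  # Prefer the denormalized list; fall back to the SQL lookup passed as a
  # block when Redis is down or has nothing for this user. A user who is
  # helping no one will therefore always hit the fallback.
  def helping_ids(user_id)
    ids = begin
      @helping.smembers(user_id)
    rescue StandardError
      []
    end
    ids.empty? ? yield.map(&:to_s) : ids
  end
end
```

In the Rails app, the block given to `helping_ids` would be the existing ActiveRecord query, so turning Redis off (or losing its data) degrades to the old behavior instead of failing.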



You got ur Erlang in my Ruby »

Created at: 30.04.2009 21:56, source: Brainspl.at, tagged: ruby erlang amqp nanite redis

Here are the slides from my keynote this morning at the SF Erlang Factory conference:

