Key-Value Stores in Ruby: The Wrap Up »
Created at: 17.11.2009 20:00, source: Engine Yard Blog, tagged: Technology couchdb javascript key-value stores mongodb ruby s3
This last article in our key-value series will briefly cover a few interesting topics that could each have had full articles of their own. This means that if they seem interesting to you, follow the links that I provide to get more information on them. Lastly, I’ll wrap up by introducing Moneta, written by Yehuda Katz, which provides a unified API for a wide variety of different Key-Value Stores. If you want to write code that allows the user to choose the store to use, you’ll want to pay attention to Moneta.
The difficult part of discussing Key-Value Stores today is that it’s a product area seeing rapid development and constant evolution. There are more interesting stores and libraries available than can easily be covered, even in a series like this. I could probably be writing posts every two weeks into next year without running out of subjects. So, alas, many things must be left undiscussed or underdiscussed. But let’s move on to the topics we can cover…
CouchDB
The first great Key-Value Store that isn’t going to get its own article is CouchDB. Apache’s CouchDB is a document-oriented database, like MongoDB. Unlike MongoDB, however, it exposes a RESTful, JSON-based API that you address through a built-in HTTP interface. Like MongoDB, it offers a schema-free data store. CouchDB offers solid, built-in replication, and uses JavaScript as its query language. It is a powerful tool.
There are several Ruby libraries which can be used to facilitate using CouchDB. In the examples below, I have used CouchRest, which is based on CouchDB’s own couch.js library:
require 'rubygems'
require 'couchrest'
require 'yaml'

DBH = CouchRest.database!('exercise-log')
response = DBH.save_doc({
  :date => Time.now,
  :activity => ARGV[0],
  :duration => ARGV[1]})
stored_record = DBH.get(response['id'])
puts "Stored:\n#{stored_record.to_yaml}"
wyhaines$ ruby /tmp/couch1.rb
Stored:
--- !map:CouchRest::Document
duration: "97:34"
_rev: 1-eb6f6e3a3e2eae0cd99f3fcbc63d29d6
_id: 0d9e71f44b3e0d3a2013c282bbccb5a0
activity: pedaling
date: 2009/11/12 21:07:45 +0000
As with MongoDB, one can store any set of keys/values together as a document in CouchDB, and then retrieve it later. CouchRest returns a response from the server that contains an id field, which can be used to retrieve the record that was just stored.
For more complex queries of the document store, one can use views. Views have a lot of power, because they are ultimately defined using JavaScript, but they don’t lend themselves to easy ad-hoc manipulation of the database.
DBH.save_doc({
  "_id" => "_design/query",
  :views => {
    :allkeys => {
      :map => "function(doc) { for (var word in doc) { if (!word.match(/^_/)) emit(word,doc[word])}}"
    }
  }
})
That inserts a view into the database that will be identified by query/allkeys. What a view does is defined by the JavaScript code it contains. Once a view is inserted into CouchDB, using it is simple:
puts DBH.view('query/allkeys').to_yaml
That particular function was lifted shamelessly from the CouchRest README, and just has a couple terms renamed to make it a little more clear. The output:
---
total_rows: 3
rows:
- id: 0d9e71f44b3e0d3a2013c282bbccb5a0
  value: pedaling
  key: activity
- id: 0d9e71f44b3e0d3a2013c282bbccb5a0
  value: 2009/11/12 21:07:45 +0000
  key: date
- id: 0d9e71f44b3e0d3a2013c282bbccb5a0
  value: "97:34"
  key: duration
offset: 0
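To make the map function's behavior concrete without a running CouchDB server, here is a plain-Ruby imitation of it. This is only a sketch: `emit_all_keys` is a hypothetical helper name, the document is a hand-built Hash, and nothing here talks to CouchDB or CouchRest.

```ruby
# Plain-Ruby imitation of the view's map function; no CouchDB involved.
# `emit_all_keys` is a hypothetical helper used only for this sketch.
def emit_all_keys(doc)
  rows = []
  doc.each do |key, value|
    # Skip CouchDB's internal fields such as _id and _rev.
    next if key.to_s.start_with?('_')
    rows << { 'id' => doc['_id'], 'key' => key, 'value' => value }
  end
  rows
end

doc = {
  '_id'      => '0d9e71f44b3e0d3a2013c282bbccb5a0',
  '_rev'     => '1-eb6f6e3a3e2eae0cd99f3fcbc63d29d6',
  'duration' => '97:34',
  'activity' => 'pedaling'
}

# View rows come back ordered by key, so sort the emitted rows the same way.
rows = emit_all_keys(doc).sort_by { |row| row['key'] }
rows.each { |row| puts "#{row['key']}: #{row['value']}" }
```

Each non-internal field becomes one row keyed by the field name, which is why the real view returns one row per stored field.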
This is really just the tip of the iceberg with CouchDB/CouchRest; there’s a wealth of functionality. CouchDB views are implemented with map/reduce capability, which means you can use them to crunch some pretty complex problems on your data. Additionally, CouchRest provides a CouchRest::ExtendedDocument, which your own classes can inherit from. This lets you easily create a Ruby model for your data, which is then transparently stored inside CouchDB.
class Exercise < CouchRest::ExtendedDocument
...
... "running", :date => Time.now, :duration => "23:44")
Dig into the CouchDB and CouchRest documentation if this looks interesting to you.
S3
I just wanted to briefly mention Amazon’s Simple Storage Service. It is, fundamentally, a simple HTTP-accessible Key-Value Store that Amazon has turned into a service. Requests to S3 will have higher latency than requests to a locally hosted data store, but if you want a simple, robust store that will scale to as much data as you have to push at it, you might seriously consider S3.
Moneta
Moneta is a unified interface to a variety of different key-value type data stores. That is, the same code can be run against a variety of different backing stores, and it will just work. Moneta supports the following stores as of this posting:
- Basic File Store
- BerkeleyDB
- CouchDB
- DataMapper
- File store for xattr
- In-memory store
- Memcache store
- Redis
- S3
- SDBM
- Tokyo
- Xattrs in a file system
Consider this example, which, again, uses CouchDB:
require 'rubygems'
require 'yaml'
require 'moneta'
require 'moneta/couch'

cache = Moneta::Couch.new(:db => 'football')
cache['1a_final'] = {
  :where => 'Laramie; War Memorial Stadium',
  :when => "11:30 MST",
  :who => "Southeast Cyclones & Lingle-Ft. Laramie Doggers",
  :prediction => "SE Cyclones by 14"}
puts cache['1a_final'].inspect
wyhaines$ ruby /tmp/moneta1.rb
---
- prediction: SE Cyclones by 14
  when: 11:30 MST
  who: Southeast Cyclones & Lingle-Ft. Laramie Doggers
  where: Laramie; War Memorial Stadium
It works, very simply. If I want to change the code to use something else, like a file based store, it’s as simple as changing one line:
--- couch.rb 2009-11-19 15:00:07.000000000 -0700
+++ file.rb 2009-11-19 15:01:12.000000000 -0700
@@ -1,9 +1,9 @@
require 'rubygems'
require 'yaml'
require 'moneta'
-require 'moneta/couch'
+require 'moneta/file'
-cache = Moneta::Couch.new(:db => 'football')
+cache = Moneta::File.new(:path => '/tmp/football')
cache['1a_final'] = {
:where => 'Laramie; War Memorial Stadium',
The rest of the code works without alteration. The Moneta API is designed to be very similar to that of Hash. It has a limited feature set, but the features it provides work identically across all of the supported platforms. For example, it doesn’t currently support iteration or partial matches. If your Key-Value Store needs are simple and you want something that can work with whatever store your users want to use, definitely check out Moneta; it’s a well written tool.
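The benefit of a Hash-like API is that application code can depend on nothing but `[]` and `[]=`. The sketch below illustrates the idea with a plain Hash and a tiny file-backed class. `TinyFileStore` and `record_prediction` are hypothetical names invented for this example; this is not how Moneta itself is implemented.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical file-backed store exposing only [] and []=, one file per key.
class TinyFileStore
  def initialize(path)
    @path = path
    FileUtils.mkdir_p(@path)
  end

  # Returns the stored value, or nil when the key has never been written.
  def [](key)
    file = File.join(@path, key.to_s)
    File.exist?(file) ? Marshal.load(File.binread(file)) : nil
  end

  # Serializes the value with Marshal and writes it to the key's file.
  def []=(key, value)
    File.binwrite(File.join(@path, key.to_s), Marshal.dump(value))
  end
end

# Application code relies only on [] and []=, so any such store will do.
def record_prediction(store)
  store['1a_final'] = { :prediction => 'SE Cyclones by 14' }
  store['1a_final'][:prediction]
end

from_hash = record_prediction({})
from_file = record_prediction(TinyFileStore.new(Dir.mktmpdir))
puts from_hash == from_file
```

Swapping the backing store changes one constructor call and nothing else, which is the same property the diff above demonstrates for Moneta.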
With that, we’ve reached the end of this series. It’s been fun to explore the unique features of each of these stores, as well as the threads that unify their different approaches to the problem of non-SQL, key-value data storage. I hope that I’ve exposed you to new and useful tools.
The landscape of Key-Value Stores is changing rapidly, so it is difficult to stay fully informed all the time. For instance, just a couple days ago there was a blog post implementing a SQL front end for CouchDB. It’s done in Perl, but all it would take is an interested person and a little time, and you could have it in Ruby, too.
If you use a Key-Value Store system, or plan to, keep your eyes open for new developments, because you can bet that someone else will have something interesting next week or next month that may change the landscape again. As always, leave feedback in the comments, and thanks for reading!
You're An Idiot For Not Using Heroku »
Created at: 08.11.2009 19:17, source: RailsTips - Home, tagged: heroku hosting mongodb mongohq
In which I discuss my first experience with Heroku and my second. And how awesome it is.
It is true. You are. Go try it now. That is an order. I can wait for you to come back and finish reading this post. I could end the post now, but I suppose I’ll go on and tell you a bit about my experience with Heroku yesterday.
Formerly a Toy in the Cloud
Wynn and I were talking yesterday about how, back in the day, Heroku seemed like a toy in the cloud. They had a rich code editor and you could magically create and deploy applications that sometimes worked. It was neat, but nothing you would use for anything serious.
A toy they are no more. So what is Heroku? According to their site, Heroku is “fast, frictionless, and maintenance free.” After giving it another look yesterday, I would have to agree.
The App
I have a tiny note application that my wife and I use. I use it to mark things to read later and save plain text notes. She uses it to keep track of recipes, tagged with ingredients and whether or not she has made the recipe before. It is nothing fancy, but it serves a purpose for both of us.

The app formerly ran on Dreamhost (how to deploy rails on DH) and used MySQL. Since I decided not to attend the Notre Dame game yesterday, I had some free time, so I watched football on TV all day and worked on converting this project from MySQL to MongoDB (which is awesome).
Once I finished the conversion, which didn’t take long, I exported the MySQL database as XML using phpMyAdmin (shudder) and then wrote an import rake task that recreated the data from the XML in MongoDB (which is awesome).
MongoHQ
I have had a MongoHQ invite for a while now, but hadn’t kicked the tires, so I decided now was as good a time as any. Then it occurred to me. Why use Dreamhost when Heroku has a free account and I’m already hosting my database in the sky? Why not go cloud to the max and see how things end up?
Heroku
I logged in with my old Heroku account and did some reading through their amazing docs.
- I gem installed heroku.
- heroku created my app using the command line tool.
- git pushed to heroku remote.
Boom. In less than a minute my app was created and deployed on Heroku. Impressive. Now that isn’t where the story ended. Hosting on Heroku is a bit different.
Config Vars
The first thing I ran into was some config file issues. I found Heroku’s article on config vars and switched my app to work like that. git push and my app was deployed again.
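Heroku exposes config vars to the app as environment variables, so "working like that" mostly means reading settings from `ENV` instead of a checked-in config file. A minimal sketch, assuming made-up variable names (`MONGO_HOST` and `MONGO_PORT` are placeholders, not names Heroku or MongoHQ defines):

```ruby
# Read settings from the environment, falling back to local defaults.
# MONGO_HOST and MONGO_PORT are hypothetical names chosen for this sketch.
def mongo_settings(env = ENV)
  {
    :host => env.fetch('MONGO_HOST', 'localhost'),
    :port => Integer(env.fetch('MONGO_PORT', '27017'))
  }
end

# With nothing set, the development defaults apply.
local = mongo_settings({})

# In a hosted environment the same code picks up the configured values.
hosted = mongo_settings('MONGO_HOST' => 'db.example.com', 'MONGO_PORT' => '27018')

puts local.inspect
puts hosted.inspect
```

Passing the environment in as an argument keeps the method easy to exercise locally without touching the real `ENV`.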
Gems
Now I was missing gems. Back to the docs I went, this time to read about managing gems. I created my .gems manifest and git pushed again. Just like that my app was up and running in the sky.
Conclusion
I made a few more changes to my app over the next few hours and deployed after each one with a simple git push heroku master. Each time, I almost giggled as the normal git messages happened and then out of nowhere, Heroku stepped in and informed me that it was deploying my app and…wait for it…wait for it…that the deploy was finished.
Now that I’ve used it for a tiny app, I’m curious to see what it can do with something larger. I’ll definitely be using Heroku a lot in the future, that much I know for sure. Combined with a hosted MongoDB service, it is absolute glory. Because MongoDB has its GridFS file store, not having write access to a file system on Heroku is no big deal. You don’t even have to set up S3.
I’ll leave you with my tweet from yesterday, summing up my experience.
Created and deployed a MongoDB backed Rails app to Heroku and MongoHQ today. I have witnessed the future.
Anyone else out there using Heroku? What kind of apps have you deployed on it? What have your experiences been? Curious to hear from others.
More MongoMapper Awesomeness »
Created at: 09.10.2009 16:46, source: RailsTips - Home, tagged: mongodb mongomapper
In which I dish on the latest MongoMapper features like dirty attributes, time zone support, custom data types and dynamic finders.
September was a month of craziness and for the first month in quite a while I did not post here. I promise it hurt me as much as it hurt you. In an effort to get back in the rhythm, I am going to start with an easy article. MongoMapper has been getting a lot of love lately and I thought I would mention some of the awesomeness.
Dynamic Finders
Dynamic finders are so darn handy in ActiveRecord. How many times have you used User.find_by_email and the like? Thankfully David Cuadrado took a stab at it. I took what he started, tested it a bit harder and added it onto document associations as well. This means when you have a document with a many documents association, you can now use dynamic finders that are scoped to that association.
class User
  include MongoMapper::Document
  many :posts
end

class Post
  include MongoMapper::Document
  key :user_id, String
  key :title, String
end

user = User.create
user.posts.create(:title => 'Foo')
# would return post we just created
user.posts.find_by_title('Foo')
Document associations now also have all the normal Rails association methods such as build, create, find, etc.
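For anyone curious how methods like `find_by_title` can exist without being defined one by one, the usual Ruby technique is `method_missing`. The sketch below shows that technique against an in-memory array. `TinyCollection` is a hypothetical class written for illustration and is not MongoMapper's actual implementation.

```ruby
# Hypothetical in-memory collection; not MongoMapper's implementation.
class TinyCollection
  def initialize(records)
    @records = records
  end

  # Intercepts calls such as find_by_title('Foo') and turns the method
  # name into a field lookup. Anything else falls through to super.
  def method_missing(name, *args, &block)
    match = name.to_s.match(/\Afind_by_(\w+)\z/)
    return super unless match
    field = match[1].to_sym
    @records.find { |record| record[field] == args.first }
  end

  # Keeps respond_to? honest about the dynamically handled methods.
  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('find_by_') || super
  end
end

posts = TinyCollection.new([
  { :title => 'Foo', :user_id => '1' },
  { :title => 'Bar', :user_id => '2' }
])

found = posts.find_by_title('Foo')
puts found.inspect
```

Scoping a finder to an association then amounts to running the same lookup over only the records that belong to the parent document.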
Logging
The mongo ruby driver added logging support, so a few days ago I added some basic support for accessing and using that logger from within MongoMapper. When you pass a logger instance to the ruby driver, you can access that connection’s logger instance from MongoMapper.logger like so:
logger = Logger.new('test.log')
MongoMapper.connection = Mongo::Connection.new('127.0.0.1', 27017, :logger => logger)
MongoMapper.logger # would be equal to logger
Tailing the log would give you output like the following:
MONGODB db.$cmd.find({"count"=>"statuses", "query"=>{"project_id"=>"4aceaabed072c4745f0003ca"}, "fields"=>nil})
MONGODB db.$cmd.find({"count"=>"statuses", "query"=>{"project_id"=>"4aceaabed072c4745f0003ce"}, "fields"=>nil})
The nifty part about this is you can set up your Mongo::Connection to use Rails.logger and then all your mongo queries show up in your Rails logs if you have your log level set low enough. This has been very handy for me working on MongoMapper because I can see exactly what MM is sending to Mongo behind the scenes.
Because of this addition, I noticed that every find(:first) was using :order => '$natural', which doesn’t allow using indexes and leads to slow queries. I removed the default order so instead it is just a find with a limit of 1, which should help make a few parts perform better.
Dirty Attributes
ActiveRecord’s dirty attributes is such a cool feature that yesterday, I spent a few hours porting it to MongoMapper::Document. Now you can do things like:
class Foo
  include MongoMapper::Document
  key :phrase, String
end

foo = Foo.new
foo.changed? # false
foo.phrase_changed? # false
foo.phrase = 'Dirty!'
foo.changed? # true
foo.phrase_changed? # true
foo.phrase_change # [nil, 'Dirty!']
I’m sure there will be edge cases, but as we find them we can fortify the tests and go from there.
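The idea behind dirty tracking is simple: remember the original value the first time an attribute is written, then compare it with the current value. Here is a stripped-down sketch in plain Ruby. `TrackedNote` is a hypothetical class with a single hard-coded attribute; the real port handles arbitrary keys and surely differs in its details.

```ruby
# Hypothetical single-attribute model illustrating dirty tracking.
class TrackedNote
  def initialize
    @attributes = { :phrase => nil }
    @original = {}
  end

  def phrase
    @attributes[:phrase]
  end

  # Remember the original value only on the first write, so repeated
  # assignments still compare against the value the object started with.
  def phrase=(value)
    @original[:phrase] = @attributes[:phrase] unless @original.key?(:phrase)
    @attributes[:phrase] = value
  end

  def phrase_changed?
    @original.key?(:phrase) && @original[:phrase] != @attributes[:phrase]
  end

  # Returns [old, new] when the attribute is dirty, otherwise nil.
  def phrase_change
    [@original[:phrase], @attributes[:phrase]] if phrase_changed?
  end

  def changed?
    phrase_changed?
  end
end

note = TrackedNote.new
clean_before = note.changed?
note.phrase = 'Dirty!'
puts note.phrase_change.inspect
```

Assigning the original value back makes the attribute clean again in this sketch, because the comparison is always against the first remembered value.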
Custom Data Types
With the 0.4 release came the transition from typecasting to custom data types. Now, instead of natively defining typecasting for “allowed” data types, you can have any data type that you like. You just have to do the conversion to and from mongo yourself. Making your own data types is as simple as:
class Foo
  def self.to_mongo(value)
    # convert value to a mongo safe data type
  end

  def self.from_mongo(value)
    # convert value from a mongo safe data type to your custom data type
  end
end

class Thing
  include MongoMapper::Document
  key :name, Foo
end
This means each time the name of Thing is saved to mongo or pulled out of mongo, it will be run through Foo.to_mongo and Foo.from_mongo to make sure it is exactly what you want it to be.
Out of the box, MongoMapper supports Array, Binary, Boolean, Date, Float, Hash, Integer, String, and Time. You can check out the support file and tests to see how this works.
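As a concrete illustration of the two hooks, here is a sketch of a custom type that is a Set in Ruby but is stored as a sorted Array of Strings, both of which are among the supported types listed above. `TagSet` is a hypothetical name, and the round trip is exercised by calling the class methods directly rather than through MongoMapper.

```ruby
require 'set'

# Hypothetical custom type: a Set in Ruby, a sorted Array when stored.
class TagSet
  # Convert the Ruby value into something safe to store.
  def self.to_mongo(value)
    value.to_a.map { |tag| tag.to_s }.sort
  end

  # Convert the stored Array (or nil for a missing key) back into a Set.
  def self.from_mongo(value)
    Set.new(value || [])
  end
end

stored   = TagSet.to_mongo(Set.new(['ruby', 'mongodb']))
restored = TagSet.from_mongo(stored)

puts stored.inspect
puts restored.include?('ruby')
```

Handling nil in both directions matters, since a key that has never been set will presumably arrive as nil rather than as an empty collection.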
Time Zones
One note on times, since I mentioned them above: all times are now stored in the database as UTC. Also, if you have Time.zone set, all times are converted to the current time zone going to and from the database. This actually turned out to be really easy. We’ll see if I did it all correctly once people start pounding on it I guess. :)
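The store-as-UTC, convert-on-read round trip can be sketched with core Ruby alone. This example uses a fixed `-04:00` offset as a stand-in for the application's zone; it does not use `Time.zone` (which comes from ActiveSupport) and is not MongoMapper's code.

```ruby
# A local time with an explicit offset, standing in for the app's zone.
local = Time.new(2009, 10, 9, 16, 46, 0, '-04:00')

# Going to the database: normalize to UTC.
stored = local.getutc

# Coming back from the database: convert to the app's zone again.
restored = stored.getlocal('-04:00')

puts stored.hour
puts restored.hour
```

The instant in time never changes; only the offset used to present it does, which is why the round trip is lossless.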
Lazy Loading
One thing that I’ve been working on in between other features is making MongoMapper lazier. I have already made connection, database and collection lazy, so MM doesn’t actually create the connection or connect to the database until needed, which makes MM work a lot better with Rails.
I still need to make indexes lazy, so that is the next thing to tackle. I’m thinking once that is in, I’ll have something like MongoMapper.ensure_indexes!, similar to DataMapper.auto_migrate!, which actually ensures the indexes exist rather than doing that the second a class loads.
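The lazy connection described above boils down to deferring the expensive call until first use and then memoizing the result. A minimal sketch of that pattern follows. `LazyConnection` is a hypothetical class, and the block returns a plain Hash as a stand-in for a real connection object.

```ruby
# Hypothetical wrapper that defers connecting until first use.
class LazyConnection
  attr_reader :connect_count

  def initialize(&connector)
    @connector = connector
    @connection = nil
    @connect_count = 0
  end

  # Runs the connector block on the first call only, then reuses the result.
  def connection
    return @connection if @connection
    @connect_count += 1
    @connection = @connector.call
  end
end

# The Hash stands in for a real connection object.
lazy = LazyConnection.new { { :host => '127.0.0.1', :port => 27017 } }

before = lazy.connect_count
lazy.connection
lazy.connection
after = lazy.connect_count

puts "connects before first use: #{before}, after two uses: #{after}"
```

Nothing is opened when the class is loaded, so simply requiring the code at boot costs no connection.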
Internal Improvements
Along with all the public features, I have been working on the internals of MM whenever I get a chance. They still need cleaning up, but things are getting better. Along with some refactoring, I did some work to speed the tests up.
The tests were starting to creep up to around 40 seconds which was driving me nuts. I did a bit of work and realized that clearing every collection before every test was causing most of the slowdown so I pruned the functional tests to only clear the collections that were actually used in that test. This cut the time from around 40 seconds to 10. Yep, huge!
Conclusion
There are still rough parts and I would not recommend MongoMapper for beginners, but if you can troubleshoot not only your own code but others’, MM is in a good place for you. Up until now, I’ve been working on adding features that I needed similar to ActiveRecord, but I am almost to a place where I am going to start adding features to MM that can literally only exist because of MongoDB.
The next month is going to see some really cool things like upserts, modifiers ($set, $inc, $dec, $push, $pull, etc.) and the like make their way into MM. I also have some plans for an identity map implementation. Oooohs and aaaaaahs abound!

