Bundler and public applications »

Created at: 19.01.2012 23:24, source: Phusion Corporate Blog, tagged: ruby Ruby on Rails

I think Bundler is a great tool. Its strength lies not in its ability to install all the gems that you’ve specified, but in automatically figuring out a correct dependency graph so that no gems conflict with each other, and in the rock-solid guarantee that the gems you’re using in development are exactly what you get in production. No more weird gem version conflict errors.

This is awesome for most Ruby web apps that are meant to be used internally, e.g. things like Twitter, Basecamp, Union Station. Unfortunately, this strength also turns into a kind of weakness when it comes to public apps like Redmine and Juvia. These apps typically allow the user to choose their database driver through config/database.yml. However, the driver must also be specified inside the Gemfile, otherwise the app cannot load it. The result is that the user has to edit both database.yml and the Gemfile, which introduces the following problems:

  • The user may not necessarily be a Ruby programmer. The Gemfile will confuse him.
  • The user is not able to use the Gemfile.lock that the developer has provided. This makes installing in deployment mode with the developer-provided Gemfile.lock impossible.

This can be worked around in a very messy form with groups. For example:

group :driver_sqlite do
  gem 'sqlite3'
end

group :driver_mysql do
  gem 'mysql'
end

group :driver_postgresql do
  gem 'pg'
end

And then, if the user chose to use MySQL:

bundle install --without='driver_postgresql driver_sqlite'

This is messy because you have to exclude all the things you don’t want. If the app supports 10 database drivers then the user has to put 9 drivers on the exclusion list.

How can we make this better? I propose supporting conditionals in the Gemfile language. For example:

condition :driver => 'sqlite' do
  gem 'sqlite3'
end

condition :driver => 'mysql' do
  gem 'mysql'
end

condition :driver => 'postgresql' do
  gem 'pg'
end

condition :driver => ['mysql', 'sqlite'] do
  gem 'foobar'
end

The following command would install the mysql and the foobar gems:

bundle install --condition driver=mysql

Bundler should enforce that the driver condition is set: if it’s not set then it should raise an error. To allow for the driver condition to not be set, the developer must explicitly define that the condition may be nil:

condition :driver => nil do
  gem 'null-database-driver'
end

Here, bundle install will install null-database-driver.
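
The semantics proposed above can be modelled in a few lines of plain Ruby. This is only a sketch of the idea, not Bundler code: the ConditionalGemfile class and its resolve method are hypothetical names invented for illustration, and the settings hash stands in for the proposed --condition command line option.

```ruby
# Hypothetical model of the proposed `condition` DSL. Nothing here is
# Bundler API: ConditionalGemfile, condition and resolve are assumed names.
class ConditionalGemfile
  Declaration = Struct.new(:key, :values, :gems)

  def initialize(&block)
    @declarations = []
    @current = nil
    instance_eval(&block) if block
  end

  # Accepts `condition :driver => 'mysql'`, an array of values, or nil.
  def condition(spec, &block)
    key, values = spec.first
    # Array(nil) would be [], so wrap an explicit nil by hand.
    values = values.nil? ? [nil] : Array(values)
    @current = Declaration.new(key, values, [])
    @declarations << @current
    instance_eval(&block)
  ensure
    @current = nil
  end

  def gem(name)
    raise ArgumentError, "gem #{name} declared outside a condition" unless @current
    @current.gems << name
  end

  # Returns the gem names selected by the given settings. Raises when a
  # condition is unset and no block explicitly allows nil for it.
  def resolve(settings = {})
    @declarations.group_by(&:key).flat_map do |key, declarations|
      value = settings[key]
      allows_nil = declarations.any? { |d| d.values.include?(nil) }
      raise ArgumentError, "condition '#{key}' is not set" if value.nil? && !allows_nil
      declarations.select { |d| d.values.include?(value) }.flat_map(&:gems)
    end
  end
end

gemfile = ConditionalGemfile.new do
  condition :driver => 'sqlite' do
    gem 'sqlite3'
  end
  condition :driver => 'mysql' do
    gem 'mysql'
  end
  condition :driver => 'postgresql' do
    gem 'pg'
  end
  condition :driver => ['mysql', 'sqlite'] do
    gem 'foobar'
  end
end

p gemfile.resolve(:driver => 'mysql')  # => ["mysql", "foobar"]
```

Calling resolve without a driver raises an error in this model, because none of the blocks declares nil as an allowed value.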

With this proposal, user installation instructions can be reduced to these steps:

  1. Edit database.yml and specify a driver.
  2. Run bundle install --condition driver=(driver name)

I’ve opened a ticket for this proposal. What do you think?


more »

My Summer of (Open) Source »

Created at: 05.01.2012 22:25, source: Engine Yard Blog, tagged: Open Source ruby Technology

The last few months have been a great experience for me. I’m a graduate student from Potsdam, Germany. However, as some of you might already know, I’m also rather active in the Ruby community. This past year, I had an amazing opportunity.

Engine Yard sponsors a couple of Open Source developers to work full time on their projects. When I asked Dr. Nic Williams whether they would sponsor me spending three months in Portland, working together with Brian Ford on Rubinius, I expected nothing but a no. Turns out, Engine Yard was at least as thrilled about this idea as I was. A few days ago, I finally got back to Germany, and I wanted to give you a quick overview of what I’ve been working on during my time overseas.

Like many others, I started contributing to Rubinius a while ago. However, I never really dared to play with the internals. So, my first stop was the Rubinius compiler. To make sure I really understood it and that it’s as flexible as it claims to be, I wrote a Smalltalk implementation using the Rubinius compiler infrastructure and looked into improving its API.

It’s a fun thing to do, as the Rubinius compiler is written entirely in Ruby. And, since Rubinius is bootstrapped, it also runs on other Ruby implementations. That is how you usually install Rubinius: you load the compiler from CRuby, and it then compiles itself to Rubinius bytecode. If you want to look into this, there is some excellent documentation available on the Rubinius website.

This bytecode can then be executed by the Virtual Machine, which was my next stop. It took me a while to fully understand how things work within the VM. It is actually the only major part of Rubinius not written in Ruby, and the main reason for its blazing performance and excellent memory footprint. I am planning on writing another blog post, or possibly even a series of blog posts, about these internals.

Apart from bug fixes and API improvements, I used the gained knowledge to fix, for instance, one of Ruby’s least known and most confusing features: the implementation of flip-flops.
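
For readers who have never met a flip-flop, here is a minimal example of the language feature itself. It says nothing about how Rubinius implements it, and the method name flip_flop_window is made up for this illustration. A range of two conditions used inside a conditional switches on when the first condition becomes true and switches off after the second one becomes true.

```ruby
# A flip-flop: `(a)..(b)` in a conditional context keeps hidden state.
# It turns on when `a` is true and turns off after `b` is true.
def flip_flop_window(numbers)
  numbers.select { |i| (i == 3)..(i == 5) ? true : false }
end

p flip_flop_window(1..10)  # => [3, 4, 5]
```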

The last thing I worked on was Puma, a new web server for Rails/Rack/Sinatra applications. Rubinius 2.0 is about to be released, fully able to make the best use of all your CPUs. However, most web servers used for deploying Ruby applications are actually single-threaded. Since there is no real threaded option that is still maintained and not JRuby specific, Evan Phoenix and I started working on a new server.

Like many other servers, it uses the rapid HTTP parser that comes with Mongrel. It also uses a dynamically sized thread pool for processing requests in parallel. With Puma, you now have a go-to choice when it comes to deploying web applications on Rubinius. And since it does not contain any Rubinius-specific code, it also works quite well on JRuby or CRuby.

To make sure we are heading in the right direction, I started working on a tool for benchmarking web applications under realistic load. The main issue with just using ab, the standard solution for measuring HTTP performance, is that it results in unrealistic numbers both on JRuby and Rubinius. When using ab, you just send the same request over and over again, causing the JIT and code inliner to highly optimize for exactly that request. This usually doesn’t reflect the actual production behavior, though. I therefore wrote code simulating a real browser session and, of course, running multiple of these sessions in parallel.

You think that’s all? Far from it! The Engine Yard OSS Community Grant Program enabled me to speak at six different conferences all over America. At Rocky Mountain Ruby, RubyConf Brazil and RubyConf Uruguay, I gave a talk on “Real Time Rack”. In San Francisco, at GoGaRuCo, I gave a presentation about “Smalltalk On Rubinius - or How To Implement Your Own Programming Language”. At this past year’s RubyConf in New Orleans, I spoke about “Message in a Bottle” and last but not least I gave a presentation titled “Beyond Ruby” at RubyConf Argentina in Buenos Aires.


more »

rspec-rails-2.8.1 is released »

Created at: 05.01.2012 07:43, source: David Chelimsky, tagged: bdd rspec rails ruby

Bug fix release

The rails-3.2.0.rc2 release broke stub_model in rspec-rails versions 2.0.0 through 2.8.0. The rspec-rails-2.8.1 release fixes this issue, but it means that when you upgrade to rails-3.2.0.rc2 or greater, you’ll have to upgrade to rspec-rails-2.8.1 or greater.

Because rspec-rails-2.8.1 supports all versions of rails since 3.0, I recommend that you upgrade to rspec-rails-2.8.1 first, and then upgrade to rails-3.2.0.rc2 (or 3.2.0 once it’s out).

Changelog

http://rubydoc.info/gems/rspec-rails/file/Changelog.md

Docs

http://rubydoc.info/gems/rspec-rails
http://relishapp.com/rspec/rspec-rails


more »

RSpec-2.8 is released! »

Created at: 05.01.2012 04:38, source: David Chelimsky, tagged: bdd rspec ruby

We released RSpec-2.8.0 today with a host of new features and improvements since 2.7. Some of the highlights are described below, but you can see the full changelogs at:

Documentation

While not 100% complete yet, we’ve made great strides on RSpec’s RDoc:

http://rspec.info is now just a one pager (desperate for some design love - volunteers please email rspec-users@rubyforge.org). All the old pages are redirects to the relevant RDoc at http://rubydoc.info. RSpec-1 info is still available at http://old.rspec.info.

We’ve still got Cucumber features up at http://relishapp.com/rspec, but we’re going to be phasing that out as the primary source of documentation. There are a lot of reasons for this, and I’ll try to follow up with a separate blog post on this topic.

rspec-core

Improved support for tags and filtering

You can now set default tags/filters in either RSpec.configure or a .rspec file and override these tags on the command line. For example, this configuration tells rspec to run all the examples that are not tagged :slow:

# in spec/spec_helper.rb
RSpec.configure do |c|
  c.treat_symbols_as_metadata_keys_with_true_values = true
  c.filter_run_excluding :slow
end

Now when you want to run those, you can just do this:

rspec --tag slow

This will override the configuration and run only the examples tagged :slow.

--order rand

We added an --order option with two supported values: rand and default.

rspec --order random (or rand) tells RSpec to run the groups in a random order, and then run the examples within each group in random order. We implemented it this way (rather than complete randomization of every example) because we don’t want to re-run expensive before(:all) hooks. A fair tradeoff, as the resulting randomization is just as effective at exposing order-dependency bugs.

When you use --order random, RSpec prints out the random number it used to seed the randomizer. When you think you’ve found an order-dependency bug, you can pass the seed along and the order will remain consistent:

--order rand:3455
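
The reason a seed makes the order reproducible can be shown with plain Ruby. The sketch below is not RSpec's implementation; randomized_order and the sample group names are invented for illustration. It shuffles the groups first and then the examples inside each group, both driven by one seeded random number generator.

```ruby
# Illustration only, not RSpec internals: a seeded generator makes the
# "random" order repeatable.
def randomized_order(groups, seed)
  rng = Random.new(seed)
  # Shuffle the groups, then the examples within each group.
  groups.keys.shuffle(random: rng).map do |group|
    [group, groups[group].shuffle(random: rng)]
  end
end

groups = {
  'User'    => ['is valid', 'has a name', 'can log in'],
  'Account' => ['starts empty', 'accepts deposits'],
  'Report'  => ['renders', 'exports CSV']
}

first  = randomized_order(groups, 3455)
second = randomized_order(groups, 3455)
puts first == second  # prints "true": same seed, same order
```

Examples never leave their group in this model, which mirrors the tradeoff described above: a group's expensive setup only has to run once.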

--order default tells RSpec to load groups and examples as they are declared in each file.

rspec --init

We added an --init switch to the rspec command to generate a “spec” directory, and “.rspec” and “spec/spec_helper.rb” files with some starter code in them.

rspec-expectations

We discovered that the matcher DSL generates matchers that run considerably slower than classes which implement the matcher protocol. We made some minor improvements in the DSL, but to really improve things we re-implemented every single built-in matcher as a class.
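
To make "a class which implements the matcher protocol" concrete, here is a small sketch of a custom matcher written as a class. BeAMultipleOf and be_a_multiple_of are hypothetical names, not part of rspec-expectations, and the method names follow the RSpec 2 matcher protocol as I understand it (matches?, failure_message_for_should, failure_message_for_should_not, description).

```ruby
# Hypothetical class-based matcher; not one of RSpec's built-in matchers.
class BeAMultipleOf
  def initialize(expected)
    @expected = expected
  end

  # Called with the object under test; returns true on a match.
  def matches?(actual)
    @actual = actual
    (actual % @expected).zero?
  end

  def failure_message_for_should
    "expected #{@actual} to be a multiple of #{@expected}"
  end

  def failure_message_for_should_not
    "expected #{@actual} not to be a multiple of #{@expected}"
  end

  def description
    "be a multiple of #{@expected}"
  end
end

# Helper so a spec could read: 9.should be_a_multiple_of(3)
def be_a_multiple_of(expected)
  BeAMultipleOf.new(expected)
end

matcher = be_a_multiple_of(3)
puts matcher.matches?(9)   # prints "true"
puts matcher.matches?(10)  # prints "false"
```

A plain class like this involves ordinary method calls only, whereas a DSL-defined matcher is assembled from blocks at run time.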


more »

Special JRuby Release: 1.6.5.1 »

Created at: 28.12.2011 21:25, source: Engine Yard Blog, tagged: ruby Technology

For the Impatient

  1. JRuby 1.6.5.1 is a single patch release of JRuby 1.6.5 to fix CERT advisory: CERT-2011-003.  ALL USERS: PLEASE UPGRADE
  2. We talk about plans for the upcoming 1.6.6 release

CERT Details

Hashing 101

(For proper CSci vocabulary and a lot of fun details about hashing also read this wikipedia article)

Hash tables apply a math function (hashing function) to the key of a key-value pair. The result of the hashing function is a location to a hash bucket which stores the key/value pair internally:

a[:heh] = 1
hashing_function(:heh) -> store :heh/1 in hash bucket #3
a[:foo] = 2
hashing_function(:foo) -> store :foo/2 in hash bucket #13
a[:bar] = 3
hashing_function(:bar) -> store :bar/3 in hash bucket #1

Hashes have many buckets and in theory all key/value pairs added to a hash will get spread out evenly across the hash's buckets. In practice, some number of keys will end up hashing into the same hash bucket (known as a hashing collision). As more key/value pairs are stored in the same hash bucket, the time to access those particular key/value pairs increases. This is because you need to walk some portion of the entries in the bucket to find the specific one you are looking for (hash structures will often make the entries in an individual bucket a simple list structure).

a[:gar] = 4
hashing_function(:gar) -> store :gar/4 in hash bucket #3 (same bucket as :heh)

In this example, accessing a[:gar] and a[:heh] may take longer than the other keys because they are sharing a hash bucket.
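
The bucket idea can be sketched as a toy hash table in Ruby. This is a teaching model only: ToyHash and its deliberately weak hashing function (the sum of the key's bytes) are assumptions made for this example and have nothing to do with how JRuby actually hashes keys.

```ruby
# Toy hash table for illustration; not how JRuby implements Hash.
class ToyHash
  def initialize(bucket_count = 16)
    @buckets = Array.new(bucket_count) { [] }
  end

  # Deliberately weak hashing function: sum of the key's bytes.
  def bucket_index(key)
    key.to_s.bytes.sum % @buckets.size
  end

  def []=(key, value)
    bucket = @buckets[bucket_index(key)]
    entry = bucket.find { |k, _| k == key }
    if entry
      entry[1] = value        # key already present: overwrite
    else
      bucket << [key, value]  # new key: append to the bucket's list
    end
  end

  def [](key)
    # Walk the bucket's list until the key is found.
    entry = @buckets[bucket_index(key)].find { |k, _| k == key }
    entry && entry[1]
  end
end

table = ToyHash.new
table[:abc] = 1
table[:cab] = 2  # same bytes in another order, so same bucket as :abc
puts table.bucket_index(:abc) == table.bucket_index(:cab)  # prints "true"
puts table[:cab]                                           # prints "2"
```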

The Attack

The general application of the attack is for "the bad guys" to figure out a large set of values which will hash to the same hash bucket.  Once they create this list they will send all those values to a server.  The server will store them in a hash (think parameter list in Rack, for example).  The act of storing or accessing any of those values takes longer and longer as the number of entries in a single hash bucket grows.  The result will be a Denial Of Service (DOS) attack if enough values get stored.

hashing_function(:hostname) -> hash bucket #3
hashing_function(:aZ1) -> hash bucket #3
hashing_function(:cvg) -> hash bucket #3
hashing_function(:azr) -> hash bucket #3
... # many elided
hashing_function(:1fr) -> hash bucket #3
hashing_function(:yu3) -> hash bucket #3
hashing_function(:hyX) -> hash bucket #3
host = params[:hostname] # Uh oh! need to find this amongst many bucket buddies
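
The cost of such collisions can be observed with Ruby's own Hash by forcing every key to report the same hash value. The sketch below uses a made-up CountingKey class that counts how often two keys have to be compared; the exact counts depend on the Ruby implementation's Hash internals, so only the relative difference matters.

```ruby
# Illustration: keys that all report the same hash value force the Hash
# to compare a new key against the entries already stored.
class CountingKey
  class << self
    attr_accessor :comparisons
  end
  self.comparisons = 0

  attr_reader :name

  def initialize(name, colliding)
    @name = name
    @colliding = colliding
  end

  # Colliding keys all claim the same hash value.
  def hash
    @colliding ? 0 : @name.hash
  end

  def eql?(other)
    CountingKey.comparisons += 1
    name == other.name
  end
end

# Inserts n distinct keys and returns how many key comparisons were made.
def comparisons_for(n, colliding)
  CountingKey.comparisons = 0
  table = {}
  n.times { |i| table[CountingKey.new("k#{i}", colliding)] = i }
  CountingKey.comparisons
end

puts "well distributed: #{comparisons_for(200, false)} comparisons"
puts "all colliding:    #{comparisons_for(200, true)} comparisons"
```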

The Fix

Adding a little bit of randomization to the hashing algorithm makes it much, much more difficult to figure out how to generate this type of attack. JRuby 1.6.5.1 and all later JRuby releases have this additional randomization built into the hashing algorithm. The result should be decent hash bucket distribution that is difficult for attackers to predict.
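
Why a secret seed helps can be sketched as follows. This is not JRuby's hashing algorithm: seeded_bucket mixes the seed into the key with SHA-256 purely for illustration, and the seed strings are fixed so the example is repeatable (a real implementation would pick a random seed at startup). Keys chosen to collide under one seed no longer share a bucket once the seed changes.

```ruby
require 'digest'

BUCKETS = 16

# Illustration only: a seeded bucket function built on SHA-256.
def seeded_bucket(key, seed)
  Digest::SHA256.hexdigest("#{seed}:#{key}").to_i(16) % BUCKETS
end

# What an attacker does when the seed is known: search for keys that
# all land in one bucket (bucket #3 here).
def colliding_keys(seed, count)
  keys = []
  i = 0
  while keys.size < count
    key = "k#{i}"
    keys << key if seeded_bucket(key, seed) == 3
    i += 1
  end
  keys
end

attack = colliding_keys('seed-known-to-attacker', 20)

# Under the known seed, every attack key shares a single bucket.
puts attack.map { |k| seeded_bucket(k, 'seed-known-to-attacker') }.uniq.size  # prints "1"

# Under a different seed, the same keys spread over several buckets.
puts attack.map { |k| seeded_bucket(k, 'fresh-secret-seed') }.uniq.size
```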

More information

This vulnerability is not exclusively an issue of JRuby.  Other Ruby implementations also have a similar issue (also patched today).  In fact, Java and PHP also appear to be susceptible to this style of attack.  For more information, please see the CERT announcement.

Also, consider that language implementations are really only susceptible to this attack via frameworks which allow an external attacker to store arbitrary and/or unbounded key/values into a hash. Ruby Rack had this vulnerability, but it has been fixed so that the number of parameters stored is bounded, which removes the possibility of a DOS attack. Rack users should upgrade to the latest version.

JRuby's First Security Fix-Only Release

We debated rolling what we have in our 1.6 branch along with the hashing vulnerability fix (mentioned above) and pushing out 1.6.6.  This was unappealing for a couple of reasons:

  1. For stable environments deployed using 1.6.5 we would be asking them to evaluate this security fix and any other fix we placed on JRuby 1.6 branch in the last two months.  This seems like it would force more conservative users to perform their own build to manually patch just the security fix.
  2. Counting the bugs we have fixed so far, we felt we were about 10 short of what we wanted to have in JRuby 1.6.6.

After consideration, we felt it best to give a security fix release now (A single security patch release JRuby 1.6.5.1 <--- update to this now please) to satisfy the cautious and to wait until we felt good about the quality of 1.6.6.  As they say, Open Source projects are ready when they are ready...

Hey! When will you be ready? What is missing?

It has been about two months since our last release and we suspect we can wrap things up in the next couple of weeks.  We plan on releasing JRuby 1.6.6 in mid-January.

As we have been saying all through the 1.6 series, we are primarily fixing 1.9 compatibility bugs. Generally speaking, our 1.9 issue fixing has been dominated by encoding errors in Regexps, IO, and String. Here is a list of what we have done so far. It is also worth mentioning that we fixed the regression which broke Fiber in JRuby 1.6.5 (JRUBY-6170). Also, the dreaded missing 'read_nonblock' has been fixed (JRUBY-5529).

Here is the list of issues we plan on settling for 1.6.6. A few noteworthy mentions in this list are JRUBY-5657 (new 1.9 splat behavior), JRUBY- (new 1.9 to_ary behavior), and JRUBY-6067 (Windows YAML issue).

If there is some issue we don't have targeted but you think is drop-dead important, then please let us know. We are willing to expedite other issues if presented with a reasonable case for why they should be fixed. Please join the discussion.


more »