Consuming XMPP PubSub in Ruby

Created at: 10.11.2009 19:22, source: igvita.com, tagged: Architecture pubsub realtime xmpp

XMPP is a versatile protocol with several hundred proposed and working extensions, and it has proven itself in production (e.g., Google Talk). Presence, roster management, federated and server-to-server (S2S) messaging are all examples of features that you get for free, which make it a very appealing platform for messaging applications. Combine it with extensions such as XEP-0060 (PubSub), and we have all the relevant buzzwords: pubsub, real-time, federated, and presence.

The PubSub specification within XMPP, as defined in XEP-0060, is definitely not as flexible as that of AMQP, but it is often enough to cover the most popular use cases. However, technical merits aside, one of the key missing components, especially in Ruby, has been a set of functioning libraries - xmpp4r claims to support PubSub, but examples are lacking. Thankfully, after test driving the latest batch of gems, it looks like we're finally there.

Getting off the ground with XMPP

Without a good toolkit, XMPP can be a gnarly protocol to get started with - the Pidgin IM client has some great tools for spying on the exchange, but watching pages of XML scroll by can only get you so far. Thankfully, Seth Fitzsimmons has built switchboard ("curl for XMPP"), a powerful command line tool that greatly simplifies the process. Make sure to read the full tutorial, or jump right into it by testing it with the WordPress XMPP stream:

# list available options, subscribe to a blog, list subscriptions and then open the stream
switchboard disco --target pubsub.im.wordpress.com info
switchboard pubsub --server pubsub.im.wordpress.com --node /blog/icanhazcheesburger.com subscribe
switchboard pubsub --server pubsub.im.wordpress.com subscriptions
switchboard pubsub --server pubsub.im.wordpress.com listen
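
To make the wire format less opaque, here is a minimal sketch of the kind of stanza a XEP-0060 subscribe request consists of, built with Ruby's bundled REXML. The `subscribe_stanza` helper and the `sub1` id are illustrative names, not part of switchboard, and the exact stanza that switchboard emits may differ in its details:

```ruby
require 'rexml/document'

PUBSUB_NS = 'http://jabber.org/protocol/pubsub'

# Build a XEP-0060 subscribe request: an <iq type='set'/> that wraps
# <pubsub><subscribe node='...' jid='...'/></pubsub>.
# (Hypothetical helper for illustration; not a switchboard API.)
def subscribe_stanza(jid, server, node, id = 'sub1')
  iq = REXML::Element.new('iq')
  iq.add_attributes('type' => 'set', 'from' => jid, 'to' => server, 'id' => id)

  pubsub = iq.add_element('pubsub', 'xmlns' => PUBSUB_NS)
  pubsub.add_element('subscribe', 'node' => node, 'jid' => jid)
  iq
end

stanza = subscribe_stanza('user@im.wordpress.com',
                          'pubsub.im.wordpress.com',
                          '/blog/icanhazcheesburger.com')
puts stanza
```

Seeing the stanza spelled out makes it easier to recognize the same exchange when it scrolls by in an XML console.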

Based on xmpp4r, switchboard is also a toolkit for assembling your own XMPP clients, which means that it can be easily customized to power a PubSub consumer. From start to finish, and since examples are still hard to come by:

> switchboard-pubsub.rb

require 'rubygems'
require 'switchboard'
 
class WordpressJack
  def self.connect(switchboard, settings)
    switchboard.plug!(PubSubJack)
    switchboard.hook(:post)
 
    switchboard.on_pubsub_event do |event|
      event.payload.each do |payload|
        payload.elements.each do |item|
          on(:post, item)
        end
      end
    end
  end
end
 
settings = Switchboard::Settings.new
settings['pubsub.server'] = 'pubsub.im.wordpress.com'
settings['jid'] = 'user@im.wordpress.com'
settings['password'] = 'password'
 
switchboard = Switchboard::Client.new(settings)
switchboard.plug!(WordpressJack)
 
switchboard.on_post do |post|
  puts "A new post was received:"
  puts post.methods.sort.uniq
  exit
end
 
switchboard.run!
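
If you want to see what is inside a notification without going through switchboard, the following sketch parses a XEP-0060 event message with REXML and pulls out the post titles. The sample stanza is hypothetical: it assumes an Atom entry as the item payload, and the addresses, node name, and `post_titles` helper are made up for illustration. Real payloads depend on the publisher.

```ruby
require 'rexml/document'

EVENT_NS = 'http://jabber.org/protocol/pubsub#event'
ATOM_NS  = 'http://www.w3.org/2005/Atom'

# Hypothetical notification; the Atom payload is an assumption.
SAMPLE_EVENT = <<~XML
  <message from='pubsub.example.org' to='user@example.org'>
    <event xmlns='http://jabber.org/protocol/pubsub#event'>
      <items node='/blog/example'>
        <item id='1'>
          <entry xmlns='http://www.w3.org/2005/Atom'>
            <title>Hello PubSub</title>
          </entry>
        </item>
      </items>
    </event>
  </message>
XML

# Return the title of every Atom entry carried in the event's items.
def post_titles(xml)
  doc = REXML::Document.new(xml)
  namespaces = { 'ps' => EVENT_NS, 'atom' => ATOM_NS }
  REXML::XPath.match(doc, '//ps:item/atom:entry/atom:title', namespaces).map(&:text)
end

puts post_titles(SAMPLE_EVENT)
```

The explicit namespace map matters here: the item and the entry live in different XML namespaces, so an unqualified XPath is easy to get wrong.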
 

XMPP with EventMachine and Nokogiri

If you have an EventMachine stack, or are looking for a high performance library, Jeff Smick's blather is definitely a gem to investigate. The combination of the asynchronous nature of EventMachine, the SAX parser within Nokogiri, and a great DSL makes it very fast and a pleasure to work with:

> blather-pubsub.rb

require 'rubygems'
require 'blather/client/client'
require 'blather/client/dsl/pubsub'
require 'blather'
 
EventMachine.run {
  host = 'pubsub.im.wordpress.com'
  node = 'blog/icanhazcheesburger.com'
  user = 'user@im.wordpress.com'
  pass = 'pass'
 
  jid = Blather::JID.new(user)
  client = Blather::Client.setup(jid, pass)
  client.register_handler(:ready) {
    puts "Connected. Send messages to #{client.jid.inspect}."
    pub = Blather::DSL::PubSub.new(client, host)
  }
 
  client.register_handler(:pubsub_event) { |event|
    puts event
  }
 
  client.connect
}

PubSub & Event-Driven Architecture

Having personally struggled in the past with XMPP PubSub and Ruby, it's been great to revisit the use case and find a new set of fully functional libraries. Event-driven architectures, enabled by technologies such as XMPP, AMQP, Comet, Webhooks and PubSubHubbub, are increasingly becoming a staple of many web applications, and for good reason. If you haven't already, grab switchboard or blather and take XMPP for a test drive.



Advanced Messaging & Routing with AMQP

Created at: 08.10.2009 19:34, source: igvita.com, tagged: Architecture ruby amqp pubsub

Not all message queues are made equal. In the simplest case, a message queue is synonymous with an asynchronous protocol in which the sender and the receiver do not operate on the message at the same time. However, while this pattern is most commonly used to decouple distinct services (an intermediate mailbox, of sorts), the more advanced implementations also enable a host of more advanced recipes: load balancing, queueing, failover, pubsub, etc. AMQP can do all of the above, and yesterday's announcement of RabbitMQ 1.7 (an open source AMQP broker) warrants a closer look.

Originally developed at JP Morgan as a vendor neutral wire and broker protocol, AMQP (Advanced Message Queuing Protocol) is, in fact, a general purpose messaging bus. The protocol itself is still under active development, but there are a variety of open source client and server implementations for it, as well as some big commercial supporters (Red Hat, Microsoft, etc). In other words, it works, it is production ready, and I can vouch for it from personal experience - we stream tens of millions of messages through AMQP at PostRank on a daily basis.

AMQP vs XMPP: Features & Architecture

The AMQP vs XMPP debate has been raging for years now. On the surface they look nearly identical, but in reality there are a number of important distinctions. For example, presence is one of the central components of XMPP, but it is not part of the AMQP specification. XMPP uses XML, whereas AMQP has a binary protocol. AMQP has native support for a number of delivery use cases (at least once, exactly once, select subscribers, persistence, etc) and also a variety of exchange implementations which allow fine-grained control over where and how the messages are routed.

The AMQP spec is a fast and recommended read, but by way of a quick introduction, the core architectural components are: publisher, exchange, queue, and consumer. As you may have guessed, the publisher is the data producer which pushes messages to an exchange. Why is it called an exchange? Because the exchange is a routing engine which is responsible for delivering the messages to the right queues (exchanges never store messages). For example, a message may need to be routed to just a single queue (direct exchange), maybe the message should be forwarded to every queue in the list, pubsub style (fanout exchange), or perhaps the message should be routed based on a key (topic exchange).
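
The three routing behaviours are easier to see in a toy model. The sketch below is plain Ruby with no broker involved: `ToyExchange` is a made-up class, queues are just arrays, and the topic matcher assumes the wildcard convention from the AMQP spec (`*` matches exactly one dot-separated word, `#` matches zero or more). It illustrates the routing rules only, not how any real broker is implemented.

```ruby
# Toy in-memory model of AMQP routing; not a real broker or client API.
class ToyExchange
  def initialize(type)
    @type = type      # :direct, :fanout or :topic
    @bindings = []    # [binding_key, queue] pairs
  end

  def bind(queue, key = '')
    @bindings << [key, queue]
    self
  end

  # The exchange stores nothing: it only copies the message into matching queues.
  def publish(message, routing_key = '')
    @bindings.each do |key, queue|
      queue << message if match?(key, routing_key)
    end
  end

  private

  def match?(binding_key, routing_key)
    case @type
    when :fanout then true
    when :direct then binding_key == routing_key
    when :topic  then topic_match?(binding_key.split('.'), routing_key.split('.'))
    end
  end

  # '*' matches exactly one word, '#' matches zero or more words.
  def topic_match?(pattern, words)
    return words.empty? if pattern.empty?
    head, *rest = pattern
    case head
    when '#' then (0..words.size).any? { |i| topic_match?(rest, words[i..-1]) }
    when '*' then !words.empty? && topic_match?(rest, words[1..-1])
    else words.first == head && topic_match?(rest, words[1..-1])
    end
  end
end

orders, audit_a, audit_b, eu_news = [], [], [], []
direct = ToyExchange.new(:direct).bind(orders, 'orders')
fanout = ToyExchange.new(:fanout).bind(audit_a).bind(audit_b)
topic  = ToyExchange.new(:topic).bind(eu_news, 'news.eu.#')

direct.publish('order 1', 'orders')
direct.publish('ignored', 'refunds')       # no binding for this key
fanout.publish('hello')                    # copied to every bound queue
topic.publish('rates', 'news.eu.finance')
topic.publish('scores', 'news.us.sports')  # does not match news.eu.#

p orders   # → ["order 1"]
p audit_a  # → ["hello"]
p audit_b  # → ["hello"]
p eu_news  # → ["rates"]
```

Note that a message published with a key nobody is bound to simply disappears, which is consistent with the exchange never storing messages.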

Publishing & Consuming AMQP in Ruby

The type of exchange, message parameters, and the name of the attached queue can all contribute to the delivery and routing behavior of the message. However, for the sake of example, let's create a simple pubsub fanout exchange in Ruby:

> amqp-publisher.rb

require 'amqp'
 
AMQP.start(:host => 'localhost') do
  # create a fanout exchange on the broker
  exchange = MQ.new.fanout('multicast')
 
  # publish multiple messages to fanout
  exchange.publish('hello')
  exchange.publish('world')
end
 

In order to consume the messages from an exchange, the consumer needs to create a queue and then bind it to an exchange. A queue can be durable (it survives server restarts), or auto-deletable for cases when the queue should disappear if the consumer goes down. Best of all, once the queue is bound to an exchange, the messages are streamed to the client in real-time via a persistent connection (no polling!):

> amqp-consumer.rb

require 'amqp'
 
AMQP.start(:host => 'localhost') do
  amq = MQ.new
 
  # bind 'listener' queue to 'multicast' exchange
  amq.queue('listener').bind(amq.fanout('multicast')).subscribe do |msg|
    puts msg # process your message here
  end
end
 

Advanced AMQP Recipes

The flexibility of the message and the exchange model is what makes AMQP such a powerful tool. Whenever a publisher generates a message, it can mark the message as 'persistent', which will guarantee delivery through the broker - if there is an attached queue, it will accumulate messages until the consumer requests them. However, if you're streaming transient data (access logs, for example), you can also disable message persistence and not worry about overwhelming your broker. That's how you achieve 'exactly-once' vs 'at least once' semantics.

Trying to build a pubsub hub? Create a fanout exchange and attach as many queues as you want; each consumer will receive a copy of the message. Load balancing? Bind two workers to the same queue and the broker will automatically round-robin the messages (there is no upper limit on the number of workers). Failover? By default the AMQP broker does not require a message to be ACKed by a consumer, but with a simple configuration flag the messages will be kept on the server until the ACK is received. If the consumer goes down without ACKing a message, the message will be automatically put back on the queue. Need to route a message based on a key? A topic exchange allows partial matching based on a message key that is set by the producer. Do you want to notify the producer if there are no subscribers attached to a queue? Set the immediate flag on the message and the broker will do all the work.
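
The load balancing and failover recipes can be sketched together in a few lines of plain Ruby. This is a toy model, not a broker: `ToyWorkQueue`, its method names, and the worker names are all hypothetical, and it assumes ACKs are always required (the non-default mode described above).

```ruby
# Toy work queue: round-robin dispatch with requeue when an ACK never arrives.
class ToyWorkQueue
  attr_reader :pending

  def initialize
    @pending = []     # messages waiting for delivery
    @unacked = {}     # delivery tag => [message, consumer name]
    @consumers = []
    @next_tag = 0
  end

  def subscribe(name)
    @consumers << name
  end

  def publish(message)
    @pending << message
  end

  # Hand each pending message to the next consumer in rotation.
  # Returns [tag, consumer, message] triples.
  def dispatch
    deliveries = []
    until @pending.empty? || @consumers.empty?
      consumer = @consumers.first
      @consumers.rotate!
      tag = (@next_tag += 1)
      message = @pending.shift
      @unacked[tag] = [message, consumer]
      deliveries << [tag, consumer, message]
    end
    deliveries
  end

  def ack(tag)
    @unacked.delete(tag)
  end

  # A consumer went away: put its unACKed messages back at the front of the queue.
  def disconnect(name)
    @consumers.delete(name)
    lost = @unacked.select { |_tag, (_msg, consumer)| consumer == name }
    lost.each_key { |tag| @unacked.delete(tag) }
    @pending.unshift(*lost.values.map(&:first))
  end
end

queue = ToyWorkQueue.new
queue.subscribe('worker-1')
queue.subscribe('worker-2')
%w[a b c d].each { |m| queue.publish(m) }

deliveries = queue.dispatch
deliveries.each { |tag, consumer, msg| puts "#{consumer} got #{msg} (tag #{tag})" }

# worker-1 ACKs everything it received; worker-2 crashes before ACKing.
deliveries.each { |tag, consumer, _msg| queue.ack(tag) if consumer == 'worker-1' }
queue.disconnect('worker-2')
p queue.pending  # → ["b", "d"]
```

Calling `dispatch` again would hand the requeued messages to the surviving worker, which is the failover behaviour in miniature.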

Best of all, you can also compose these patterns to cover virtually any delivery use case!

AMQP Brokers & Ruby / Rails

There are a variety of available broker implementations: ZeroMQ, ActiveMQ, OpenAMQ, and RabbitMQ. Because the underlying protocol is still in flux, there is definitely some variation between the implementations - do your homework. If you're looking for a speed demon, ZeroMQ claims a 15.4 microsecond routing overhead (4+ million msgs/s). However, RabbitMQ is arguably the most stable and feature complete broker implementation. If you are a CentOS or a Fedora user, you'll be happy to know that it is now part of the distro (yum install rabbitmq-server); otherwise, follow the installation instructions.

Once the server is installed, follow the administration guide to start the broker. If you're looking for a RESTful or a GUI tool to help you configure the broker, drop in Alice on the same server. Like the SQL prompt? Install the BQL plugin and familiarize yourself with the syntax.

The AMQP gem is probably the best choice when it comes to Ruby clients - it is asynchronous, it is fast, and it is in use by dozens of companies. If you're looking for a synchronous client, the Carrot gem is the answer. If you're using the async-observer plugin in your Rails projects, you can drop in async-observer-amqp to migrate to AMQP. In other words, AMQP is easy to get started with, it is incredibly powerful, and it has great library support for virtually every language. Give it a try.

