Routing with Ruby & ZeroMQ Devices
Created at: 17.11.2010 20:18, source: igvita.com, tagged: Architecture rabbitmq zeromq
ZeroMQ sockets provide message-oriented messaging, support for multiple transports, transparent setup and teardown, and an entire array of routing patterns via different socket types - see the quick refresher on ZeroMQ. The combination of these features, as well as the fact that we can bind or connect a single socket to multiple endpoints, is what makes ZeroMQ "topology aware". We can encode and construct varying routing strategies at will, without the requirement for external routers or message brokers.
Building a multi-threaded application? With ZeroMQ you can assemble a high-performance, in-process pub-sub fanout, a worker pipeline, or a custom workflow combining the two with just a few lines of code. In fact, if you work with ZeroMQ you will quickly find yourself reusing similar sets of patterns. That's where ZeroMQ devices come in. The library ships with support for a few common routing patterns (queue, streamer, and forwarder), and also defines a ZDCF JSON specification to simplify their assembly.
Routing Devices: Queue, Streamer, Forwarder
If RabbitMQ is a specific message-broker implementation that codifies a set of allowed messaging patterns, then ZeroMQ is the low-level toolkit that allows us to assemble these patterns at will. This flexibility eliminates the need for explicit message routers within our infrastructure, allowing us to assemble dynamic, P2P routing topologies. However, this does not mean that a standalone message broker has no place within our infrastructure. In practice you want the flexibility of both: sometimes you want to run the router within the same runtime, and at other times you may want to run it as a standalone process.
ZeroMQ ships with several devices out of the box which implement the most common routing patterns. ZMQ Queue is a simple forwarding device for request-reply messaging: it lets you aggregate requests from many requesters and distribute them to one or more reply sockets, acting as a chokepoint. ZMQ Streamer serves a similar function, but is designed to handle the pipeline pattern instead. Last but not least, ZMQ Forwarder allows you to aggregate data from multiple publishers and distribute these messages via a fanout to all the connected subscribers.
Getting started with ZMQ Devices
The above patterns are simple but powerful, and best of all you can run them in memory, or as standalone processes! When you build and install ZeroMQ, you will find three ZMQ binaries on your system: zmq_forwarder, zmq_queue, and zmq_streamer. Simply pass them an XML file outlining your socket configuration, and you are ready to go.
Where would we use a standalone broker? Imagine a pub-sub scenario with multiple data centers. Each remote subscriber could connect directly to the publisher, but that would also mean a lot of expensive network hops and many open connections. Instead, we can run a pub-sub forwarding device (ZMQ Forwarder) in each data center, which would connect to the remote publisher, stream the messages to the local data center and make them available to the local subscribers!
Assembling Custom ZMQ Devices
While the devices shipped with ZeroMQ cover some of the most common use cases, the real power of the framework is in its flexibility to allow us to assemble arbitrary messaging patterns: mixing different socket types, defining custom routing strategies based on request metadata, and so on. This is where ZeroMQ’s ZDCF configuration syntax and a Ruby DSL can do wonders:
    d = ZMQ::Device::Builder.new(
      context: { iothreads: 2 },
      main: {
        type: :queue,
        frontend: { type: :SUB, option: { swap: 25 }, connect: ["tcp://127.0.0.1:5555"] },
        backend: { type: :PUB, bind: ["tcp://127.0.0.1:5556"] }
      }
    )

    d.main.start do
      loop do
        msg = ZMQ::Message.new
        frontend.recv msg
        backend.send msg
      end
    end
In the example above, we used ZDCF’s JSON configuration syntax to open all the socket connections, and then defined a simple relay routing strategy in our start block. Of course, you could also implement much more interesting patterns: open more than two sockets, examine the contents of the message, relay the message to select endpoints, and so on! Grab the DSL, and give it a try.
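To make the "examine the contents of the message, relay the message to select endpoints" idea concrete without depending on the DSL, here is a minimal sketch in plain Ruby. It is a toy model, not the ZMQ::Device API: standard-library Queue objects stand in for sockets, and the names frontend, backends, and the "topic is the first word" convention are assumptions made for illustration.

```ruby
# Toy relay "device": queues stand in for ZeroMQ sockets (assumption, not the real DSL).
frontend = Queue.new
backends = { 'error' => Queue.new, 'info' => Queue.new }

device = Thread.new do
  # Relay until the frontend is closed and drained; pop then returns nil.
  while (msg = frontend.pop)
    topic = msg.split(' ', 2).first   # hypothetical convention: first word is the topic
    target = backends[topic]
    target << msg if target           # messages with no matching backend are dropped
  end
end

['error disk full', 'info started', 'debug noise', 'error timeout'].each do |m|
  frontend << m
end
frontend.close
device.join

# Drain each backend into an array so the routing result is easy to inspect.
errors = Array.new(backends['error'].size) { backends['error'].pop }
infos  = Array.new(backends['info'].size)  { backends['info'].pop }
```

With real sockets the loop body would stay much the same: receive from the frontend, inspect the message, and send it to whichever backend the routing rule selects.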
RabbitMQ and ZeroMQ work together
It is easy to mistake RabbitMQ and ZeroMQ for competitors. In reality, their features are complementary, and the two can work together in very helpful ways! For example, by installing the rmq-0mq plugin into your RabbitMQ instance, you can effectively turn RabbitMQ into a ZeroMQ device!
RabbitMQ has proven to be an incredibly stable, high-performance message broker with support for persistence and a variety of routing topologies. So, if you are looking for a ZeroMQ device capable of handling thousands of connections, with support for persistence, and a number of other features, then the combination of ZeroMQ and RabbitMQ is definitely worth a close look.
ZeroMQ: Modern & Fast Networking Stack
Created at: 03.09.2010 20:13, source: igvita.com, tagged: Architecture sockets zeromq
Berkeley Sockets (BSD) are the de facto API for all network communication. With roots in the early 1980s, it is the original implementation of the TCP/IP suite, and arguably one of the most widely supported and critical components of any operating system today. The BSD sockets that most of us are familiar with are peer-to-peer connections, which require explicit setup, teardown, choice of transport (TCP, UDP), error handling, and so on. And once you solve all of the above, you are into the world of application protocols (ex: HTTP), which require additional framing, buffering and processing logic. In other words, it is no wonder that a high-performance network application is anything but trivial to write.
Wouldn't it be nice if we could abstract some of the low-level details of different socket types, connection handling, framing, or even routing? This is exactly where the ZeroMQ (ØMQ/ZMQ) networking library comes in: "it gives you sockets that carry whole messages across various transports like inproc, IPC, TCP, and multicast; you can connect sockets N-to-N with patterns like fanout, pubsub, task distribution, and request-reply". That's a lot of buzzwords, so let's dissect some of these concepts in more detail.
Message-Oriented vs. Streams & Datagrams
ZeroMQ sockets provide a layer of abstraction on top of the traditional socket API, which allows it to hide much of the everyday boilerplate complexity we are forced to repeat in our applications. To begin, instead of being stream (TCP), or datagram (UDP) oriented, ZeroMQ communication is message-oriented. This means that if a client socket sends a 150 KB message, then the server socket will receive a complete, identical message on the other end without having to implement any explicit buffering or framing. Of course, we could still implement a streaming interface, but doing so would require an explicit application-level protocol.
    # create zeromq request / reply socket pair
    ctx = ZMQ::Context.new
    req = ctx.socket ZMQ::REQ
    rep = ctx.socket ZMQ::REP

    # connect sockets: notice that reply can connect first even with no server!
    rep.connect('tcp://127.0.0.1:5555')
    req.bind('tcp://127.0.0.1:5555')

    req.send ZMQ::Message.new('hello' * (1024*1024))

    msg = ZMQ::Message.new
    rep.recv(msg)

    msg.copy_out_string.size # => 5242880
Switching from a streaming/datagram to a message-oriented model is seemingly a minor change, but one that carries a lot of implications. Because ZeroMQ will handle all of the buffering and framing for you, the client and server applications become an order of magnitude simpler, more secure, and much easier to write.
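For contrast, here is a rough sketch of the framing and buffering work a plain stream socket leaves to the application. It uses only Ruby's standard library. The helpers send_frame, read_exactly, and recv_frame are hypothetical names, and the 4-byte length prefix is an assumption chosen for illustration; it is not ZeroMQ's wire format.

```ruby
require 'socket'

# Hypothetical framing: a 4-byte big-endian length prefix, then the payload.
def send_frame(io, payload)
  io.write([payload.bytesize].pack('N') + payload)
end

# A stream may hand back fewer bytes than requested, so keep reading until
# exactly n bytes have been buffered.
def read_exactly(io, n)
  buf = String.new
  buf << io.readpartial(n - buf.bytesize) while buf.bytesize < n
  buf
end

def recv_frame(io)
  size = read_exactly(io, 4).unpack1('N')
  read_exactly(io, size)
end

a, b = UNIXSocket.pair # a connected pair of stream sockets

# Write from a separate thread: a large write can block until the peer reads.
writer = Thread.new do
  send_frame(a, 'hello' * 30_000) # roughly 150 KB
  send_frame(a, 'world')
end

first = recv_frame(b)
second = recv_frame(b)
writer.join
```

Both frames arrive whole and in order, but only because the length prefix and the read loop were written by hand; a message-oriented socket does this bookkeeping for you.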
Transport Agnostic Sockets
ZeroMQ sockets are also transport agnostic: there is a single, unified API for sending and receiving messages across all protocols. By default, there is support for in-process, IPC, multicast, and TCP, and switching between all of them is as simple as changing the prefix on your connection string. This means we can start with IPC for fast local communication, and then switch to TCP at any point for distributed cases with minimal effort. As an added benefit, ZeroMQ handles all connection setup, teardown, and reconnect logic under the hood. That's about as simple as it gets.
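As a small illustration of "changing the prefix on your connection string", the sketch below separates the transport from the address in a few endpoint strings. It is plain Ruby and touches no sockets. split_endpoint and the ENDPOINTS table are hypothetical, and the ipc path is made up for the example.

```ruby
# Hypothetical helper: split an endpoint string into [transport, address].
def split_endpoint(endpoint)
  transport, address = endpoint.split('://', 2)
  raise ArgumentError, "not a transport://address endpoint: #{endpoint}" if address.nil?
  [transport, address]
end

# Hypothetical deployment options; only the endpoint string differs between them.
ENDPOINTS = {
  threads:   'inproc://some.pipe',
  local:     'ipc:///tmp/feed.sock',
  networked: 'tcp://127.0.0.1:5555'
}.freeze

transports = ENDPOINTS.transform_values { |endpoint| split_endpoint(endpoint).first }
```

The idea is that the bind or connect call stays the same in each case, and only the string passed to it changes.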
Routing & Topology Aware Sockets
ZeroMQ sockets are routing and network topology aware. Since we don't have to explicitly manage the peer-to-peer connection state - all of that is abstracted by the library, as we saw above - nothing stops a single ZeroMQ socket from binding to two distinct ports to listen for inbound requests, or in reverse, sending data to two distinct sockets via a single API call. How does ZeroMQ know who to listen to or push data to? That depends on the type of the socket pair we pick for our application: Request/Reply, Publish/Subscribe, Pipeline, and Pair (alpha).
    ctx = ZMQ::Context.new

    # create publisher socket, and publish to two pipes!
    pub = ctx.socket(ZMQ::PUB)
    pub.bind('tcp://127.0.0.1:5000')
    pub.bind('inproc://some.pipe')

    # generate random message, ex: '1 9'
    Thread.new { loop { pub.send [rand(2), rand(10)].join(' ') } }

    # create a consumer, and listen for messages whose key is '1'
    sub = ctx.socket(ZMQ::SUB)
    sub.connect('inproc://some.pipe')
    sub.setsockopt(ZMQ::SUBSCRIBE, '1')

    loop { p sub.recv } # => "1 9" ...
In the case of a Publish/Subscribe socket pair (unidirectional communication from publisher to subscribers), the publisher socket will replicate the message to all connected clients (local IPC clients, remote TCP listeners, etc). In the case of a Request/Reply socket pair (bi-directional communication: server, client), the requesting socket will automatically load-balance each request to one of its connected peers. Finally, a Push/Pull socket pair (pipeline: unidirectional, load-balanced) will allow you to simulate a staged message passing architecture with built-in load balancing.
ZeroMQ allows us to encode the topology of our services directly via the socket API, without having to define and maintain a separate coordination layer of routers, load balancers, and message brokers. Of course, nothing stops us from using any of these tools in combination with ZeroMQ, but in many cases, the ZeroMQ route can yield better performance and much lower operational complexity.
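The difference between the two delivery strategies can be sketched with a toy, in-process model. This is plain Ruby, not the ZeroMQ API: ToyPublisher and ToyPusher are hypothetical classes, and arrays stand in for the receiving sockets. The model copies two behaviours of the real sockets: subscriptions match on a message prefix, and load-balanced sends rotate round-robin through the connected peers.

```ruby
# Toy model of pub-sub fanout: every matching subscriber gets a copy.
class ToyPublisher
  def initialize
    @subscribers = []
  end

  # Register an inbox together with the topic prefix it is interested in.
  def subscribe(prefix, inbox)
    @subscribers << [prefix, inbox]
  end

  def publish(message)
    @subscribers.each do |prefix, inbox|
      inbox << message if message.start_with?(prefix)
    end
  end
end

# Toy model of load balancing: each message goes to exactly one worker.
class ToyPusher
  def initialize(workers)
    @workers = workers
    @next = 0
  end

  def push(message)
    @workers[@next] << message
    @next = (@next + 1) % @workers.size # round-robin
  end
end

ones = []
everything = []
pub = ToyPublisher.new
pub.subscribe('1', ones)      # only messages whose key is '1'
pub.subscribe('', everything) # the empty prefix matches every message
['1 9', '0 4', '1 2'].each { |m| pub.publish(m) }

worker_a = []
worker_b = []
pusher = ToyPusher.new([worker_a, worker_b])
%w[a b c d].each { |m| pusher.push(m) }
```

Publishing replicates: a message lands in every inbox whose prefix matches. Pushing partitions: the four messages are split between the two workers, and none is delivered twice.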
ZeroMQ under the hood
By default, all communication in ZeroMQ is done in asynchronous fashion. To enable this, anytime you create an application with ZeroMQ, you will have to explicitly declare the number of background I/O threads - in most cases, a single dedicated I/O thread will suffice. All of the thread logic is handled by the C++ core of the library itself, but it does mean that at the very minimum, your application will have two scheduled threads.
This asynchronous processing model allows ZeroMQ to abstract all connection setup, teardown, reconnect logic, and also to minimize message delivery latency: no blocking means the messages can be dispatched, delivered and queued (sender or receiver side) in parallel to the regular processing done by your application. Of course, you can also control the queuing behavior of ZeroMQ sockets by setting an allowed memory bound and even a swap size for each socket. Hence, you can simulate the blocking API if desired, but asynchronous I/O is the default. Combined with zero-copy semantics, optimized framing, and lock-free data structures, the end result is a high-performance, throughput-oriented messaging middleware with a modern API.
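The interplay between a bounded queue and a background I/O thread can be sketched loosely with Ruby's standard library. This is an analogy, not how ZeroMQ is implemented: a SizedQueue stands in for a socket's bounded in-memory queue, a plain Thread stands in for the I/O thread, and try_send plus the bound of two messages are assumptions made for illustration.

```ruby
# Hypothetical non-blocking send: returns false instead of waiting when the queue is full.
def try_send(queue, msg)
  queue.push(msg, true) # second argument true means "do not block"
  true
rescue ThreadError
  false
end

outbox = SizedQueue.new(2) # assumed bound: at most two queued messages

# No consumer is running yet, so the third message does not fit.
results = %w[m1 m2 m3].map { |m| try_send(outbox, m) }

delivered = []
io_thread = Thread.new do
  # Stand-in for the background I/O thread: drain until closed and empty.
  while (msg = outbox.pop)
    delivered << msg
  end
end

outbox.push('m4') # blocking send: waits until the I/O thread frees a slot
outbox.close
io_thread.join
```

The application thread only waits when the bound is reached and it explicitly chooses a blocking send. Otherwise delivery proceeds in the background, which is the trade-off the paragraph above describes.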
ZeroMQ in the Wild: Mongrel2
Mongrel2 offers an interesting case study of applying ZeroMQ to the world of web servers: all inbound requests are routed by Mongrel2 via a "Push" socket which automatically load-balances the requests to connected handlers. The handlers, in turn, process the incoming requests (via a Pull socket) and publish the responses to a "Pub" socket, to which the Mongrel2 server itself is subscribed, listening for its process ID (via a topic filter).
Hence, the processing is not tied to the simple request-response cycle we are commonly used to, where a single backend has to handle the full request start to finish. Instead, we can set up several processing stages (via the pipeline pattern), and emit our reply only after it is processed by all stages.
Ambitious and worth exploring
Needless to say, ZeroMQ is an ambitious project, and this short introduction only scratches the surface of the full feature set. The stated goal of ZeroMQ is to "become part of the standard networking stack, and then the Linux kernel". Whether they will succeed remains to be seen, but it is definitely a very promising and arguably much-needed layer of abstraction on top of the "traditional" BSD sockets. ZeroMQ makes writing high-performance networking applications incredibly easy and fun.
The best way to get started with ZeroMQ is to work through some hands-on examples - the concepts are not new, but the ease with which you can compose them takes some getting used to. For Rubyists, Andrew Cholakian has put together a great set of examples to get you started (check out dripdrop as well), and for everyone else, head to the ZeroMQ site, grab your language bindings and dive into the code.
