Slides from MWRC 2010 »
Created at: 12.03.2010 22:07, source: time to bleed by Joe Damato, tagged: linux ruby scaling systems x86 garbage collection GC performance x86_64
Download as PDF (40mb)
EventMachine: scalable non-blocking i/o in ruby
Download as PDF (3mb)
Descent into Darkness: Understanding your system’s binary interface is the only way out.
Install Communigate Pro on Ubuntu Hardy Heron »
Created at: 27.01.2010 23:01, source: Hackido, tagged: linux ubuntu
This is another blog entry from my old site that I'm going to keep around just in case. Will probably write this one up again once a new LTS is out.
I love free software. But, there is something to be said for paid support. When it comes to email, sometimes a robust mail server is just what the doctor ordered. Communigate Pro has been doing the job for large scale ISPs for a while now so back when one of my clients needed a solution I didn't hesitate to recommend it. I've no regrets except for one: the initial install was on a CentOS machine. Nothing against CentOS or Red Hat, but I prefer Debian based distros. And while the CG Pro mail server install instructions exist for Windows, MacOS X, FreeBSD, and many Linux distros, Ubuntu is currently not one of them. Fortunately it turns out it's not that bad to get it going. Here's what you need to do.
Step 1 - Update and upgrade: I bet you didn't see that coming?
sudo apt-get update
sudo apt-get dist-upgrade
Step 2 - Grab needed tools and use 'em: In our case we'll only need alien, an application that converts RPM files into DEB files.
sudo apt-get install alien
sudo alien CGatePro-Linux-5.0-14.x86_64.rpm
As you can see in our example above, I'm assuming that you've downloaded the RPM from Communigate. In my case I'm running version 5.0-14. Your version may vary.
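Step 3 - Install: Installing a deb file is very easy. Just type this in, replacing the name of the deb with what you generated in Step 2. (By default alien adds one to the package's release number, which is why the file below says 5.0-15 rather than 5.0-14.)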
sudo dpkg -i cgatepro-linux_5.0-15_amd64.deb
Step 4 - Modify files: At this point CG Pro is installed and it will work. But when you first launch it you'll get a bunch of errors like ulimit: 43: Illegal option -u and librt.so.1: cannot open shared object file: No such file or directory. These aren't deal breakers but we should fix them anyway. To do that, pop open your favorite text editor and modify the /etc/init.d/Communigate file. Here are the changes you want to make:
- Change the first line from #!/bin/sh to #!/bin/bash
- Change the assumed kernel line to use 2.6.16
- Change all instances of /var/lock/subsys/Communigate to /var/lock/Communigate
Step 5 - Launch and default run on boot: With that done you'll need to start Communigate, make sure there are no errors, and then set it to start on boot in case your server needs to be restarted.
sudo /etc/init.d/Communigate start
sudo update-rc.d Communigate defaults
When you restart your machine, just check that Communigate Pro is running using ps aux | grep CGServer. Hopefully you'll see all the spawned daemons. There are of course plenty of authorized CG resellers, and Stalker themselves give great support, so if something didn't go according to plan you should contact somebody there. Of course my employer could also provide paid support if you need it. :-)
Double Shot #634 »
Created at: 26.01.2010 12:32, source: A Fresh Cup, tagged: Double Shot firefox jquery linux rails
Yesterday's catches in the link sea...
- VPS Performance Comparison - Slicehost, Linode, Prgmr, Rackspace, Amazon. Linode is the clear winner in this one.
- Linux performance basics - I really need to learn more of this stuff.
- Firefox 3.6 Tips and Tweak - Tab previews, change the open link behavior, a few more.
- jWizard - After looking at a few jQuery wizard frameworks we settled on this one. See also documentation and demos.
- Introducing the Dirty Associations Plugin - Interesting idea, not so sure about the name.
String together global offset tables to build a Ruby memory profiler »
Created at: 25.01.2010 14:59, source: time to bleed by Joe Damato, tagged: debugging linux ruby systems x86 debug memory profiling x86_64

If you enjoy this article, subscribe (via RSS or e-mail) and follow me on twitter.
Disclaimer
The tricks, techniques, and ugly hacks in this article are PLATFORM SPECIFIC, DANGEROUS, and NOT PORTABLE.
This is the third article in a series describing a set of low level hacks that I used to create memprof, a Ruby level memory profiler. You should be able to survive without reading the other articles in this series, but you can check them out here and here.
How is this different from the other hooking articles/techniques?
The previous articles explained how to insert trampolines in the .text segment of a binary. This article explains a cool technique for hooking functions in the .text segment of shared libraries, allowing your handler to run, and then resuming execution. Hooking shared libraries turns out to be less work than hooking the binary (in the case of Ruby, that is), but making it all happen was a bit tricky. Read on to learn more.
The “problem” with shared libraries
The problem is that if a trampoline is inserted into the code of the shared library, the trampoline will need to invoke the dynamic linker to resolve the function that is being hooked, call the function, do whatever additional logic is desired, and then resume execution.
In other words you need to (somehow) insert a trampoline for a function that will call the function being trampolined without ending up in an infinite loop.
The additional complexity occurs because when shared libraries are loaded, the kernel decides at runtime where exactly in memory the library should be loaded. Since the exact location of symbols is not known at link time, a procedure linkage table (.plt) is created so that the program and the dynamic linker can work together to resolve symbol addresses.
I explained how .plts work in a previous article, but looking at this again is worthwhile. I’ve simplified the explanation a bit, but at a high level:
- When the program calls a function in a shared object, the link editor has made sure that the program jumps to a stub function in the .plt.
- The program sets some data up for the dynamic linker and then hands control over to it.
- The dynamic linker looks at the info set up by the program and fills in the absolute address of the called function in the global offset table (.got).
- Then the dynamic linker calls the function.
- Subsequent calls to the same function jump to the same stub in the .plt, but every time after the first call the absolute address is already in the .got (because when the dynamic linker is invoked the first time, it fills in the absolute address in the .got).
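The lazy resolution described in the list above can be mimicked with a toy model in plain Ruby. This is only a sketch of the control flow, not real ELF code: LIBRARY, LAZY_GOT, resolve, and call_via_plt are made-up names, and an empty table slot stands in for a .got entry that has not been filled in yet.

```ruby
# Toy model of lazy binding. Nothing here touches real ELF structures.
RESOLVER_CALLS = []                               # records each trip through the "dynamic linker"
LIBRARY  = { "rb_newobj" => -> { :new_object } }  # stands in for the shared library's symbols
LAZY_GOT = {}                                     # symbol name => resolved callable

# Plays the role of the dynamic linker: look the symbol up once
# and store the result in the table.
def resolve(symbol)
  RESOLVER_CALLS << symbol
  LAZY_GOT[symbol] = LIBRARY.fetch(symbol)
end

# Plays the role of the .plt stub: use the table entry if it is
# already filled in, otherwise hand control to the resolver first.
def call_via_plt(symbol)
  target = LAZY_GOT[symbol] || resolve(symbol)
  target.call
end

FIRST  = call_via_plt("rb_newobj")  # resolver runs, table gets filled in
SECOND = call_via_plt("rb_newobj")  # table entry is used directly

puts "resolver ran #{RESOLVER_CALLS.length} time(s)"
```

The second call never reaches resolve, which is the whole point of filling in the table on first use.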
Disassembling a short Ruby VM function that calls rb_newobj (a memory allocation routine that we’d like to hook), shows the calls to the .plt:
000000000001af10:
  . . . .
  1af14: e8 e7 c6 ff ff       callq 17600 [rb_newobj@plt]
  . . . .
Let’s take a look at the corresponding .plt stub:
0000000000017600:
  17600: ff 25 6a 9c 2c 00    jmpq *0x2c9c6a(%rip)    # 2e1270 [_GLOBAL_OFFSET_TABLE_+0x288]
  17606: 68 4e 00 00 00       pushq $0x4e
  1760b: e9 00 fb ff ff       jmpq 17110 <_init+0x18>
Important fact: The program and each shared library have their own .plt and .got sections (amongst other sections). Keep this in mind as it’ll be handy very shortly.
That is a lot of stub code to reproduce in the trampoline. Reproducing that stuff in the trampoline shouldn’t be hard, but invites a large number of bugs over to play. Is there a better way?
What is a global offset table (.got)?
The global offset table (.got) is a table of absolute addresses that can be filled in at runtime. In the assembly dump above, the .got entry for rb_newobj is referenced in the .plt stub code.
Intercepting a function call
It would be awesome if it were possible to overwrite the .got entry for rb_newobj and insert the address of a trampoline. But how would the intercepting function call rb_newobj itself without ending up in an infinite loop?
The important fact above comes in to save the day.
Since each shared object has its own .plt and .got sections, it is possible to overwrite the .got entry for rb_newobj in every shared object except for the object where the trampoline lives. Then, when rb_newobj is called, the .plt entry will redirect execution to the trampoline. The trampoline then calls out to its .plt entry for rb_newobj which is left untouched allowing rb_newobj to be resolved and called out to successfully.
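Here is a toy model of that idea in plain Ruby, in case it helps to see the control flow. It is a sketch under loud assumptions: the object names ("ruby", "stringio.so", "memprof.so") and the GOT_TABLES hash are purely illustrative, and each hash simply stands in for one object's private .got.

```ruby
# Toy model: every "shared object" owns a private symbol table.
CALL_LOG = []
REAL_RB_NEWOBJ = -> { CALL_LOG << :rb_newobj; :new_object }

# One table per object. The names are hypothetical.
GOT_TABLES = {
  "ruby"        => { "rb_newobj" => REAL_RB_NEWOBJ },
  "stringio.so" => { "rb_newobj" => REAL_RB_NEWOBJ },
  "memprof.so"  => { "rb_newobj" => REAL_RB_NEWOBJ }
}

# The trampoline lives in "memprof.so" and calls through that
# object's own, untouched entry, so it cannot call itself.
TRAMPOLINE = lambda do
  CALL_LOG << :trampoline
  GOT_TABLES["memprof.so"]["rb_newobj"].call
end

# Overwrite the entry everywhere except where the trampoline lives.
GOT_TABLES.each do |object_name, table|
  next if object_name == "memprof.so"
  table["rb_newobj"] = TRAMPOLINE
end

RESULT = GOT_TABLES["stringio.so"]["rb_newobj"].call
p CALL_LOG
```

A call made from any patched object lands in the trampoline first and then reaches the real function exactly once.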
Not as easy as it sounds, though
This solution is less work than the other hooking methods, but it has its own particular details as well:
- You’ll need to walk the link map at runtime to determine the base address for the shared library you are hooking (it could be anywhere).
- Next, you’ll need to parse the .rela.plt section which contains information on the location of each .plt stub, relative to the base address of the shared object.
- Once you have the address of the .plt stub, you’ll need to determine the absolute address of the .got entry by parsing the first instruction of the .plt stub (a jmp) as seen in the disassembly above.
- Finally, you can write to the .got entry the address of your trampoline, as long as the trampoline lives in a different shared library.
You’ve now successfully managed to poison the .got entry of a symbol in one shared library to direct execution to your own function which can then call the intercepted function itself without getting stuck in an infinite loop.
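To make the "parse the first instruction of the .plt stub" step concrete, here is the arithmetic worked through in Ruby using the bytes from the disassembly earlier in this article. This is a sketch, not memprof's actual code: got_entry_address and its base argument are hypothetical names, and the base is left at 0 because the disassembly shows addresses relative to the start of the object.

```ruby
# First instruction of the rb_newobj .plt stub, copied from the disassembly:
#   17600: ff 25 6a 9c 2c 00    jmpq *0x2c9c6a(%rip)
STUB_ADDR  = 0x17600
STUB_BYTES = [0xff, 0x25, 0x6a, 0x9c, 0x2c, 0x00].pack("C*")

# Decode a RIP-relative indirect jmp (opcode ff, ModRM 25) and return the
# address of the .got entry it reads its target from. `base` is the load
# address of the shared object (hypothetical parameter, 0 by default).
def got_entry_address(stub_addr, bytes, base = 0)
  opcode, modrm = bytes.unpack("CC")
  unless opcode == 0xff && modrm == 0x25
    raise ArgumentError, "not a RIP-relative indirect jmp"
  end
  displacement = bytes[2, 4].unpack1("l<")  # signed 32-bit, little endian
  # %rip points at the next instruction, 6 bytes past the start of this one.
  base + stub_addr + 6 + displacement
end

GOT_ENTRY = got_entry_address(STUB_ADDR, STUB_BYTES)
puts format("0x%x", GOT_ENTRY)
```

The result matches the 2e1270 that the disassembler printed in its comment next to the jmpq. At runtime the base address found by walking the link map would be added on top.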
Conclusion
- There are lots of sections in each ELF object. Each section is special and important.
- ELF documentation can be difficult to obtain and understand.
- Got pretty lucky this time around. I was getting a little worried that it would get complicated. Made it out alive, though.
Thanks for reading and don’t forget to subscribe (via RSS or e-mail) and follow me on twitter.
memprof: A Ruby level memory profiler »
Created at: 11.12.2009 14:59, source: time to bleed by Joe Damato, tagged: bugfix debugging linux monitoring ruby systems x86 debug garbage collection GC memory performance profiling system health x86_64

What is memprof and why do I care?
memprof is a Ruby gem which supplies memory profiler functionality similar to bleak_house without patching the Ruby VM. You just install the gem, call a function or two, and off you go.
Where do I get it?
memprof is available on gemcutter, so you can just:
gem install memprof
Feel free to browse the source code at: http://github.com/ice799/memprof.
How do I use it?
Using memprof is simple. Before we look at some examples, let me explain more precisely what memprof is measuring.
memprof is measuring the number of objects created and not destroyed during a segment of Ruby code. The ideal use case for memprof is to show you where objects that do not get destroyed are being created:
- Objects are created and not destroyed when you create new classes. This is a good thing.
- Sometimes garbage objects sit around until garbage_collect has had a chance to run. These objects will go away.
- Yet in other cases you might be holding a reference to a large chain of objects without knowing it. Until you remove this reference, the entire chain of objects will remain in memory taking up space.
memprof will show objects created in all cases listed above.
OK, now let’s take a look at two examples and their output.
A simple program with an obvious memory “leak”:
require 'memprof'

@blah = Hash.new([])

Memprof.start

100.times {
  @blah[1] << "aaaaa"
}

1000.times {
  @blah[2] << "bbbbb"
}

Memprof.stats
Memprof.stop
This program creates 1100 objects which are not destroyed between the Memprof.start and Memprof.stop calls because a reference is held to each object created.
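It is worth spelling out where those references live. Hash.new([]) hands the same default array to every missing key, and << appends to that array without ever storing a key. The snippet below is a small sketch in plain Ruby (no memprof needed, and BLAH is just an illustrative name) showing that all 1100 strings end up in one shared array.

```ruby
# Same shape as the example program, minus memprof.
BLAH = Hash.new([])

100.times  { BLAH[1] << "aaaaa" }
1000.times { BLAH[2] << "bbbbb" }

puts BLAH.size                 # no keys were ever stored in the hash
puts BLAH.default.size         # every string sits in the shared default array
puts BLAH[1].equal?(BLAH[2])   # both lookups return the very same array
```

As long as the hash (and so its default array) is reachable, none of the strings can be collected.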
Let's look at the output from memprof:
1000 test.rb:11:String
100 test.rb:7:String
In this example memprof shows the 1100 objects created, broken up by file, line number, and type.
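If you want a rough feel for this kind of "file, line, type" breakdown without memprof, Ruby's standard objspace extension can tag allocations with their source location. To be clear, this is a different technique from the one memprof uses, it needs a much newer Ruby than the versions listed below, and it only inspects objects you already hold a reference to. The allocation_report method and the held array are illustrative names.

```ruby
require "objspace"

# Group the objects we are holding by "file:line:class", using the
# allocation info that objspace records while tracing is switched on.
def allocation_report
  held   = []           # keeps every string alive, like @blah in the example
  counts = Hash.new(0)
  ObjectSpace.trace_object_allocations do
    100.times  { held << "aaaaa" }
    1000.times { held << "bbbbb" }
    held.each do |obj|
      file = File.basename(ObjectSpace.allocation_sourcefile(obj).to_s)
      line = ObjectSpace.allocation_sourceline(obj)
      counts["#{file}:#{line}:#{obj.class}"] += 1
    end
  end
  counts
end

REPORT = allocation_report
REPORT.sort_by { |_, count| -count }.each do |where, count|
  puts "#{count} #{where}"
end
```

The file name and line numbers in the output depend on where you save and run the snippet.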
Let's take a look at another example:
require 'memprof'
Memprof.start
require "stringio"
StringIO.new
Memprof.stats
This simple program is measuring the number of objects created when requiring stringio.
Let's take a look at the output:
108 /custom/ree/lib/ruby/1.8/x86_64-linux/stringio.so:0:__node__
14 test2.rb:3:String
2 /custom/ree/lib/ruby/1.8/x86_64-linux/stringio.so:0:Class
1 test2.rb:4:StringIO
1 test2.rb:4:String
1 test2.rb:3:Array
1 /custom/ree/lib/ruby/1.8/x86_64-linux/stringio.so:0:Enumerable
This output shows that objects of an internal Ruby interpreter type, __node__, were created (these represent code), as well as a few Strings and other objects. Some of these objects are just garbage objects which haven't had a chance to be recycled yet.
What if we nudge the garbage collector along a little bit just for our example? Let's add the following two lines of code to our previous example:
GC.start
Memprof.stats
We're now nudging the garbage collector and outputting memprof stats information again. This should show fewer objects, as the garbage collector will recycle some of the garbage objects:
108 /custom/ree/lib/ruby/1.8/x86_64-linux/stringio.so:0:__node__
2 test2.rb:3:String
2 /custom/ree/lib/ruby/1.8/x86_64-linux/stringio.so:0:Class
1 /custom/ree/lib/ruby/1.8/x86_64-linux/stringio.so:0:Enumerable
As you can see above, a few Strings and other objects went away after the garbage collector ran.
Which Rubies and systems are supported?
- Only unstripped binaries are supported. To determine if your Ruby binary is stripped, simply run: file `which ruby`. If it is, consult your package manager's documentation. Most Linux distributions offer a package with an unstripped Ruby binary.
- Only x86_64 is supported at this time. Hopefully, I'll have time to add support for i386/i686 in the immediate future.
- Linux Ruby Enterprise Edition (1.8.6 and 1.8.7) is supported.
- Linux MRI Ruby 1.8.6 and 1.8.7 built with --disable-shared are supported. Support for --enable-shared binaries is coming soon.
- Snow Leopard support is experimental at this time.
- Ruby 1.9 support coming soon.
How does it work?
If you've been reading my blog over the last week or so, you'd have noticed two previous blog posts (here and here) that describe some tricks I came up with for modifying a running binary image in memory.
memprof is a combination of all those tricks and other hacks to allow memory profiling in Ruby without the need for custom patches to the Ruby VM. You simply require the gem and off you go.
memprof works by inserting trampolines on object allocation and deallocation routines. It gathers metadata about the objects and outputs this information when the stats method is called.
What else is planned?
Jake Douglas, Aman Gupta, and I have lots of interesting ideas for new features. We don't want to ruin the surprise, but stay tuned. More cool stuff coming really soon :)
