Monitoring the JVM Heap with JRuby »

Created at: 29.04.2010 20:00, source: Engine Yard Blog, tagged: Technology jruby JVM

The JVM provides a number of management and monitoring interfaces that you, as a JRuby user, can use to keep an eye on or profile your application. There are interfaces for monitoring threads, monitoring memory usage, monitoring loaded JVM classes, and even for monitoring JRuby internals. The full set of built-in monitors are documented in the JDK’s java.lang.management docs.

In this article, we’ll explore some of the memory monitoring beans and how you can use them to analyze the memory characteristics of your application. I’ll be using the default Java 6 version for OS X 10.6.3, which is based on Hotspot, the JVM inside OpenJDK.

We start by requiring ‘java’ and importing the ManagementFactory from the JDK.

require 'java'
 
import java.lang.management.ManagementFactory

Now we have access to the four primary beans associated with memory management on the JVM: GarbageCollector, MemoryManager, Memory, and MemoryPool MXBeans. We’ll grab them all first.

gc_beans = ManagementFactory.garbage_collector_mxbeans
memman_beans = ManagementFactory.memory_manager_mxbeans
mem_bean = ManagementFactory.memory_mxbean
mempool_beans = ManagementFactory.memory_pool_mxbeans

Let’s start with the simplest of these beans, the Memory bean.

The Memory bean provides the most basic information: current heap usage, current non-heap usage, and the count of objects waiting to be finalized. There’s also methods to force a GC run and to enable verbose GC logging. Let’s check current stats, run some code with verbose logging on, and force a GC run:

puts "Current heap usage: #{mem_bean.heap_memory_usage}"
puts "Current non-heap usage: #{mem_bean.non_heap_memory_usage}"
puts "Waiting to finalize: #{mem_bean.object_pending_finalization_count}"
 
puts "Enabling verbose GC logging..."
mem_bean.verbose = true
1_000_000.times { "foo" + "bar" }
puts "Forcing a full GC run..."
mem_bean.gc
mem_bean.verbose = false
puts "Disabling verbose GC logging"
puts

Here’s the output so far:

Current heap usage: init = 268435456(262144K) used = 1960416(1914K) committed = 270073856(263744K) max = 520093696(507904K)
Current non-heap usage: init = 12750848(12452K) used = 13608880(13289K) committed = 21921792(21408K) max = 100663296(98304K)
Waiting to finalize: 0
Enabling verbose GC logging...
[GC 17029K->1642K(263744K), 0.0014821 secs]
[GC 17066K->1653K(263744K), 0.0122954 secs]
[GC 17077K->1654K(263744K), 0.0012559 secs]
[GC 17078K->1654K(263744K), 0.0006338 secs]
[GC 17078K->1658K(263744K), 0.0050983 secs]
[GC 17082K->1646K(263744K), 0.0003545 secs]
[GC 17070K->1646K(263744K), 0.0005288 secs]
[GC 17070K->1644K(263744K), 0.0006449 secs]
[GC 17068K->1644K(263744K), 0.0090768 secs]
Forcing a full GC run...
[Full GC 11674K->1621K(263744K), 0.0617646 secs]
Disabling verbose GC logging
The GarbageCollector beans provide information about all the garbage collectors in use on a given JVM. They allow you to query collection counts and total collection time.

Below, we’ll do some work that should trigger at least a couple GC runs, and then print out stats.

1_000_000.times { "foo" + "bar" }
 
total_collections = 0
total_time = 0.0
for gc_bean in gc_beans
  puts "Name: #{gc_bean.name}"
  puts "  Pools: #{gc_bean.memory_pool_names.to_a.join(', ')}"
  puts "  Collections: #{gc_bean.collection_count}"
  puts "  Time: #{gc_bean.collection_time}"
  total_collections += gc_bean.collection_count
  total_time += gc_bean.collection_time
end
puts "Total collections: #{total_collections}"
puts "Total time: #{total_time}"
puts

And here’s the output for my JVM:

Name: ParNew
  Pools: Par Eden Space, Par Survivor Space
  Collections: 18
  Time: 38
Name: ConcurrentMarkSweep
  Pools: Par Eden Space, Par Survivor Space, CMS Old Gen, CMS Perm Gen
  Collections: 2
  Time: 149
Total collections: 20
Total time: 187.0

You may notice that each garbage collector has several memory pools associated with it. Let’s look at the memory pools now.

Memory pools represent areas of memory used to isolate objects with differing characteristics. For example, some JVMs will list a “Par Eden Space”, which represents the memory pool containing the newest and youngest objects. Some JVMs list a “Perm Gen” which will contain objects only collected during full GC runs.

Let’s iterate over all memory pools and get information on each:

for pool_bean in mempool_beans
  puts "Name: #{pool_bean.name}"
  puts "  Type: #{pool_bean.type}"
  puts "  Peak usage: #{pool_bean.peak_usage}"
  puts "  Current usage: #{pool_bean.usage}"
  puts "  Usage after last collection: #{pool_bean.collection_usage}"
  puts "  Managers: #{pool_bean.memory_manager_names.to_a.join(', ')}"
end
puts

Here’s the listing of memory pools for me:

Name: Code Cache
  Type: Non-heap memory
  Peak usage: init = 167936(164K) used = 1004992(981K) committed = 1019904(996K) max = 33554432(32768K)
  Current usage: init = 167936(164K) used = 1004992(981K) committed = 1019904(996K) max = 33554432(32768K)
  Usage after last collection: 
  Managers: CodeCacheManager
Name: Par Eden Space
  Type: Heap memory
  Peak usage: init = 12582912(12288K) used = 15794176(15424K) committed = 15794176(15424K) max = 25165824(24576K)
  Current usage: init = 12582912(12288K) used = 10244904(10004K) committed = 15794176(15424K) max = 25165824(24576K)
  Usage after last collection: init = 12582912(12288K) used = 0(0K) committed = 15794176(15424K) max = 25165824(24576K)
  Managers: ConcurrentMarkSweep, ParNew
Name: Par Survivor Space
  Type: Heap memory
  Peak usage: init = 2097152(2048K) used = 49824(48K) committed = 2621440(2560K) max = 4194304(4096K)
  Current usage: init = 2097152(2048K) used = 2048(2K) committed = 2621440(2560K) max = 4194304(4096K)
  Usage after last collection: init = 2097152(2048K) used = 2048(2K) committed = 2621440(2560K) max = 4194304(4096K)
  Managers: ConcurrentMarkSweep, ParNew
Name: CMS Old Gen
  Type: Heap memory
  Peak usage: init = 251658240(245760K) used = 1682040(1642K) committed = 251658240(245760K) max = 490733568(479232K)
  Current usage: init = 251658240(245760K) used = 1662536(1623K) committed = 251658240(245760K) max = 490733568(479232K)
  Usage after last collection: init = 251658240(245760K) used = 1659912(1621K) committed = 251658240(245760K) max = 490733568(479232K)
  Managers: ConcurrentMarkSweep
Name: CMS Perm Gen
  Type: Non-heap memory
  Peak usage: init = 12582912(12288K) used = 12861856(12560K) committed = 21250048(20752K) max = 67108864(65536K)
  Current usage: init = 12582912(12288K) used = 12861856(12560K) committed = 21250048(20752K) max = 67108864(65536K)
  Usage after last collection: init = 12582912(12288K) used = 12749352(12450K) committed = 20967424(20476K) max = 67108864(65536K)
  Managers: ConcurrentMarkSweep

Note that the “usage” values are actually instances of MemoryUsage, which provides four different values for the pool:

  • init: the amount of memory the JVM initially reserved from the OS
  • used: the amount of memory currently used
  • committed: the amount of memory actually available
  • max: the maximum memory that will ever be available

These will change over time as objects are collected and the heap grows to meet application needs. Let’s show the current usage for the Par Eden Space, as an example:

par_eden = mempool_beans.find {|b| b.name == "Par Eden Space"}
usage = par_eden.usage
puts "Initially requested: #{usage.init}"
puts "Currently used: #{usage.used}"
puts "Total available: #{usage.committed}"
puts "Max potential: #{usage.max}"
puts

The output for “Par Eden Space”:

Initially requested: 12582912
Currently used: 10575384
Total available: 15794176
Max potential: 25165824

Memory pools also allow you to set usage thresholds. When a threshold is crossed, the JVM will increment a counter. This allows you to monitor whether or not your application is regularly exceeding a particular threshold, so you can adjust memory settings or work to reduce your application’s demands.

Here’s what that might look like for the “Par Survivor Space,” the “old” pool. We’ll monitor the collection usage threshold, or the amount of space in use after the most recent GC run. Some JVMs will support this or the general usage threshold, but neither are required to be supported.

We’ll set a limit that’s 5MB larger than the last collection usage and then run another loop that keeps memory alive a bit longer.

par_old = mempool_beans.find {|b| b.name == "Par Survivor Space"}
if par_old.collection_usage_threshold_supported?
  par_old.collection_usage_threshold =
    par_old.collection_usage.used + 5_000
 
  1_000.times do
    ary = []
    10_000.times do
      ary << "foo" + "bar";
    end
  end
 
  puts "Crossed it: #{par_old.collection_usage_threshold_count} times"
end

And the result:

Crossed it: 104 times

Crossing the threshold here means that the minimum usage of the pool after a collection went over this size that many times. In this example, we create a 10,000 element array 1000 times and then throw it away; but while it’s alive, it gets promoted into this “Survivor” space. So we repeatedly create it, promote it, and release it, resulting in the threshold being crossed several times.

The final bean we haven’t covered, the MemoryManager bean, is usually just the superclass for other beans like GarbageCollector, so there’s not much more to say about them. On my system, the list of MemoryManager beans is the same as the list of GarbageCollector beans.

Hopefully this walkthrough has whetted your appetite for the JVM’s management and monitoring interfaces. In future posts, we’ll cover how you can use the other builtin monitors, how you can track JRuby’s internal monitors, and even how you can register your own monitors for any application. As always, leave questions and comments!


more »

J is for JVM: Why the ‘J’ in JRuby? »

Created at: 30.11.2009 19:00, source: Engine Yard Blog, tagged: Technology jruby JVM

The current JRuby team members are all passionate hackers with intimate knowledge of Ruby, Java, and of course JRuby. That said, none of us were on the team at the project’s original inception. I assume the JRuby pioneers thought JRuby would be a good idea—I know I did, when I first heard about it. For a lot of folks though, it’s somewhat less obvious. Why is writing JRuby on top of the JVM a good idea, they ask. Are we nuts, evil geniuses, or is using the JVM just a solid pragmatic decision?

When Java first came out it was a nice incremental improvement over languages at the time. Not only was Java a reasonably simple, yet evolutionary language (vs. C or C++) but it was also a virtual machine that ran a fake non-hardware-specific bytecode (e.g. hardware-neutral). It was garbage collected, so memory management and core dumps suddenly became a minor concern. It also had a reasonably comprehensive class library. No new unique features, but a good combination of ideas packaged into something useful.

In an attempt to market Java as a simple single-named entity, though, I think Sun made a long-term marketing mistake. It co-mingled the language and the runtime under the same name. This did a huge injustice to the runtime because the Java Virtual Machine (JVM) is remarkable on its own merits. It has also made it difficult for alternative JVM language projects like Jython or JRuby to explain the virtues of the JVM without confusing it with the Java language.

This post will focus on the good and the bad of the JVM, and by the end of this entry, I’m hoping you’ll appreciate all the JVM has to offer.

One final note before I delve further: this will just cover what I know about Sun’s JVM. There are a few other JVMs out there and they’re all remarkable in their own right. Many of my points no doubt also apply to those VM implementations, but my lack of intimate knowledge of them compels me to disclaim :)

The Bad

No technology is perfect. The JVM is certainly no exception. If you work on a non-Java language implementation you realize that you’re not quite in Eden. It provides a great base for other languages, but the VM wasn’t originally designed with other languages in mind. This forces us to get creative at times.

For example, we maintain our own frame stack as heap-allocated objects. If we had our way, we would manipulate the real frame stack directly and add the additional fields we need to maintain. It’s too bad we cannot have better control over the framestack. Bummer.

It’s also a bummer that every time we create a Ruby Fixnum value, we’re boxing those values in a Java object. The C implementations of Ruby are just passing tagged ints. If only we had faster fixnums. Java primitives do not cut it since they will not fit into heterogeneous lists without boxing.

Aside: you may have also noticed that the JVM takes some time to start up :)

People who have to write high-performance code for the JVM will also find themselves frequently frustrated by the black box nature of the VM. Once you load your bytecode, you turn a crank and hope you come up a winner. You’re never sure how much of a winner you will be… black boxes, by definition, are a mystery.

With all of that said (and believe me we have a laundry list of features not mentioned that we would like to see), I am super happy that our implementation is on top of the JVM. Why?  Read on…

The Good

Hotspot

For starters, I’ll say that although Hotspot is a mystery, it does a generally great job at performance. The truth is that dynamic profiling is the path to excellent performance. HotSpot is much more clever than we are. It has the benefit of profile data from the running application to inform itself.

It also has the ability to de-optimize code. As strange as it sounds, this detail ends up being the catalyst to great performance.  When Hotspot does an optimization it puts a cheap guard in front of the optimization to make sure the rationale for the optimization still holds true. If the guard ever fails then it de-optimizes back to a slow path. The reason why this rocks, is that Hotspot can then be super speculative with aggressive optimizations. If the informed bet pays off, it pays off huge. If it doesn’t pay off, it gets reset with additional knowledge for a future potential optimization and some extra cost for trying out the not-quite-right optimization.

Here’s a demonstration of Hotspot in action:

In this graph we’re running a mandelbrot generator many times and plotting the amount of time it takes to generate it each time. If you look at JRuby 1.4.0 on the right side JRuby is clearly smoking Ruby 1.8.7 and doing very well against 1.9.2preview2. If you look at the progression as JRuby starts, you can see that we start slower than 1.8.7 but rapidly pick up steam as Hotspot kicks in.

There’s one interesting slowdown which occurs in iteration 6; this is actually a Hotspot deoptimization occurring. You’ll also notice that performance returns to the same level. Later optimizations end up getting that performance back.

I cannot imagine writing something as good as Hotspot. It has been worked on by a lot of smart folks for close to a decade now. The improvements between Java 4, 5, and 6 are pretty impressive. As JRuby, we not only get Hotspot for free with no work on our part, but we also get improvements every time it’s upgraded. This is a pretty nice perk to using the JVM. As I said nothing is perfect, but this really is an easy decision… and we’ve not even begun to talk about Garbage Collection (GC).

Garbage Collection (GC)

The Java folks have literally spent centuries worth of hours debugging, testing, and improving their VM. Think about a building full of engineers working on just garbage collection for a decade and a half! The JVM ships with multiple garbage collectors, so even in cases where workloads confound one GC in the JVM, the JVM actually allows you to use one of the other garbage collectors. On top of that, you can tune any of the collectors you use. You can tailor the collector for your application.

The various collectors also boast many delicious attributes. They are all compacting, so memory fragmentation is not an issue. They’re all incremental to reduce the length of GC pauses which can happen. They are also generational, so that short-lived objects can be collected much more quickly.

Some are parallel so that collection work can be spread out over multiple cores. There are even concurrent collectors so that there may be no collection pause at all. Cool stuff which is also battle-tested. JRuby gets all of this for free. Now with Java 7 (and u12 of Java 6) they even have a new ‘G1′ collector. It just keeps improving.

There are two more neat things about GC and the JVM: both are about getting visualization and information about how well GC is coping. The first is the -J-verbose:gc flag. This option will log all collection events along with when, how much, and how long they take. This is super useful for getting a picture on how good or bad the collector is at handling your workload:

[GC 16000K->3727K(82496K), 0.0396636 secs]
[Full GC 13021K->5802K(82496K), 0.1468975 secs]
[GC 21802K->9769K(82496K), 0.0292348 secs]
[GC 25769K->12535K(82496K), 0.0243674 secs]
[GC 28535K->13136K(82496K), 0.0169928 secs]
[GC 29136K->15498K(82496K), 0.0213308 secs]
[GC 31498K->16911K(82496K), 0.0213301 secs]
[GC 32911K->19413K(82496K), 0.0186457 secs]
[GC 35413K->20207K(82496K), 0.0146396 secs]

You can tally these events and figure out total time spent just cleaning up garbage and also figure out if perhaps your workload is overburdening the collector. This may lead you to change your design or perhaps try and tune it out by changing your heap size.

The second is just querying a JVM through jconsole. jconsole has many views into the system, including ways of manipulating aspects of the system, but it has a nice memory tab to show how well GC is working:

In the lower right corner you can see green boxes showing how full different generations of memory are.  If, for example, you saw a nearly full survivor generation (the empty third bar above), then that means slow full GC collection times, and it is very unlikely to be a healthy application.

Portability

Both GC and Hotspot are available wherever Java is available. This means JRuby will not just work on other platforms, but that it will work well on other platforms. It’s really nice to get this support for free.

On top of that, JRuby is working in exotic locations. We get bug reports for JRuby bugs on OpenVMS. People tell us about Rails applications running on JRuby on an IBM mainframe running CP/CMS! I am not making this up. Crazy and cool to think that people have an oasis in JRuby on OSes like this. We can thank the portability of the JVM for this.

In fact, since the open sourcing of Java and the OpenJDK project, we are seeing even more proliferation of the platform.  I think the portability proposition is about as perfect as it can get.

Maturity

The JVM is over 15 years old—maybe even old enough to get into an R-rated movie. At JavaOne the parallel was drawn between a child coming of age and the longevity of Java and the JVM. Over the last decade and a half an impressive amount of engineering effort has been expended to make the JVM stable, fast, and pervasive. Oh yeah… we get that for nothing! Is this sounding familiar?

On the Radar

JRuby and other language projects on the JVM have been acknowledged by JVM engineers. A concrete example is the invokedynamic specification (aka JSR292). This JSR is tailored for new languages and the fact that Java method dispatch rules are not the only ones which matter on the JVM. It is nice that the JVM wants to be a better platform for projects like JRuby.

There is also a MLVM (multiple language VM) project which is a clearinghouse for new features which may make it into the JVM. It’s at least a place where enterprising developers can try to add features they want to see in the JVM. The JVM engineers themselves interact and make suggestions. This looks like it’s becoming a great incubator for new JVM features.

These two developments make me happy that the JVM is expanding its scope and watching what people are using it for.

Time Marches On

The JVM has become a durable part of the technology landscape. The uncertainty of an Oracle/Sun merger and any potential shenanigans makes for salacious technology press, but the truth is mundane. So many behemoth companies (like Oracle and IBM and HP and SAP and…. and…) have invested so much money in writing middleware for the JVM that it’s difficult to see their commitment to the JVM changing in the next decade.

What this means for JRuby is additional performance gains and additional opportunities to help influence the market towards a language like Ruby, and Ruby on Rails. We’re doing what we can to help Rubyists satisfy their desires in a Ruby runtime. At the same time, we’re also showing Java folks that Ruby is a language with lots of advantages over Java for many applications, but one that they don’t have to change their platform to use.


more »