Antares Trader Blog

The universe at your fingertips

Collecting the Garbage in Ruby

Monday

Feb 04, 2008

10:47 am

I was saddened to see that James Gray is going to end his tenure as the "Ruby Quiz" quiz master. It looks like there is a group that could take over, but I will miss James' insightful wrap-ups. I learned a great deal from trying my hand at these quizzes. I bring this up because of something I learned about Ruby in one particular quiz, "Twisting a Rope". I didn't do so well at this quiz, but in trying to pick my benchmarks up off the floor I learned an important lesson: memory management is still important.

I was reminded of this lesson by two articles at the Nimble Methods blog: "Guerrilla's Guide to Optimizing Rails Applications" and "Garbage Collection is Why Ruby on Rails is Slow".

The first article in particular mentions the relationship between memory and processing time. Allocating memory causes the Garbage Collector to run, which in turn takes time. The article gives some remarkably accurate correlations between the amount of memory allocated and the time it takes to run. Their (very valid) solution is to allocate less memory. In ActiveRecord there are all sorts of places to do this. The code was not written to be lean, but instead to be easy and quick to write for. Making a general purpose library almost always means a trade-off in efficiency somewhere.
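To make "allocate less memory" a little more concrete, here is a small sketch of my own (it is not taken from either article). The method names build_with_plus and build_with_append are made up for illustration. Both build the same string, but += creates a brand new String on every pass, while << grows the existing one in place, so the second version leaves far less garbage behind.

```ruby
# Illustrative helpers only; the names are hypothetical.

# Each += allocates a new String and abandons the old one to the GC.
def build_with_plus(parts)
  out = "".dup
  parts.each { |part| out += part }
  out
end

# << appends to the same String object, so no intermediate copies pile up.
def build_with_append(parts)
  out = "".dup
  parts.each { |part| out << part }
  out
end

parts = %w[twisting a rope]

plus_result   = build_with_plus(parts)
append_result = build_with_append(parts)

# Object identity shows the difference directly.
s = "a".dup
id_before_append = s.object_id
s << "b"
same_object_after_append = (s.object_id == id_before_append)

t = "a".dup
id_before_plus = t.object_id
t += "b"
same_object_after_plus = (t.object_id == id_before_plus)
```

The results are identical; only the number of throwaway objects differs, and that is the part the Garbage Collector has to pay for.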

There is, however, another piece to the Garbage Collection puzzle that I learned in the Ruby Quiz I mentioned. That is, as a programmer one should intentionally run the Garbage Collector at the best times. When are the best times? There are two conditions that one ought to look for. The obvious one is any place where performance doesn't count. If there are points in the life of your program where it is somehow between tasks, a quick call to GC.start can leave you with a nice clean memory profile for the next run.
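A minimal sketch of the "between tasks" idea might look like the following. The names process_batch and run_batches are hypothetical stand-ins for whatever unit of work your program has; the only real API used is GC.start.

```ruby
# Hypothetical unit of work: builds some throwaway strings and
# returns how many items it handled.
def process_batch(items)
  items.map { |item| item.to_s * 100 }.size
end

# Runs each batch, then collects garbage in the lull before the next one.
def run_batches(batches)
  batches.map do |batch|
    result = process_batch(batch)
    # Nothing time-critical is happening here, and the batch's
    # temporaries are no longer referenced, so this is a cheap
    # moment to clean up.
    GC.start
    result
  end
end

results = run_batches([[1, 2, 3], [4, 5]])
```

Whether this actually helps depends on the workload, so it is worth benchmarking with and without the GC.start call.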

Another time to consider manually deploying the Garbage Truck is when you know there is a lot of garbage. That is, call GC.start when your program is at points where very few objects are in scope. This not only gives you the most bang for your 70ms of processing time, but because the GC is of the mark and sweep variety, it will take less time to mark when there are fewer visible objects. There is also an inversion of this idea that says you should disable Garbage Collection (GC.disable) when you are about to allocate a bunch of memory that you know is going to be in scope for a while. My preference is to monkey patch the GC module by adding GC.suspend, which takes a block that is called with the Garbage Collector off and then turns the GC back on. Think File.open here.
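Here is roughly what I mean by GC.suspend. To be clear, suspend is my own addition and not part of Ruby's GC module; GC.disable and GC.enable are the real calls underneath. This sketch also remembers whether the GC was already off, so nested calls don't switch it back on too early.

```ruby
module GC
  # Hypothetical helper, not a built-in: runs the block with the
  # Garbage Collector disabled, then restores the previous state.
  def self.suspend
    # GC.disable returns true if the GC was already disabled.
    was_disabled = GC.disable
    yield
  ensure
    # Re-enable only if we were the ones who turned it off,
    # even when the block raises.
    GC.enable unless was_disabled
  end
end

# Allocate a pile of long-lived objects without the GC kicking in
# halfway through. The block's value is returned, as with File.open.
rows = GC.suspend do
  Array.new(10_000) { |i| "row #{i}" }
end
```

The ensure clause is the File.open part of the analogy: the GC comes back on no matter how the block exits.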

A final thought on memory management is to mention an article by Ola Bini called "Ruby closures and memory usage" reminding us all that long-lived Proc objects can carry around references to unneeded objects.
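A small sketch of the closure problem, written by me rather than taken from Ola's article: a lambda keeps its whole enclosing scope alive, not just the variables it uses. The method names are hypothetical, and I am peeking into the closure with Binding#local_variable_get, which assumes a reasonably modern Ruby (2.1 or later).

```ruby
# The lambda only needs `size`, but its binding still points at `big`.
def leaky_sizer
  big  = "x" * 1_000_000
  size = big.size
  lambda { size }
end

# Same thing, but the large object is released before the lambda
# is created, so the closure has nothing heavy to hold on to.
def lean_sizer
  big  = "x" * 1_000_000
  size = big.size
  big  = nil
  lambda { size }
end

leaky = leaky_sizer
lean  = lean_sizer

# Both behave the same when called...
leaky_value = leaky.call
lean_value  = lean.call

# ...but only one of them is still dragging a megabyte around.
leaky_big = leaky.binding.local_variable_get(:big)
lean_big  = lean.binding.local_variable_get(:big)
```

As long as leaky is reachable, so is the million-character string, and the Garbage Collector cannot touch it.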

The long and the short of things: when performance starts to matter, memory should too. Ruby may move the process out of sight, but it should not be out of mind.
