Antares Trader Blog

The universe at your fingertips

Ruby on Erlang

Tuesday

Aug 12, 2008

6:03 pm

I have been thinking about concurrent programming recently. Ruby makes OO programming easy, enjoyable and intuitive. It is a great development tool, but its concurrency model is out of the dark ages. I have been interested in Erlang for a while, but never really sat down to look it over. Last weekend, while I was away from the internet, I sat down and read through the Erlang documentation. Erlang makes concurrency easy and natural, but its syntax is poor, static, and high on ceremony, and the silly punctuation at the end of lines is so 1980s. There are no classes, and even the "records" it has are compiler macros and not very useful.

The thought that entered my head and has not left is this: if Ruby can run on top of the Java Virtual Machine and take advantage of all the power of that environment while still being an expressive and efficient language, why not have a modified version of Ruby on top of BEAM, the Erlang virtual machine? I've been thinking about how this new language might work, and this post is the result of my thoughts. I'm hoping someone in the community will point out the obvious deficiencies that I missed.

Let's start by pointing out what makes Erlang work with regard to concurrency.

    <li>Share-nothing processes</li>
    <li>Message-based communication</li>
    <li>Cheap, efficient processes</li>
    <li>No mutable data (<a href="http://groups.google.com/group/comp.lang.ruby/msg/e809a7a7205eaec9">see this post for why that is important</a>)</li>
    

Now let's see what we really value in Ruby's object model:

    <li>Everything is an object</li>
    <li>Open classes</li>
    <li>Introspection</li>
    <li>Closures</li>
    

My thought is this: let's make every Ruby class instance an Erlang process. If you take a peek under the hood at how Ruby objects work, you will see that they already work by sending messages. In fact, there is even a 'send' method that lets you send specific messages without using the Ruby dot syntax. What we get is concurrency without explicitly having to ask for it.
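The message-sending point can be seen in plain Ruby today. This is only a small illustration using the standard `send` method on built-in objects; the variable names are made up for the example.

```ruby
# Plain Ruby: the dot syntax and `send` dispatch the same message.
greeting = "hello"

via_dot  = greeting.upcase          # ordinary method call
via_send = greeting.send(:upcase)   # the same message, sent explicitly

# `send` also accepts arguments after the message name.
joined = [1, 2, 3].send(:join, "-")
```

Both calls reach the same `upcase` method; only the spelling of the message send differs.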

What I'm going to present is an idea about how to write a Ruby-like language on top of Erlang. I spent some of this week trying to sketch a bit of something out and gave up when Erlang's syntax and static variables got the best of me. I'm posting this hoping someone who sees the value in such a language will also have the skills to help write it. Here goes:

Description of a Ruby-like Concurrent Object Language:

A quick note before I start, for our purposes process, send, receive, and message have the same meaning as in Erlang, while method, class, inheritance, and call are as understood in Ruby.

Every class is represented by a process loop. Classes are initialized by creating a new process with an initial function that sets up plumbing, then calls the initialize method, and then starts a tail-recursive loop. The argument to the loop is a structure that contains instance variables, method pointers, modules and the parent class of the class. For the rest of this post I am going to refer to this collection as the state of the instance.
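To make the loop concrete, here is a minimal sketch in present-day Ruby. It is not the proposed language: a Thread plus a Queue stands in for an Erlang process and its mailbox, and every name (`ProcessObject`, `call`, `stop`, the shape of the state hash) is an assumption made for illustration.

```ruby
# Hypothetical sketch: a Thread plus a Queue stands in for an Erlang process
# and its mailbox. None of these names come from an existing library.
class ProcessObject
  def initialize(methods, ivars = {})
    @mailbox = Queue.new
    # The "state of the instance": instance variables, method table, parent.
    state = { ivars: ivars, methods: methods, parent: nil }
    @thread = Thread.new { run(state) }
  end

  # Synchronous call: send a message, then block until the reply arrives.
  def call(name, *args)
    reply_to = Queue.new
    @mailbox << [name, args, reply_to]
    reply_to.pop
  end

  def stop
    @mailbox << [:__stop__, [], nil]
    @thread.join
  end

  private

  # Ruby does not guarantee tail-call optimisation, so the tail-recursive
  # loop is written as an ordinary loop that rebinds `state` on each turn.
  def run(state)
    loop do
      name, args, reply_to = @mailbox.pop
      break if name == :__stop__
      # Each method takes the current ivars and returns [result, new_ivars].
      result, new_ivars = state[:methods].fetch(name).call(state[:ivars], *args)
      state = state.merge(ivars: new_ivars)
      reply_to << result
    end
  end
end

counter_methods = {
  increment: ->(ivars, by = 1) { n = ivars[:count] + by; [n, ivars.merge(count: n)] },
  value:     ->(ivars) { [ivars[:count], ivars] }
}

counter = ProcessObject.new(counter_methods, { count: 0 })
counter.call(:increment)
counter.call(:increment, 5)
total = counter.call(:value)
counter.stop
```

This version handles one message at a time and never mutates the state in place; each turn of the loop builds a new state from the old one, which is the part that mirrors Erlang.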

The loop's primary job is to receive method call messages. Instance methods are forwarded, along with the current state, to the instance's class, while class methods are handled directly by the Class instance. All method calls need to spawn a new process to run in so that other requests (particularly ones generated by the methods themselves) can be dispatched.

Method call messages need to include a call stack so that method result messages can be sent back. Intelligent use of this stack allows tail recursion, and I think I can work out how to do lazy evaluation as well.

Calling a method would work like this. A Method Call Message is sent to an instance's process. If the instance contains a method matching that name, it goes on to run the method; if not, it sends the Method Call Message to its parent. Once the method is found it must be run. The first step is to divide Methods into those that change the state and those that do not. Read-only Methods are started in a new process with a copy of the state of the instance, and the loop is repeated. Read-Write Methods are called with the same semantics, but a modified loop is entered that will hold back other Read-Write methods until the first is finished. This prevents two methods from clobbering each other's changes. When a Read-Write method is done, it not only sends a message to the top of the call stack with the resulting value, it also sends a message back to the calling instance with the now modified state. On receiving this message, the instance knows it is safe to process the next Read-Write Method Call using the new state.
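The read-only versus read-write rule can be sketched in today's Ruby as well. Again, Thread and Queue stand in for Erlang processes and mailboxes, and all names (`Dispatcher`, `readers`, `writers`, the `:new_state` message) are hypothetical. Lookup in a parent class is left out to keep the sketch short.

```ruby
# Hypothetical sketch of the dispatch rules above; all names are illustrative.
class Dispatcher
  def initialize(ivars, readers:, writers:)
    @mailbox = Queue.new
    @readers = readers   # name => ->(ivars, *args) { result }
    @writers = writers   # name => ->(ivars, *args) { [result, new_ivars] }
    @thread = Thread.new { run(ivars) }
  end

  # Synchronous call: send the message and wait for the reply.
  def call(name, *args)
    reply_to = Queue.new
    @mailbox << [:call, name, args, reply_to]
    reply_to.pop
  end

  private

  def run(ivars)
    writing = false   # true while a read-write method is in flight
    deferred = []     # read-write calls waiting for the new state
    loop do
      msg = @mailbox.pop
      case msg[0]
      when :call
        _, name, args, reply_to = msg
        if @readers.key?(name)
          # Read-only: run at once in its own thread on a copy of the state.
          snapshot = ivars.dup
          Thread.new { reply_to << @readers[name].call(snapshot, *args) }
        elsif writing
          deferred << msg   # hold back until the current writer finishes
        else
          writing = true
          start_writer(name, args, reply_to, ivars.dup)
        end
      when :new_state
        ivars = msg[1]      # adopt the state produced by the writer
        if (nxt = deferred.shift)
          _, name, args, reply_to = nxt
          start_writer(name, args, reply_to, ivars.dup)
        else
          writing = false
        end
      end
    end
  end

  def start_writer(name, args, reply_to, snapshot)
    Thread.new do
      result, new_ivars = @writers.fetch(name).call(snapshot, *args)
      @mailbox << [:new_state, new_ivars]  # modified state back to the instance
      reply_to << result                   # result back to the caller
    end
  end
end

account = Dispatcher.new(
  { balance: 0 },
  readers: { balance: ->(ivars) { ivars[:balance] } },
  writers: { deposit: ->(ivars, n) { b = ivars[:balance] + n; [b, ivars.merge(balance: b)] } }
)

# Ten concurrent deposits are serialised by the loop, so none is lost.
10.times.map { Thread.new { account.call(:deposit, 1) } }.each(&:join)
final_balance = account.call(:balance)
```

The writer posts the new state before it replies to the caller, so a caller that waits for its result and then sends another message is guaranteed to see its own change.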

The call stack can also be used to good effect. By passing it around, tail recursion and continuations become a simple matter of sending messages with a modified call stack.
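As a rough illustration of the tail-recursion idea, assume the call stack is just an array of reply mailboxes carried with each call. A tail call hands the same stack on unchanged, so the final result goes straight to the original caller without passing back through the intermediate calls. The `countdown` method and the use of Queue as a mailbox are assumptions for this sketch only.

```ruby
# Hypothetical sketch: the "call stack" is an array of reply mailboxes.
def countdown(n, stack)
  if n.zero?
    stack.last << :done   # reply straight to the top of the call stack
  else
    # Tail call: pass the same stack along, adding no frame of our own.
    Thread.new { countdown(n - 1, stack) }
  end
end

reply = Queue.new
countdown(3, [reply])
result = reply.pop        # blocks until the last call in the chain replies
```

A non-tail call would instead push its own mailbox onto the stack before sending the message, then pop it when the reply arrives.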

In Ruby the standard syntax for calling a method is the dot operator (foo.bar). Under the hood this would fire off a Method Call Message to foo and wait to receive a Method Response before moving on. An alternate syntax (foo!bar) would call bar asynchronously and, instead of returning the result of the call, would return a response object that will morph into the result of the method call once it is complete. I can envision a number of different ways to use such an object, which this post is too small to hold. One example is lazy evaluation, where a pending response can be passed around and execution blocks only when the data is actually needed. Another thought would be to have a 'should' method that checks the response when it comes in, and executes a block if the value is not the expected one.
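Since `foo!bar` is not valid Ruby today, the response object can only be approximated. The sketch below assumes a hypothetical `async_call` helper and `PendingResponse` class; it uses `Thread#value` to block only when the result is needed, and `method_missing` to forward messages to the finished result.

```ruby
# Hypothetical response object: runs the call in the background and blocks
# only when the result is actually needed.
class PendingResponse
  def initialize(&work)
    @thread = Thread.new(&work)
  end

  # Block until the result is available, then return it.
  def value
    @thread.value
  end

  def ready?
    !@thread.alive?
  end

  # "Morph" into the result: forward any other message to the finished value.
  def method_missing(name, *args, &block)
    value.respond_to?(name) ? value.public_send(name, *args, &block) : super
  end

  def respond_to_missing?(name, include_private = false)
    value.respond_to?(name, include_private) || super
  end
end

# Stand-in for `foo!bar`; a hypothetical helper, not part of Ruby.
def async_call(receiver, name, *args)
  PendingResponse.new { receiver.public_send(name, *args) }
end

response = async_call([3, 1, 2], :sort)   # returns immediately
first = response.first                    # blocks here until sort completes
```

Because `PendingResponse` inherits from `Object`, methods that `Object` already defines (such as `to_s` or `==`) are not forwarded; a fuller version would need to deal with that.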

Interestingly, if the result of a Method is not stored anywhere and it is called asynchronously, there is no need to even return a response.

A final thought is to add a meta keyword to this language that would let the programmer manipulate the plumbing of the class more directly. For example, using 'def' would be syntactic sugar for 'self.meta.define_function'. Other functions like freezing, tainting, and cloning could be done in this way, but it would be primarily used to modify the behavior of methods, classes and processes as they relate to the underlying messaging system.
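Ruby already has something close to the `def` half of this idea: `define_method` does roughly what `def` does, but as an ordinary method call on the class. The `Greeter` class below is made up for the example, and `self.meta.define_function` remains the hypothetical spelling from this post.

```ruby
class Greeter
  # The usual sugar.
  def hello
    "hello"
  end

  # Roughly the same thing, spelled as a method call on the class itself.
  define_method(:goodbye) { "goodbye" }
end

greeter = Greeter.new
greetings = [greeter.hello, greeter.goodbye]
```

A meta object would gather this kind of call, along with freezing and cloning, behind one explicit entry point.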

Conclusions:

By mapping objects onto processes this way, parallelism is allowed to emerge naturally from the language without having to do the low-level work of process management. It is important that process creation and message passing are as blazingly fast as Erlang's, because this structure will create a lot of both. This does not mean that a program is as parallel as the number of processes. Many of the processes will actually be suspended waiting for the next message, but as I said above, naturally parallel operations will emerge from the code without having to be explicitly described, in much the same way that modern memory managers allocate and release memory without the need for malloc and friends.

I think this is important for two reasons. The first is summed up very well in the now famous post "The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software" by Herb Sutter. Basically, processors are going to get more parallel, but are not likely to get much faster. Secondly, programming in general and Ruby in particular are beginning to be dominated by web app development. Web apps are inherently parallel processes. Further, a single process is almost always strongly IO bound, leaving lots of processor time on the table while it waits for network sockets and databases to return data. The tremendous success of HTTP handlers like evented-mongrel, ebb and thin should be proof enough that concurrency is far more important than one thread per processor.

There have been many projects that have tried to bolt the Actor (Erlang-like) model onto Ruby. But as far as I can find, no one has tried to make it part of the fundamental class structure. To see a few examples of the discussion, look at this InfoQ interview with MenTaLguY, Revactor (my comments), and this very informative post to comp.lang.ruby by David Masover (also linked above). Finally, I want to link to a document I found about a language called Act1 that seems to have the same ideas, but still insists on clunky 1980s syntax.

Proof-of-concept code is now on GitHub.
