Liminal Existence

MapReduce in 36 Lines of Ruby

This has been burning a hole in my head since August, after Joel’s post made it blindingly obvious that Ruby is the perfect language for distributed programming. I have some code that properly implements partitioning, etc., but I never got around to finishing it sufficiently for a proper release. Here’s the core idea; if anyone wants the partitioning code, ping me at romeda@gmail.com.

mapreduce_enumerable.rb:

require 'rubygems'
require 'ringy_dingy'
require 'ruby2ruby'

module Enumerable
  def dmap(&block)
    self.each_with_index do |element,idx|
      ring_server.write([:dmap, Process.pid, block.to_ruby, element, idx])
    end

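    # Count results explicitly: results.size jumps to idx + 1 as soon as a
    # high index arrives early, which would end the loop too soon.
    results = []
    received = 0
    while received < self.size
      result, idx = ring_server.take([:dmap, Process.pid, nil, nil]).last(2)
      results[idx] = result
      received += 1
    end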

    results
  end

  def ring_server
    return @ring_server if @ring_server

    ringy_dingy = RingyDingy.new nil
    @ring_server = ringy_dingy.ring_server
  end
end
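
The listing above puts two shapes of tuple into the same space: five-field tasks and four-field results. Here is a minimal sketch of why they never get mixed up, assuming the ring server behaves like a plain Rinda::TupleSpace (an in-process one stands in for it here; the pid 1234 and the block source string are made-up illustration values):

```ruby
require 'rinda/tuplespace'

# An in-process tuple space standing in for the shared ring server (assumption).
space = Rinda::TupleSpace.new

# A task tuple (5 fields) and a result tuple (4 fields) for the same caller.
space.write([:dmap, 1234, 'proc { |v| v * 2 }', 21, 0])
space.write([:dmap, 1234, 42, 0])

# nil matches any value, but a template only matches tuples of its own length,
# so the 4-field template can never pick up the 5-field task.
result_tuple = space.take([:dmap, 1234, nil, nil])
task_tuple   = space.take([:dmap, nil, nil, nil, nil])

result, idx = result_tuple.last(2)
p result_tuple   # => [:dmap, 1234, 42, 0]
```

Because the caller's pid sits in the second field of the template, each process running dmap only takes back its own results, even when several callers share the runners.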

mapreduce_runner.rb:

require 'rubygems'
require 'ruby2ruby'
require 'ringy_dingy'

ringy_dingy = RingyDingy.new nil
ring_server = ringy_dingy.ring_server

loop do
  pid, block, element, idx = ring_server.take([:dmap, nil, nil, nil, nil]).last(4)
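  begin
    # The block arrives as source text, so eval turns it back into a proc.
    result = eval(block).call(element)
  rescue StandardError, ScriptError => err
    # Ship the failure back as the result instead of killing the runner.
    result = err
  end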
  puts "Got #{result} from #{element} for #{pid}."
  ring_server.write([:dmap, pid, result, idx])
end
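
The heart of the runner is the eval-and-call step, and it can be tried without a ring server. This is a rough sketch only: run_task is a hypothetical helper name, the source string is assumed to resemble what block.to_ruby produces, and it rescues StandardError and ScriptError rather than every object.

```ruby
# Hypothetical helper: evaluate a block shipped as source text and apply it to
# one element, returning the raised exception instead of crashing the worker.
def run_task(block_source, element)
  eval(block_source).call(element)
rescue StandardError, ScriptError => err
  err
end

doubler = 'proc { |v| v * 2 }'   # assumed shape of the string a runner receives

ok      = run_task(doubler, 21)             # => 42
no_meth = run_task(doubler, nil)            # a NoMethodError (nil has no *)
bad_src = run_task('proc { |v| v * }', 21)  # a SyntaxError raised by eval
```

Since the runner evals whatever source lands in the tuple space, anyone who can write to the ring server can run arbitrary code on every runner, so this only makes sense on a network you trust.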

From the shell:

$ sudo gem install RingyDingy
$ sudo gem install ruby2ruby
$ ring_server &
$ ruby mapreduce_runner.rb &
$ ruby mapreduce_runner.rb &

From irb:

> require 'mapreduce_enumerable'
> (1..100).to_a.dmap { |v| v * 2 }
