Liminal Existence

FoWA Miami

Just a quick note that I’ll be giving a workshop on building real-time web applications using Jabber at the Future of Web Apps in Miami, on February 28th. The conference runs from the 28th to the 1st of March, and should be a lot of fun.

We’ve been gradually improving the Jabber stack on Twitter, and we’re now sending millions of messages every day, doing things that just don’t fit into the polling-based world of Atom feeds. There are a ton of extremely awesome things that can be built, and so far we’ve just scratched the surface.

More to come; if I don’t start blogging these things in small pieces, they’ll never come.

Stability

Most of the Twitter team’s work in the weeks leading up to launching Blocks was to ensure that it wouldn’t fall over as soon as we released it. It’s an extremely punishing application, loading 10 timelines on every occasion that someone looks at it. So far, the servers haven’t even noticed.

There have been a number of Twitter hiccups in the past few weeks, but they’ve all been weird, random bugs. Which is not to make excuses, but rather to say that in spite of (very time-consuming) challenges along the way, we’ve been myopically focused on making the site faster and more reliable. As evidence, here’s a graph of page load times, as seen from an external observer:

[Graph: Twitter load times, as monitored by an external observer, over the past month.]

We’re going to keep building a faster and more reliable Twitter. We’re also going to add some awesome new features, and soon. Possibly better than contact search and GMail, even! Finally, we’ll have more visualizations from the Stamen folks. Britt is off to Berlin for RailsConf mid-September. We’ll then have more details about what we’re doing to push Rails and Twitter.

These Are the People …

Folly: “In architecture, a folly is an extravagant, frivolous or fanciful building, designed more for artistic expression than for practicality.” – via Tom Coates, by way of Tom Carden.

We just released Twitter Blocks, a nice little visualisation done by the good folks at Stamen Design. It’s fun! Go play!

Stamen’s recent work highlights the playfulness inherent to Twitter. I can’t wait to release more of these interfaces, and hope that it inspires similar work. Sam Ruby, Tim Bray and others have recently weighed in with their long bets. I’m willing to put down that playfulness — of the sort that Stamen, Schulze & Webb and Jane McGonigal explore and invent daily — is so important to who we are as people that the tech world won’t be able to ignore it for much longer.

Not exactly a risky bet, but too often the tech industry just ignores these things, so there it is, just for kicks.

SELECT * FROM Everything, or Why Databases Are Awesome.

I’ve just committed a patch to ActiveRecord that prevents a large number of very, very bad queries from hitting your database. Go update your code, ASAP.

We’ve made some pretty significant progress towards scaling Twitter, and we’re now at the point where the majority of requests that hit our site complete in less than 70 ms (mostly API requests), and the really complicated front-end pages that we display complete in less than 160 ms. There are still a lot of hiccups, so the average is higher than that, but we’re constantly working on getting it down.

One of the consistent problems we’ve been facing is errant queries. We’ve been seeing (off and on) queries like:

SELECT * FROM statuses WHERE user_id = 234223 ORDER BY created_at

If you know anything about relational databases, this is a very bad thing: with no LIMIT, the query loads every status a user has ever posted, which hurts most when you have users that have more than 20,000 statuses.

One major downside of having an object-relational mapper is that you don’t always control what goes on behind the scenes. In tracking down this problem, first we investigated all our code, and weren’t able to find the source of these problems. Switching tactics, we isolated some test cases that replicated the problem and brought out the big guns: print. This pretty quickly brought us to an obscure corner of the ActiveRecord source (three cheers for source code!), where it became apparent that Rails was doing these gigantic loads from the database every time we saved even a single field in a related object. There are a bunch of mitigating circumstances that mean that this bug doesn’t get triggered all the time, but it’s still really really bad.
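As a rough illustration of the print-style debugging described above, here is a minimal sketch that reports where an unbounded SELECT was issued from. It is not the ActiveRecord patch and does not use ActiveRecord's API: `FakeConnection`, `QueryTracer` and the "no LIMIT" check are all hypothetical names and assumptions made up for the example.

```ruby
# Hypothetical stand-in for a database connection (NOT ActiveRecord's API).
class FakeConnection
  def execute(sql)
    [] # pretend the query ran and returned no rows
  end
end

# Hypothetical tracer: wraps execute and records the call site of any
# SELECT that has no LIMIT clause, then hands off to the real method.
module QueryTracer
  def self.log
    @log ||= []
  end

  def execute(sql)
    if sql =~ /\ASELECT/i && sql !~ /\bLIMIT\b/i
      QueryTracer.log << [sql, caller.first]
      puts "Unbounded query: #{sql}"
      puts "  issued from #{caller.first}"
    end
    super
  end
end

FakeConnection.prepend(QueryTracer)

conn = FakeConnection.new
# Reported: no LIMIT, so every matching row would be loaded.
conn.execute("SELECT * FROM statuses WHERE user_id = 234223 ORDER BY created_at")
# Not reported: bounded to the 20 most recent rows.
conn.execute("SELECT * FROM statuses WHERE user_id = 234223 ORDER BY created_at DESC LIMIT 20")
```

Printing `caller.first` is the useful part: it points at the line that triggered the query, which is the kind of clue that leads back into a library's source.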

Thankfully, the patch has been committed (32 minutes patch-to-commit!), and no-one will have to deal with, as Coda put it: “Arg stabby stab stab stabbity fuck stab” anymore. The fact that no-one noticed really speaks to how freaking awesome relational databases (in our case, MySQL) are these days.

Perhaps underlying all of this is the simple fact that most of the time, ActiveRecord and Rails in general are pretty solid, and Ruby underneath is a fully sound language with which to build high-volume services. Kevin over at PowerSet has more on the topic - they’ve recently announced that they’ll be doing their front-end development in Ruby (up until now, it’s just been a glue language internally).

Twitter at RailsConf

Sadly, I won’t be attending RailsConf in Portland this weekend. I’ll be finishing up at XTech, followed by a (well-deserved) week relaxing / adventuring in Morocco. Photos will follow.

In the meantime, if you’re one of the lucky many attending RailsConf, Alex and Britt will be speaking at 10:45 AM on Sunday. They may or may not be doing a reprise of my Scaling Twitter talk, but I’m sure it will be fantastic in any event.

Wish I were there, have fun all!

Social Software for Robots Slides Up

I’ve uploaded the Social Software for Robots slides for the messaging/jabber talk that Kellan and I gave yesterday at XTech 2007.

I think the talk went well, aside from a hiccup getting all the stars aligned to show the visualization that Tom helped us put together. For those who missed it, I’m hoping that Kellan or I or both will give reprises at other upcoming conferences. There’s a lot of potential here, and things are finally reaching the point where real-time APIs are not only becoming a reality, but a necessity.

XTech has been great fun so far, and there are a number of talks that I’m looking forward to today and tomorrow. Huge thanks to Edd and everyone else who’s helped put it together.

Scaling Twitter, the Talk.

Simon Willison linked to an interview with Alex Payne, one of my co-workers on Twitter. This caused a bit of a stir, so apparently there’s some interest in our experience scaling Twitter, and Rails.

We’ve been extremely happy with Rails, and make use of the multitude of helpers that it offers us. Like any application on any stack, though, providing fast response times to a (rapidly) growing number of users is a challenge. The solutions are often tightly coupled to the application and its characteristics, and while scaling the most trafficked Rails site in the world, we’ve run into situations where existing solutions weren’t enough.

This process has led us to build a number of tools that help us deal with our load, and just as soon as we find some spare time, we’ll be releasing many of them. In the meantime, you can find out first what sorts of challenges we’ve encountered and solutions we’ve come up with at my talk at the SDForum Silicon Valley Ruby Conference next weekend (April 21-22nd).

I’ll be focusing on ActiveRecord and database optimization, caching, and of course, Messaging. I’ll also touch on some areas where we haven’t had great successes (yet), and hopefully someone from the audience will shout out that there’s some totally obvious and awesome thing that we haven’t thought of, and it’ll save us weeks of work (no, I’m serious. Does someone want to take bets?).

MapReduce in 36 Lines of Ruby

This has been burning a hole in my head since August, after Joel’s post made it blindingly obvious that Ruby is the perfect language for distributed programming. I have some code that properly implements partitioning, etc., but never got around to finishing it sufficiently for a proper release. Here’s the core idea; if anyone wants the partitioning code, ping me at romeda@gmail.com.

mapreduce_enumerable.rb:

require 'rubygems'
require 'ringy_dingy'
require 'ruby2ruby'

module Enumerable
  def dmap(&block)
    self.each_with_index do |element,idx|
      ring_server.write([:dmap, Process.pid, block.to_ruby, element, idx])
    end

    results = []
    self.size.times do # one take per task; results[idx] = ... can grow the array early
      result, idx = ring_server.take([:dmap, Process.pid, nil, nil]).last(2)
      results[idx] = result
    end

    results
  end

  def ring_server
    return @ring_server if @ring_server

    ringy_dingy = RingyDingy.new nil
    @ring_server = ringy_dingy.ring_server
  end
end
mapreduce_runner.rb:

require 'rubygems'
require 'ruby2ruby'
require 'ringy_dingy'

ringy_dingy = RingyDingy.new nil
ring_server = ringy_dingy.ring_server

loop do
  pid, block, element, idx = ring_server.take([:dmap, nil, nil, nil, nil]).last(4)
  begin
    result = eval(block).call(element)
  rescue Object => err
    result = err
  end
  puts "Got #{result} from #{element} for #{pid}."
  ring_server.write([:dmap, pid, result, idx])
end
From the shell:

$ sudo gem install RingyDingy
$ sudo gem install ruby2ruby
$ ring_server &
$ ruby mapreduce_runner.rb &
$ ruby mapreduce_runner.rb &
From irb:

> require 'mapreduce_enumerable'
> (1..100).to_a.dmap { |v| v * 2 }
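
For anyone who wants to see the tuple flow without setting up a ring server, here is a minimal in-process sketch of the same idea. It makes several assumptions that differ from the code above: it uses `Rinda::TupleSpace` directly instead of RingyDingy, threads instead of separate runner processes, and passes the block itself rather than serialising it with ruby2ruby. `local_dmap` and the `job` identifier are hypothetical names for the example, and the final `inject` stands in for a reduce step.

```ruby
require 'rinda/tuplespace'

# In-process stand-in for the ring server (assumption: no network, no RingyDingy).
space = Rinda::TupleSpace.new

# Two worker threads play the role of the mapreduce_runner.rb processes.
workers = Array.new(2) do
  Thread.new do
    loop do
      _tag, job, block, element, idx = space.take([:dmap, nil, nil, nil, nil])
      begin
        result = block.call(element)
      rescue StandardError => err
        result = err
      end
      space.write([:dmap, job, result, idx])
    end
  end
end

# Hypothetical helper mirroring dmap: write one task tuple per element,
# then take exactly one result tuple per task and slot it in by index.
def local_dmap(space, items, &block)
  job = rand(1_000_000) # stands in for Process.pid as the job identifier
  items.each_with_index do |element, idx|
    space.write([:dmap, job, block, element, idx])
  end

  results = Array.new(items.size)
  items.size.times do
    result, idx = space.take([:dmap, job, nil, nil]).last(2)
    results[idx] = result
  end
  results
end

doubled = local_dmap(space, (1..10).to_a) { |v| v * 2 }
total = doubled.inject(0) { |sum, v| sum + v } # the "reduce" half
puts doubled.inspect
puts total # 110

workers.each(&:kill)
```

Task tuples have five elements and result tuples have four, so the workers and the collector never take each other's tuples even though both use the `:dmap` tag.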

The Year in Cities, 2006.

Here’s the list of cities I stayed in during the course of 2006. Memes away. I’ve only included cities in which I spent at least one day and one night. I’ve preemptively added a few spots I’ll be visiting over Christmas, too.

San Francisco, CA
Fresno (FresYES!), CA
Tahoe, CA/NV
Seattle, WA
Vancouver, BC
White Rock, BC
Victoria, BC
Whistler, BC
Reykjavik, Iceland
Ólafsvík, Iceland
Belfast, Northern Ireland
Aberystwyth, Wales
Norwich, East Anglia
London, England
Lewes (Brighton & Hove), East Sussex
Barcelona, Spain
Portland, OR
Grande Prairie, AB
Santa Cruz, CA
Mojave Desert & Las Vegas, CA & NV
Zion National Monument, UT
Tucson, AZ
[Near] Calistoga, CA
Chilliwack, BC

All in all, pretty good, though I’m hoping for maybe somewhere less European for 2007. We’ll see.