Buckblog: Inside Capistrano: the Gateway implementation

26 September 2006 — 4-minute read

For those arriving late to the party, Capistrano is a utility for executing commands in parallel on multiple remote hosts. You can read all about it in the Capistrano manual.

Most Capistrano users have probably never needed to use its gateway feature. I find that vaguely ironic, since it was one of the features in Capistrano that were on the original list of requirements when I sat down to code it all up.

Basically, what the gateway lets you do is tunnel your connections through a single computer. This lets you connect to computers that are behind a firewall, or on a VPN. We use this feature all the time at 37signals, since the bulk of our cluster is not directly accessible via the Internet.

# specify the gateway server
set :gateway, "gateway.server.com"

# the following servers are behind a firewall and
# cannot be accessed directly
role :app, "app.server.com"
role :web, "web.server.com"
role :db,  "db.server.com", :primary => true

The gateway code is barely 100 lines long, including comments. All it does is establish a connection to the gateway machine; then, for every connection requested through the gateway, it forwards a port on the local host to the requested server and establishes an SSH connection to that server via the forwarded port. It makes heavy use of threads to accomplish this, and it is one of the places that helped iron out several synchronization issues in Net::SSH. In fact, the code is a good showcase of what you can do with forwarded ports in Net::SSH.

So, let’s take it all apart and walk through the code, beginning with the initialize method. (For those of you who want to follow along, the file in question is capistrano/gateway.rb.)

def initialize(server, config)
  @config = config
  @next_port = MAX_PORT
  @terminate_thread = false
  @port_guard = Mutex.new

  mutex = Mutex.new
  waiter = ConditionVariable.new

  @thread = Thread.new do
    SSH.connect(server, @config) do |session|
      # Ruby 1.9 and later no longer accept an instance variable as a block parameter
      @session = session
      mutex.synchronize { waiter.signal }
      @session.loop { !@terminate_thread }
    end
  end

  mutex.synchronize { waiter.wait(mutex) }
end

(In the interest of keeping things compact, I’ve removed the comments and the lines related to logging.)

The meat of this is the Thread.new statement in the middle. All it does is establish the gateway’s SSH connection. (The @config instance variable is a Capistrano::Configuration instance, from which various SSH options are pulled, including the user, password, port, etc.) Once the connection is live, the block is called, and we signal the “waiter” (the condition variable). This wakes up the calling thread, which is blocked in the wait call that follows the thread. The gateway thread then enters the session loop, which runs until it is asked to terminate (via the @terminate_thread instance variable).
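
The startup handshake can be tried in isolation with nothing but Ruby's standard library. What follows is only a rough sketch, not Capistrano's code: start_worker, the ready flag, and the stop lambda are hypothetical names, and a short sleep stands in for the SSH connection. The ready flag is an addition that the excerpt above does not have; it is there so the caller cannot miss a signal that fires before it starts waiting.

```ruby
# Hypothetical stand-in for the gateway's startup handshake.
# Nothing here talks to SSH; sleep simulates the connection delay.
def start_worker
  mutex  = Mutex.new
  waiter = ConditionVariable.new
  ready  = false   # assumption: guards against a signal sent before the wait
  stop   = false
  log    = []

  thread = Thread.new do
    sleep 0.05                      # pretend to establish the connection
    mutex.synchronize do
      log << :connected
      ready = true
      waiter.signal                 # wake the caller
    end
    sleep 0.01 until stop           # stand-in for the session loop
    log << :terminated
  end

  # Block the caller until the worker reports that it is connected.
  mutex.synchronize { waiter.wait(mutex) until ready }
  log << :caller_resumed

  [thread, log, -> { stop = true }]
end

thread, log, stopper = start_worker
stopper.call
thread.join
p log
```

The caller only resumes after the worker has recorded its connection, which is the ordering the gateway's initialize method relies on.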

Note that SSH.connect is another Capistrano abstraction that basically wraps the lower-level Net::SSH.start. There’s not much to it; you can read the entire thing in capistrano/ssh.rb.

Once the gateway connection is live, other connections may be established through it by calling the connect_to method, passing in a string that names the target server.

def connect_to(server)
  connection = nil
  port = next_port

  thread = Thread.new do
    begin
      @session.forward.local(port, server, 22)
      connection = SSH.connect('127.0.0.1', @config, port)
    rescue Errno::EADDRINUSE
      port = next_port
      retry
    rescue Exception => e
      puts e.class.name
      puts e.backtrace.join("\n")
    end
  end

  thread.join
  connection or raise "Could not establish connection to #{server}"
end

def next_port
  @port_guard.synchronize do
    port = @next_port
    @next_port -= 1
    @next_port = MAX_PORT if @next_port < MIN_PORT
    port
  end
end

For this bit, we first get the next (possibly) available port on the local host. Then, in a thread, we start a forwarded port from the local host to the remote host, and try to establish an SSH connection through it. If the port turns out to be in use, we grab the next port and try again.
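
The port bookkeeping and the retry-on-collision idea can also be exercised without SSH. This is a sketch under stated assumptions: PortPool and listen_on_free_port are hypothetical helpers, the port bounds are arbitrary values picked for illustration (the excerpt does not show what MIN_PORT and MAX_PORT are), and a plain TCPServer stands in for the forwarded port.

```ruby
require 'socket'

# Hypothetical helper, not part of Capistrano: hands out local ports from
# max down to min and wraps around, mirroring the next_port method above.
class PortPool
  def initialize(min, max)
    @min = min
    @max = max
    @next_port = max
    @guard = Mutex.new
  end

  def next_port
    @guard.synchronize do
      port = @next_port
      @next_port -= 1
      @next_port = @max if @next_port < @min
      port
    end
  end
end

# Try to listen on a port from the pool, moving on to the next one when
# the port is already taken. TCPServer stands in for the forwarded port.
# Like the original, this keeps retrying for as long as ports are busy.
def listen_on_free_port(pool)
  port = pool.next_port
  begin
    [TCPServer.new('127.0.0.1', port), port]
  rescue Errno::EADDRINUSE
    port = pool.next_port
    retry
  end
end

# Arbitrary bounds, chosen only to show the wrap-around.
pool = PortPool.new(50_000, 50_002)
p Array.new(4) { pool.next_port }   # => [50002, 50001, 50000, 50002]
```

Calling listen_on_free_port with a pool whose first port is already bound returns a server on a lower port instead of raising, which is the same recovery the rescue clause in connect_to performs.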

And that’s it, really.

There isn’t that much to the gateway implementation, but we like it that way. It is one of the most critical parts of Capistrano for us at 37signals, and the current implementation is both simpler than before (compare it to the version in Capistrano 1.1) and more robust. You could even conceivably use the gateway code directly in your own scripts, if you ever needed to connect to one or more hosts through a forwarded port. Something like this:

require 'capistrano/gateway'
require 'capistrano/logger'

# First, we create a config object that quacks,
# mostly, like a Capistrano::Configuration object.
config = Struct.new(:user, :password, :ssh_options,
  :logger).new
config.user = "username"
config.password = "password"
config.ssh_options = {}
config.logger = Capistrano::Logger.new(
  :output => "/dev/null")

# Connect to the Gateway...
gateway = Capistrano::Gateway.new("gateway", config)

# Establish a connection to an internal machine via 
# the gateway
host = gateway.connect_to("internal")

# "host" is now an SSH session object. We can
# manipulate it using the Net::SSH API.
host.open_channel do |ch|
  ch.on_data do |ch, data|
    puts(data)
  end

  ch.exec "hostname"
end

host.loop

For more information on Net::SSH, you can tackle the Net::SSH documentation. It tries to be fairly comprehensive.

This is the first in what I hope will become a series of articles, detailing various internals of Capistrano. If there are any specific aspects of Capistrano you’d like discussed, feel free to leave your vote in the comments.

Reader Comments

This is great! Many thanks James.

Mark Orr
26 Sep 2006

I came across your site via Technorati. I've immediately added it to my list of feeds, as I am a huge fan of the entire 37signals team and your contributions to the web. The reason I'm posting is because I'm fairly new to deployment with Capistrano. I believe I have everything set up correctly, but after deployment, my site just hangs there (it times out and FastCGI processes never start). It seems as though it's an environment issue, but I can't be so sure of that. Do you have any suggestions for this problem? You can leave a comment on my site or here as well.

Ryan
26 Sep 2006

Ryan, welcome! You might want to consider joining the Capistrano mailing list and seeing if anyone can shed any light on it there. Either go to http://groups.google.com/group/capistrano, or send an email to [email protected].

Jamis
26 Sep 2006

Hey man... Nice blog! Are you using any mephisto-plugin or liquid-tag to show these code samples highlighted?

ArthurGeek
27 Sep 2006

ArthurGeek, it's actually a feature of typo, which mephisto also implements. You use the typo:code tag, and specify the lang attribute for the type of the content. It's hard to actually demonstrate in a comment, though, because it'll get sucked in and interpreted!

Jamis
27 Sep 2006

The Buckblog

assorted ramblings by Jamis Buck
