Buckblog: Net::SSH Refactoring Adventure

9 October 2004 — 4-minute read

This article is just the first in a series describing the steps I am going through to refactor Net::SSH to use dependency injection. I’m currently using the DI framework code-named Syringe to do the dirty work. (Note: that link for Syringe is temporary only, and may not be valid if you are reading this article more than a few weeks after it was posted.)

As I mentioned in On the Road to a New DI Container, I had already started refactoring Net::SSH to use Copland. Thus, most of the transport layer was already refactored.

The challenge, therefore, was to take what worked in Copland, and find the “right” way to do it with Syringe.

Original Approach—Pre-Refactoring

Let me begin by describing the approach that is currently used in Net::SSH 0.1. To keep things comprehensible, I’ll focus just on one small part of the transport layer, the part that manages the HMAC algorithms.

The original implementation puts all the HMAC logic in a single file: net/ssh/transport/mac.rb. This defines two classes: HMACFactory, and HMAC.

The HMAC class is a wrapper for the actual OpenSSL HMAC implementations. It also handles the case where no HMAC algorithm has been selected, by returning an empty string for the requested HMAC digest.

The HMACFactory has a hard-coded case statement of all known OpenSSL HMAC implementations that are simultaneously supported by the SSH2 specification. It allows a client to specify an SSH2 HMAC algorithm name, and a key, and have the corresponding HMAC instance returned to them.

This has three drawbacks:

It is closely coupled to OpenSSL, making it impossible to swap a new cryptography backend into Net::SSH.
It hard-codes the required digest and key lengths, and the digest implementation for each algorithm.
It does not easily allow new HMAC algorithms to be added to the factory.
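
To make those drawbacks concrete, here is a rough sketch of what a factory of this shape looks like. It is a reconstruction for illustration only, not the actual Net::SSH 0.1 source: the method names (get, digest) and constructor arguments are assumptions, while the key and digest lengths are the ones the SSH2 specification assigns to each algorithm.

```ruby
require 'openssl'

# Hypothetical reconstruction; names and signatures are assumptions.
class HMAC
  attr_reader :key_length, :mac_length

  def initialize(digest_name, key, key_length, mac_length)
    @digest_name = digest_name # nil means "no HMAC algorithm selected"
    @key         = key
    @key_length  = key_length
    @mac_length  = mac_length
  end

  # Returns the (possibly truncated) MAC, or "" when no algorithm is selected.
  def digest(data)
    return "" if @digest_name.nil?
    full = OpenSSL::HMAC.digest(OpenSSL::Digest.new(@digest_name),
                                @key[0, @key_length], data)
    full[0, @mac_length]
  end
end

class HMACFactory
  # Every algorithm, its digest, and its lengths are hard-coded right here,
  # and everything is tied directly to OpenSSL.
  def get(name, key = "")
    case name
    when "hmac-sha1"    then HMAC.new("SHA1", key, 20, 20)
    when "hmac-sha1-96" then HMAC.new("SHA1", key, 20, 12)
    when "hmac-md5"     then HMAC.new("MD5",  key, 16, 16)
    when "hmac-md5-96"  then HMAC.new("MD5",  key, 16, 12)
    when "none"         then HMAC.new(nil,    key, 0,  0)
    else raise ArgumentError, "unsupported HMAC algorithm: #{name}"
    end
  end
end
```

Adding an algorithm to a factory like this means editing the case statement, which is exactly the third drawback.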

Refactoring for Copland

Allowing Multiple Cryptography Backends

The first step involved adding another layer of abstraction, so that the cryptography backend (OpenSSL) could be swapped in and out. This was pretty simple—I just created a new package, “net.ssh.transport.ossl”, that defined a set of configuration points (hashes, basically), and factories.

There was one configuration point for each type of entity: keys, digests, ciphers, and yes, HMAC algorithms.

Each configuration point was fed as a dependency into a corresponding factory that knew how to query the configuration point, then initialize and return the requested service.
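
The relationship between a configuration point and its factory can be sketched in plain Ruby, without any container at all. Everything here is hypothetical (AlgorithmFactory, NullHMAC, ExampleMAC, and the "example-mac" name are made up for the illustration, and ExampleMAC is not a real MAC); the point is only that the factory receives the hash as a dependency and knows nothing about which entries it contains.

```ruby
# Hypothetical sketch: a "configuration point" modeled as a plain Hash whose
# values are callables that build the requested service on demand.
class AlgorithmFactory
  def initialize(config_point)
    @config_point = config_point # injected dependency
  end

  def get(name, *args)
    builder = @config_point.fetch(name) do
      raise ArgumentError, "unknown algorithm: #{name}"
    end
    builder.call(*args)
  end
end

# Stand-in service for the "none" algorithm.
NullHMAC = Struct.new(:key) do
  def digest(_data)
    ""
  end
end

hmac_algorithms = { "none" => ->(key) { NullHMAC.new(key) } }
factory = AlgorithmFactory.new(hmac_algorithms)

# A third party can contribute a new entry without touching the factory.
# ExampleMAC is a placeholder, not a real MAC algorithm.
ExampleMAC = Struct.new(:key) do
  def digest(data)
    "#{key}:#{data.length}"
  end
end
hmac_algorithms["example-mac"] = ->(key) { ExampleMAC.new(key) }
```

Calling factory.get("example-mac", "secret") now returns an ExampleMAC, even though the factory was written before that entry existed.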

The HMAC package

I refactored this by breaking it into one (trivial) class for each HMAC implementation, instead of one class that can wrap any implementation. Each class, then, knows how long its digest and key need to be, thus decentralizing that information. (This resulted in six new classes: one each for SHA1, SHA1-96, MD5, MD5-96, and None, plus one abstract parent class from which each of them descends.)
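
A minimal sketch of that class layout might look like the following. The module name HMACSketch, the describe helper, and the key= and digest methods are assumptions made for the example, not the real Net::SSH classes; the lengths are the ones the SSH2 specification gives for each algorithm.

```ruby
require 'openssl'

# Hypothetical sketch of "one trivial class per HMAC implementation".
module HMACSketch
  # Abstract parent: each subclass declares its own digest and lengths.
  class Abstract
    class << self
      attr_reader :digest_name, :key_length, :mac_length

      def describe(digest_name, key_length, mac_length)
        @digest_name = digest_name
        @key_length  = key_length
        @mac_length  = mac_length
      end
    end

    def initialize
      @key = ""
    end

    # Keep only as many key bytes as this algorithm requires.
    def key=(value)
      @key = value[0, self.class.key_length]
    end

    # Compute the MAC and truncate it to this algorithm's digest length.
    def digest(data)
      full = OpenSSL::HMAC.digest(
        OpenSSL::Digest.new(self.class.digest_name), @key, data)
      full[0, self.class.mac_length]
    end
  end

  class SHA1 < Abstract;    describe "SHA1", 20, 20; end
  class SHA1_96 < Abstract; describe "SHA1", 20, 12; end
  class MD5 < Abstract;     describe "MD5",  16, 16; end
  class MD5_96 < Abstract;  describe "MD5",  16, 12; end

  # No algorithm selected: the digest is always the empty string.
  class None < Abstract
    describe nil, 0, 0

    def digest(_data)
      ""
    end
  end
end
```

Each subclass is a single line of declaration, so the knowledge about lengths lives with the algorithm rather than in a central case statement.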

Then, I defined a package.yml file that described each new class (except the parent class) as a service.

Each of the five services in this package was then fed into the HMACAlgorithms configuration point of “net.ssh.transport.ossl”.

Summary of this Step

What did all that buy us, besides multiplying the number of classes? Well:

It made it MUCH easier to add new HMAC algorithms in the future. In fact, third parties can now add new HMAC algorithms without even having to touch the existing Net::SSH code!
It made it possible to swap the OpenSSL cryptography backend out and replace it with something else. (Not that there really IS anything else at the present, but I know of at least one other endeavor to create a cryptography library in Ruby.)

Refactoring for Syringe

There really wasn’t any refactoring to do at this stage, since it was all done in Copland. All that was left was to find the right way to define the services using Syringe. It turned out to be much more straightforward than Copland (and I believe that Copland’s approach is pretty darn straightforward!).

Disclaimer

The Syringe API is still being researched, and is liable to change. The approach described here will probably not be valid in the future, and is given only as an example of how a Ruby-based DI container might represent services in a complex system.

Defining Services

The package.yml file for the Copland-based implementation of the HMAC package is 40 lines (including whitespace and description elements). The package.rb file for the Syringe version is less than half that long—18 lines.

Here are the contents of package.rb:

  require 'syringe'

  # Defines the services that implement the various SSH2 HMAC algorithms
  # that are supported by OpenSSL.
  Syringe.register_library_namespace( 'net/ssh/transport/ossl/hmac', :hmac) do |space|

    %w{sha1 sha1-96 md5 md5-96 none}.each do |name|
      space.register( name.sub(/-/, "_").intern ) do
        require "net/ssh/transport/ossl/hmac/#{name}"
        Net::SSH::Transport::OSSL::HMAC.const_get( name.upcase.sub(/-/, "_").intern ).new
      end
    end

    if space.knows_key?( :hmac_algorithm_sources )
      space.hmac_algorithm_sources << space
    end

  end

See the line that invokes register_library_namespace? That creates a callback in the Syringe system, so that when a container is sent the require message with the given path as a parameter, this callback will be invoked. A new namespace (called, in this case, :hmac) will be added to that calling container, and passed to the block.
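
The callback mechanism can be modeled with a toy registry. This is not how Syringe is implemented; ToyRegistry, ToyContainer, and the 'toy/hmac' path are hypothetical, and the method is called require_library here (rather than require) only so the sketch does not shadow Kernel#require.

```ruby
# Toy model only; none of these names come from Syringe itself.
module ToyRegistry
  CALLBACKS = {}

  # Remember which namespace name and block belong to a library path.
  def self.register_library_namespace(path, name, &block)
    CALLBACKS[path] = [name, block]
  end
end

class ToyContainer
  attr_reader :namespaces, :services

  def initialize
    @namespaces = {}
    @services   = {}
  end

  def register(name, &block)
    @services[name] = block
  end

  # Look up the callback for the path, add a new namespace to this
  # container, and hand that namespace to the callback.
  def require_library(path)
    name, block = ToyRegistry::CALLBACKS.fetch(path)
    space = (@namespaces[name] ||= ToyContainer.new)
    block.call(space)
    space
  end
end

ToyRegistry.register_library_namespace('toy/hmac', :hmac) do |space|
  space.register(:none) { "" }
end

container = ToyContainer.new
container.require_library('toy/hmac')
```

After the last line, container.namespaces[:hmac] holds the new namespace, and the :none service has been registered inside it.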

The block programmatically registers all known HMAC implementations with the namespace. (Note that part of the construction process is to require the appropriate file—this reduces runtime overhead by only requiring the files for services that are actually used.)
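
Here is a small sketch of why registering a block defers the work. LazyContainer is a hypothetical stand-in, not Syringe's API, and caching the result after the first request is an assumption of the sketch. The loaded array stands in for the require call, and the block returns the constant name that the real code would pass to const_get.

```ruby
# Hypothetical stand-in for a container that builds services on first use.
class LazyContainer
  def initialize
    @blocks    = {}
    @instances = {}
  end

  # Store the block; nothing is built yet.
  def register(name, &block)
    @blocks[name] = block
  end

  # Build the service the first time it is requested, then reuse it.
  def [](name)
    @instances.fetch(name) { @instances[name] = @blocks.fetch(name).call }
  end
end

loaded = []
space = LazyContainer.new

%w{sha1 sha1-96 md5 md5-96 none}.each do |name|
  # "sha1-96" is registered under the service name :sha1_96 ...
  space.register(name.sub(/-/, "_").intern) do
    loaded << name # stands in for requiring the implementation file
    # ... and maps to the constant name :SHA1_96.
    name.upcase.sub(/-/, "_").intern
  end
end
```

Until space[:sha1_96] is called, loaded stays empty; afterwards it contains only "sha1-96", so the other four files would never have been required.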

Lastly, note the knows_key? call. Every container responds to both has_key? and knows_key?. If sent the has_key? message, the container will return true if the named service is contained within itself, and false otherwise. The knows_key? message, on the other hand, returns true if the service exists in the container, or any of its ancestor namespaces, and false otherwise. Thus, the call shown above will add the current namespace to a collection of namespaces called hmac_algorithm_sources, provided that collection exists somewhere in the hierarchy of namespaces above the current namespace (“space”).
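
The difference between the two messages is easy to model with a container that keeps a reference to its parent. ScopedContainer and its register and lookup methods are hypothetical names for this sketch; only has_key? and knows_key? follow the semantics described above.

```ruby
# Hypothetical model of nested namespaces; not Syringe's implementation.
class ScopedContainer
  def initialize(parent = nil)
    @parent   = parent
    @services = {}
  end

  def register(name, value)
    @services[name] = value
  end

  # True only if the service is registered in this container itself.
  def has_key?(name)
    @services.key?(name)
  end

  # True if the service is registered here or in any ancestor.
  def knows_key?(name)
    has_key?(name) || (!@parent.nil? && @parent.knows_key?(name))
  end

  # Resolve a service, walking up through the ancestors if needed.
  def lookup(name)
    return @services[name] if has_key?(name)
    raise KeyError, "unknown service: #{name}" if @parent.nil?
    @parent.lookup(name)
  end
end

root = ScopedContainer.new
root.register(:hmac_algorithm_sources, [])
space = ScopedContainer.new(root)

# Mirrors the guard in package.rb: only contribute if an ancestor
# actually defines the collection.
if space.knows_key?(:hmac_algorithm_sources)
  space.lookup(:hmac_algorithm_sources) << space
end
```

Here space.has_key?(:hmac_algorithm_sources) is false, because the collection lives in the root, while knows_key? is true, so the namespace adds itself to the root's collection.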

The Buckblog

assorted ramblings by Jamis Buck