Getting Started with Rubinius II: Coding!

posted by crafterm, 11 October 2007

In a previous article we checked out Rubinius, built it from source and discussed its directory structure. In this article, we’ll take it one step further and implement an example method that can be contributed back to the project as a patch for inclusion in the official Rubinius source base.

If you haven’t checked out or built Rubinius please see my previous post which details the preliminary steps required before we can start implementing.

The feature we’ll implement is the File.link method. The implementation is quite simple, only a few lines of code, but it will take us through the process of adding a method to an existing class with an existing spec, and will also take us into the system call layer where we’ll interact with the underlying operating system to create a hard link.

In this case it’s not required, but it’s generally a good idea to run the dev:setup rake task before implementation to ensure that we have pristine copies of our runtime archives available. We do this because the compiler itself requires that the runtime archives work, and if we introduce a defect it’s possible to end up in a situation where we cannot compile a fix.

dev:setup essentially makes a backup of the runtime archives that will always be used for compilation. In our particular case the compiler doesn’t create any hard links so this step is optional, but it’s a good idea to run it if you’re working on existing code or on low level classes and methods such as File.stat, Hash, Array, etc.

Normally when using git we would create a feature branch, implement our specs and changes on that branch, commit it locally and then rebase the source code off the master branch before pushing it to the main repository (this is how Rubinius committers integrate their work into the main line development). In this article we’ll omit these stages as they’re well documented on the Rubinius project pages, and here we want to focus on the changes to be made to Rubinius itself.

Specification

Back to our new feature – a spec already exists for File.link and it’s in the spec/core/file/link_spec.rb file:

require File.dirname(__FILE__) + '/../../spec_helper'

describe "File.link" do
  before do 
    @file = "test.txt"
    @link = "test.lnk"     
    File.delete(@link) if File.exists?(@link)
    File.delete(@file) if File.exists?(@file)
    File.open(@file, "w+")
  end

  platform :not, :mswin do
    it "link a file with another" do
      File.link(@file, @link).should == 0
      File.exists?(@link).should == true
      File.identical?(@file, @link).should == true
    end

    it "raise an exception if the target already exists" do
      File.link(@file, @link)
      should_raise(Errno::EEXIST) { File.link(@file, @link) }
    end

    it "raise an exception if the arguments are of the wrong type or are of the incorrect number" do
      should_raise(ArgumentError) { File.link }
      should_raise(ArgumentError) { File.link(@file) }
    end
  end

  after do
    File.delete(@link)
    File.delete(@file)
  end
end

The core specification suite is laid out in the spec/core directory using the convention of having a spec file per method on each class, containing all behaviour for that corresponding method. Platform and bootstrap specs are in the spec/platform and spec/bootstrap directories respectively.

Examining the specification above, there are three tests that are run on all non-mswin platforms (i.e. those supporting the creation of hard links via link(2)). The tests ensure that when called, File.link creates a hard link between the source and target, or raises an exception either if the target already exists or if it’s given incorrect arguments.

This identifies what we need to implement.

Let’s run the spec to see what’s failing:

$> bin/mspec -f s spec/core/file/link_spec.rb

File.link
- link a file with another  (ERROR - 1)
- raise an exception if the target already exists (ERROR - 2)
- raise an exception if the arguments are of the wrong type or are of the incorrect number (ERROR - 3)


1)
File.link link a file with another  FAILED
No method 'link' on an instance of Class.: 
    Object(Class)#link (method_missing) at kernel/core/object.rb:98
                        main.__script__ at spec/core/file/link_spec.rb:14
                              Proc#call at kernel/core/context.rb:262
                          SpecRunner#it at spec/mini_rspec.rb:337
                                main.it at spec/mini_rspec.rb:369
                        main.__script__ at spec/core/file/link_spec.rb:24
                          main.platform at ./spec/core/file/../../spec_helper.rb:96
                        main.__script__ at spec/core/file/link_spec.rb:30
                              Proc#call at kernel/core/context.rb:262
                    SpecRunner#describe at spec/mini_rspec.rb:347
                          main.describe at spec/mini_rspec.rb:365
                        main.__script__ at spec/core/file/link_spec.rb:3
                              main.load at kernel/core/compile.rb:78
                   main.__eval_script__ at (eval):8
                             Array#each at kernel/core/array.rb:526
                  Integer(Fixnum)#times at kernel/core/integer.rb:19
                             Array#each at kernel/core/array.rb:526
                   main.__eval_script__ at (eval):5
                CompiledMethod#activate at kernel/core/compiled_method.rb:110
                        Compile.execute at kernel/core/compile.rb:34
                        main.__script__ at kernel/loader.rb:170
..snip..
$> 

In the stack trace, the message:

No method 'link' on an instance of Class.

indicates that File.link doesn’t even exist inside the current implementation of File.

Design

The corresponding source file to implement File.link is in kernel/core/file.rb:

# depends on: io.rb

class File < IO
  ..snip..

  def self.new(path, mode)
    return open_with_mode(path, mode)
  end

  def self.open(path, mode="r")
    raise Errno::ENOENT if mode == "r" and not exists?(path)

    f = open_with_mode(path, mode)
    return f unless block_given?

    begin
      yield f
    ensure
      f.close unless f.closed?
    end
  end

  def self.exist?(path)
    out = Stat.stat(path, true)
    if out.kind_of? Stat
      return true
    else
      return false
    end
  end

  def self.file?(path)
    st = Stat.stat(path, true)
    return false unless st.kind_of? Stat
    st.kind == :file
  end

  ..snip..
end

Here we see methods implementing various parts of the File API. The above methods show the implementation of File.new, File.open, File.exist? and File.file? (to compare MRI’s implementation of the above methods check the file.c source file in the Ruby tar.gz source archive).

Let’s look at a first implementation of File.link. The primary behaviour of File.link is to create a hard link between two filenames. To do this we need to invoke the link(2) system call on the underlying operating system to create the link.

A quick examination of the link(2) man page yields:

$> man 2 link

LINK(2)             BSD System Calls Manual            LINK(2)

NAME
     link -- make a hard file link

SYNOPSIS
     #include <unistd.h>

     int
     link(const char *name1, const char *name2);

DESCRIPTION
     The link() function call atomically creates the specified directory entry
     (hard link) name2 with the attributes of the underlying object pointed at
     by name1.  If the link is successful: the link count of the underlying
     object is incremented; name1 and name2 share equal access and rights to
     the underlying object.

     ..snip..

RETURN VALUES
     Upon successful completion, a value of 0 is returned.  Otherwise, a value
     of -1 is returned and errno is set to indicate the error.

     ..snip..

STANDARDS
     The link() function is expected to conform to IEEE Std 1003.1-1988
     (``POSIX.1'').

According to the man page, link(2) accepts the existing name and the new name of the hard link as parameters, and returns an integer indicating success or failure.
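To get a feel for the behaviour we’re aiming at, here’s a small sketch that exercises File.link as it ships with MRI (not Rubinius). The helper name hard_link_demo and the temporary file names are made up for illustration, and it assumes a POSIX platform whose temp directory supports hard links.

```ruby
require "tmpdir"

# Exercises MRI's File.link inside a throwaway directory and reports what
# happened. hard_link_demo is a hypothetical helper, not part of Rubinius.
def hard_link_demo
  Dir.mktmpdir do |dir|
    source = File.join(dir, "test.txt")
    target = File.join(dir, "test.lnk")
    File.open(source, "w") { |f| f.write("hello") }

    result = File.link(source, target)        # wraps link(2)
    same   = File.identical?(source, target)  # both names refer to one file
    count  = File.stat(source).nlink          # link count of that file

    # A second attempt should fail because the target name is taken.
    error = begin
      File.link(source, target)
      nil
    rescue SystemCallError => e
      e.class
    end

    { :result => result, :identical => same, :nlink => count, :error => error }
  end
end

p hard_link_demo
```

On a typical POSIX system I’d expect the link count to rise to 2 after the link and the second call to raise Errno::EEXIST, which matches what the spec asks of our implementation.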

FFI

To invoke link(2) we need to add a new method to the ffi layer inside of Rubinius. ffi stands for ‘foreign function interface’, and it’s a really neat way of being able to interact with system calls on the underlying operating system without needing to write a lot of stub or native integration code.

ffi bindings are compiled into the platform.rba archive, and since link(2) conforms to a POSIX standard the file we need to modify is kernel/platform/posix.rb.

Opening kernel/platform/posix.rb we’ll see blocks of code such as the following inside the Platform::POSIX module:

# file system
attach_function nil, 'access', [:string, :int], :int
attach_function nil, 'chmod',  [:string, :int], :int
attach_function nil, 'fchmod', [:int, :int], :int
attach_function nil, 'unlink', [:string], :int
attach_function nil, 'getcwd', [:string, :int], :string
attach_function nil, 'umask', [:int], :int

This code dynamically attaches methods to the module, and specifies the parameter types and return values of each method.

The general format of the ‘attach_function’ method is as follows:

attach_function library, method name, [ parameters ], return value
  • library, library name to load dynamically, nil otherwise
  • name, name of the method to attach, this is also the name the method will be available as inside the module
  • parameters, array of symbols identifying the types this method accepts as parameters
  • return value, type of the return value

(attach_function can also accept several other formats of parameters, please take a closer look at kernel/platform/ffi.rb for more details)

Symbols are defined for most primitive types, e.g. :short, :int, :long, :string, :char, etc, which can be used in the parameter list and return value specifier.

Following the examples above, link(2) can be attached to the Platform::POSIX module with one line of code:

attach_function nil, 'link', [:string, :string], :int

After adding this line of code to the Platform::POSIX module, we need to update the platform.rba archive to ensure it now includes knowledge of the link(2) system call.

$> rake build:platform
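To make the shape of an attach_function declaration more concrete, here’s a toy model in plain Ruby. It is not how Rubinius implements ffi: ToyFFI, ToyPOSIX, the TYPES table and the block that stands in for the native call are all assumptions made up for this sketch. It only shows the idea of declaring a typed method on a module from a signature.

```ruby
# Toy model of attach_function. NOT the Rubinius implementation, just an
# illustration of declaring a typed binding on a module.
module ToyFFI
  # Hypothetical mapping from type symbols to the Ruby classes we accept.
  TYPES = { :string => String, :int => Integer }

  def attach_function(library, name, params, return_type, &impl)
    signatures[name] = [library, params, return_type]

    define_singleton_method(name) do |*args|
      unless args.size == params.size
        raise ArgumentError, "wrong number of arguments (#{args.size} for #{params.size})"
      end
      args.zip(params).each do |arg, type|
        unless arg.is_a?(TYPES.fetch(type))
          raise TypeError, "expected #{type} but got #{arg.class}"
        end
      end
      impl.call(*args) # a real ffi layer would call the C function here
    end
  end

  # Remembers every declared signature, keyed by method name.
  def signatures
    @signatures ||= {}
  end
end

module ToyPOSIX
  extend ToyFFI

  # Same shape as the line added to kernel/platform/posix.rb. The block is a
  # stand-in for the native call and always reports success.
  attach_function(nil, 'link', [:string, :string], :int) { |_from, _to| 0 }
end

p ToyPOSIX.link("test.txt", "test.lnk")
p ToyPOSIX.signatures['link']
```

The point of the declarative style is that the signature is data: the binding layer can check or convert arguments before handing them to the native function, without any hand written stub code per system call.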

Implementation

Now that we have access to the link(2) system call, we can invoke it via ffi from the File class.

Open up kernel/core/file.rb, and in between two existing methods, enter the following code:

def self.link(from, to)
  Platform::POSIX.link(from, to)
end

As with the platform archive, we’ll need to update the core archive:

$> rake build:core

Let’s re-run our specifications to see if they pass:

$> bin/mspec -f s spec/core/file/link_spec.rb

File.link
- link a file with another 
- raise an exception if the target already exists (ERROR - 1)
- raise an exception if the arguments are of the wrong type or are of the incorrect number


1)
File.link raise an exception if the target already exists FAILED
Expected EEXIST, nothing raised: 
          main.should_raise at ./spec/core/file/../../mspec_helper.rb:27
            main.__script__ at spec/core/file/link_spec.rb:21
                  Proc#call at kernel/core/context.rb:262
              SpecRunner#it at spec/mini_rspec.rb:337
                    main.it at spec/mini_rspec.rb:369
            main.__script__ at spec/core/file/link_spec.rb:24
              main.platform at ./spec/core/file/../../spec_helper.rb:96
            main.__script__ at spec/core/file/link_spec.rb:30
                  Proc#call at kernel/core/context.rb:262
        SpecRunner#describe at spec/mini_rspec.rb:347
              main.describe at spec/mini_rspec.rb:365
            main.__script__ at spec/core/file/link_spec.rb:3
                  main.load at kernel/core/compile.rb:78
       main.__eval_script__ at (eval):8
                 Array#each at kernel/core/array.rb:526
      Integer(Fixnum)#times at kernel/core/integer.rb:19
                 Array#each at kernel/core/array.rb:526
       main.__eval_script__ at (eval):5
    CompiledMethod#activate at kernel/core/compiled_method.rb:110
            Compile.execute at kernel/core/compile.rb:34
            main.__script__ at kernel/loader.rb:170

3 examples, 1 failures
$>

We’re in better shape: two specs are now passing, including the link test, so we’re successfully creating a hard link between two filenames. One spec is still failing in the area of error handling, specifically when the target filename already exists.

Let’s update our File.link implementation appropriately:

def self.link(from, to)
  raise Errno::EEXIST if exists?(to)
  Platform::POSIX.link(from, to)
end

and naturally, rebuild the core:

$> rake build:core

and re-run our specification:

$> bin/mspec -f s spec/core/file/link_spec.rb

File.link
- link a file with another 
- raise an exception if the target already exists
- raise an exception if the arguments are of the wrong type or are of the incorrect number


3 examples, 0 failures

Hooray, all the link specifications pass.

If the specs for File.link are complete (ie. document all areas of File.link’s behaviour), we are ready to submit a patch back to the Rubinius community. Alternatively, if some behaviour is lacking from the specs, we could now iterate through the above process adding a spec to document additional behaviour, and implement it following TDD/BDD practices until all expected behaviour has been added.
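One candidate for such an iteration: the exists? guard checks the target and then links in two separate steps, and any other failure of link(2) (a missing source file, for example) still comes back as a bare -1. A possible alternative is to inspect the return value and raise the matching Errno exception. The sketch below runs on MRI and uses a stand-in module, FakePOSIX, with a made up errno accessor; how the real Platform::POSIX layer exposes errno is not covered in this article, so treat that part as an assumption.

```ruby
# Stand-in for Platform::POSIX so the sketch runs anywhere. The fake file
# table and the errno accessor are hypothetical.
module FakePOSIX
  @names = { "test.txt" => true }

  class << self
    attr_reader :errno

    # Mimics link(2): 0 on success, -1 with errno set on failure.
    def link(from, to)
      if !@names.key?(from)
        @errno = Errno::ENOENT::Errno
        -1
      elsif @names.key?(to)
        @errno = Errno::EEXIST::Errno
        -1
      else
        @names[to] = true
        0
      end
    end
  end
end

# Raises the Errno exception matching the failure instead of returning -1.
def checked_link(from, to)
  result = FakePOSIX.link(from, to)
  # On MRI, SystemCallError.new picks the Errno subclass for the given number.
  raise SystemCallError.new(to, FakePOSIX.errno) if result == -1
  result
end

p checked_link("test.txt", "test.lnk")
```

With this shape a second call for the same target would raise Errno::EEXIST and a missing source would raise Errno::ENOENT, without a separate exists? check up front.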

Patch

To create a patch we can use git and issue the command:

$> git diff > file_link.diff

This will create a patch for us containing the changes we made across the entire Rubinius project. We can then send this back to the community for inclusion into the official Rubinius repository, by submitting it in a Rubinius Lighthouse ticket.

Summary

We’ve stepped through the process of implementing a feature in Rubinius by examining the behaviour of a particular method via its corresponding specification tests. As part of the implementation we’ve added a binding to an underlying operating system call via the ffi layer in Rubinius, and then called upon that binding in the class where the functionality is expected.

We then ensured that all required behaviour, including error conditions, has been met by making sure the spec test suite passes. Finally we’ve created a patch using git that we can submit back to the Rubinius project via Lighthouse.

Implementing a feature in Rubinius can certainly be as straightforward and as easy as what we’ve seen above. There are many specifications that have been written that don’t have corresponding implementations, so pick a class, check its specs, write an implementation and join in on building a fantastic, extendable and awesome Ruby virtual machine! :)

Getting started with Rubinius

posted by crafterm, 05 October 2007

Rubinius is an alternate implementation of the Ruby virtual machine, loosely based on the architecture and implementation of Smalltalk-80.

The primary difference between Rubinius and MRI (Matz’s Ruby Interpreter) is that Rubinius is modeled around the design of a small, light and fast C kernel, with the surrounding language, libraries and classes implemented in the target language, Ruby. MRI, on the other hand, is a larger body of C code.

Rubinius also compiles Ruby classes into byte code before execution and also includes an RSpec test suite that (when complete) documents the Ruby language, core library and Rubinius compiler.

“what can be written in Ruby, will be”

The focus on using Ruby where possible opens the implementation up to a much wider audience of contributors, and I certainly encourage you to take a look and implement a few core library methods or write some RSpec tests. The barrier to entry is quite low: some methods can even be implemented with a single line of code.

The Rubinius team have published several point releases in the past few months, however the latest and greatest version of Rubinius can be retrieved by checking out the project from source code control.

Recently, Rubinius switched source code control from Subversion to Git. In this article we’ll step through the process of checking out Rubinius, building it, and examining the project’s layout. In a future article we’ll look at implementing a simple method to step through the process of building a patch that can be submitted back to the project.

Checking out Rubinius

Since Rubinius is managed by Git, you’ll need to install it for your platform first. The Git home page is http://git.or.cz/, which has the Git source, and also hosts binary packages for several platforms. I personally used Fink to install Git (MacPorts also has a package for it, as do many popular Linux distributions).

Git provides a fully distributed development experience. When you check out a project using Git, you are actually cloning an upstream repository which gives you local access to all history and changes within the project. This means you can work on Rubinius when offline, and your source code control system isn’t limited by network bandwidth or connectivity.

Distributed development using Git often works with developers ‘pulling’ changes from each other (as with the Linux kernel), without there being a central repository where modifications are sent. Rubinius, on the other hand, uses Git in a similar fashion to Subversion, where a central repository hosts the latest changes and all developers ‘pull’ changes from that. To check out the latest source from this central repository, run the following command:

$> git clone git://git.rubini.us/code rubinius

This will print some interesting output while checking out the source. Note that since you’re obtaining a full copy of the repository it will take slightly longer than Subversion, which normally gives you only the latest versions of all source files.

$> time git clone git://git.rubini.us/code rubinius
Initialized empty Git repository in /tmp/rubinius/.git/
remote: Generating pack...
remote: Done counting 24773 objects.
remote: Deltifying 24773 objects...
remote:  100% (24773/24773) done
Indexing 24773 objects...
remote: Total 24773 (delta 15683), reused 22174 (delta 13918)
 100% (24773/24773) done
Resolving 15683 deltas...
 100% (15683/15683) done
Checking 4286 files out...
 100% (4286/4286) done

real    7m8.927s
user    0m5.006s
sys     0m2.964s
$>

Building Rubinius

Before building Rubinius, ensure that you have installed any required dependencies. These are listed in the INSTALL file included in the root Rubinius directory, and currently include:

  • GCC version 4.x http://gcc.gnu.org/
  • GNU Bison http://www.gnu.org/software/bison/
  • gmake (GNU Make) http://savannah.gnu.org/projects/make/
  • pkg-config (configuration tool) http://pkgconfig.freedesktop.org/
  • glib2 version >= 2.10 (Gtk2 base libs) http://www.gtk.org/
  • libtool version >= 1.5 http://www.gnu.org/software/libtool/
  • Ruby version >= 1.8.4 (the Ruby language) http://www.ruby-lang.org/
  • RubyGems (Ruby package manager) http://www.rubygems.org/
  • Git (source control used by rubinius) http://git.or.cz/
  • zip and unzip commands (archiving) http://www.info-zip.org

Once these are installed, building Rubinius is a matter of running ‘configure’ and then ‘make’:

$> cd rubinius
$> ./configure
Rubinius is configured.
$> make
cd shotgun; make rubinius
cd external_libs/libtommath; make
cc -I./ -Wall -W -Wshadow -Wsign-compare -fPIC -O3 -funroll-loops -fomit-frame-pointer   -c -o bncore.o bncore.c
cc -I./ -Wall -W -Wshadow -Wsign-compare -fPIC -O3 -funroll-loops -fomit-frame-pointer   -c -o bn_mp_init.o bn_mp_init.c
cc -I./ -Wall -W -Wshadow -Wsign-compare -fPIC -O3 -funroll-loops -fomit-frame-pointer   -c -o bn_mp_clear.o bn_mp_clear.c
cc -I./ -Wall -W -Wshadow -Wsign-compare -fPIC -O3 -funroll-loops -fomit-frame-pointer   -c -o bn_mp_exch.o bn_mp_exch.c
cc -I./ -Wall -W -Wshadow -Wsign-compare -fPIC -O3 -funroll-loops -fomit-frame-pointer   -c -o bn_mp_grow.o bn_mp_grow.c
cc -I./ -Wall -W -Wshadow -Wsign-compare -fPIC -O3 -funroll-loops -fomit-frame-pointer   -c -o bn_mp_shrink.o bn_mp_shrink.c
cc -I./ -Wall -W -Wshadow -Wsign-compare -fPIC -O3 -funroll-loops -fomit-frame-pointer   -c -o bn_mp_clamp.o bn_mp_clamp.c
cc -I./ -Wall -W -Wshadow -Wsign-compare -fPIC -O3 -funroll-loops -fomit-frame-pointer   -c -o bn_mp_zero.o bn_mp_zero.c
cc -I./ -Wall -W -Wshadow -Wsign-compare -fPIC -O3 -funroll-loops -fomit-frame-pointer   -c -o bn_mp_set.o bn_mp_set.c
...snip...
CC string.o
CC strlcat.o
CC strlcpy.o
CC subtend/PortableUContext.o
CC subtend/ffi.o
CC subtend/handle.o
CC subtend/library.o
CC subtend/nmc.o
CC subtend/nmethod.o
CC subtend/ruby.o
CC subtend/setup.o
CC symbol.o
CC tuple.o
CC var_table.o
CC subtend/PortableUContext_asm.o
LINK librubinius-0.8.0.dylib
gcc -Wall -g -ggdb3  -iquote . -iquote lib `pkg-config glib-2.0 --cflags` -Iexternal_libs/libbstring -Iexternal_libs/libcchash `pkg-config glib-2.0 --cflags`  -c -o main.o main.c
CC rubinius.bin
./shotgun/rubinius compile lib/ext/syck
Cleaning up objects...
Created rbxext.bundle
$>

From here you can run the Rubinius interpreter which is located in the shotgun directory:

$> shotgun/rubinius
sirb(eval):000> 

which will give you an sirb (ie. shotgun irb) prompt. From here you can enter code just as you would in a normal irb session.

You can also run the specs, either individually or as a suite. Rubinius includes a mini-rspec implementation called mspec written in just over a hundred lines of code so that it can self host the full test suite and runner:

$> bin/mspec -f s spec/core/file/link_spec.rb

The parameter ‘-f s’ indicates that specdoc format should be used for spec result output. In this example we’re running the specs associated with the File.link method only.

$> rake spec

will run all known good specs.
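If you’re curious how a spec runner can fit in so little code, here’s a toy sketch in plain Ruby. It is not mspec and doesn’t share its API: ToySpec, expect_equal and the result format are made up for illustration, and only the describe/it/should_raise names echo what the specs use.

```ruby
# A toy spec runner. NOT mspec itself, only an illustration of the style.
class ToySpec
  attr_reader :results

  def initialize
    @results = []
  end

  # Groups the examples that follow under a subject name.
  def describe(subject)
    @subject = subject
    yield
  end

  # Runs one example and records whether it passed or failed.
  def it(description)
    yield
    @results << [@subject, description, :passed]
  rescue StandardError => e
    @results << [@subject, description, :failed, e.message]
  end

  def expect_equal(actual, expected)
    raise "expected #{expected.inspect}, got #{actual.inspect}" unless actual == expected
  end

  # Passes only if the block raises the given exception class.
  def should_raise(klass)
    yield
  rescue klass
    true
  else
    raise "expected #{klass}, nothing raised"
  end
end

spec = ToySpec.new
spec.describe "Integer#+" do
  spec.it("adds two numbers") { spec.expect_equal(1 + 1, 2) }
  spec.it("raises on nil") { spec.should_raise(TypeError) { 1 + nil } }
  spec.it("fails on purpose") { spec.expect_equal(1 + 1, 3) }
end

spec.results.each do |subject, description, status|
  puts "#{subject} - #{description} (#{status})"
end
```

Each example is just a block that either returns normally or raises, which is why a runner like this can stay small enough to run on an interpreter that is still being built.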

Directory structure

Browsing the root level Rubinius directory:

$> ls
AUTHORS    Makefile   Rakefile   compiler   examples   kernel     shotgun    test
INSTALL    README     THANKS     configure  extensions lib        spec
LICENSE    ROADMAP    bin        doc        hashi      runtime    stdlib
$>

The most important directories can be summarised as follows:

  • bin – shell scripts to run mspec, continuous integration, and other tools
  • compiler – rubinius compiler implementation
  • kernel – platform, bootstrap and core language/library implementation
  • runtime – compiled rubinius archives (.rba files) of the compiler, bootstrap and core library
  • shotgun – rubinius C interpreter implementation
  • spec – rspec style test suite
  • stdlib – standard library code imported from Ruby 1.8.

In addition to this there are several miscellaneous files including installation, build and license information.

Generally, most Rubinius development action takes place in the kernel, spec and shotgun directories. Inside the kernel directory you’ll find a subdirectory for the bootstrap, core and platform components of Rubinius. Bootstrap is initial code that Rubinius reads and uses to start running the compiler and interpreter. Core implements the core language of Ruby, and platform provides the binding to the underlying operating system.

Integrating changes

Changes can be made to Rubinius using your favourite text editor. How you compile them depends on where you actually make a change.

Modifications made to the low level C interpreter can be built using the ‘make’ command, while changes made to Ruby files (e.g. in the kernel directory) can be built using one of the following rake commands:

rake build:bootstrap    # Compiles the Rubinius bootstrap archive
rake build:compiler     # Compiles the Rubinius compiler archive
rake build:core         # Compiles the Rubinius core directory
rake build:platform     # Compiles the Rubinius platform archive

(to see all available rake tasks run ‘rake -T’)

These commands will recompile any changes made to the bootstrap, compiler, core and platform source files (located in the kernel/bootstrap, compiler, kernel/core and kernel/platform directories respectively) and update the compiled archives in the runtime directory.

Something to be aware of is that the Rubinius compiler uses the bootstrap and core archives itself, so if you accidentally introduce a defect and break a class such as File, Hash or Array, it’s quite likely the compiler will no longer work, leaving you in a state where you can’t recompile a fix to the breakage. To handle this catch-22 situation when you’re working on critical methods, run the ‘dev:setup’ rake task. ‘dev:setup’ ensures that compilation occurs with pristine copies of the bootstrap, core and platform archives, which will be unaffected in case of an error.

Summary

So far we’ve covered checking Rubinius out from source, building it and running some simple tests, with a brief discussion of the project’s layout. In a future article I’ll walk through implementing a small method to step through the process of creating a patch that can be submitted back to the Rubinius project. In the meantime, feel free to join the #rubinius IRC channel on irc.freenode.net, and read the forums/pages at http://rubini.us/forums.

RejectConf Europe 2007

posted by crafterm, 22 September 2007

RailsConf Europe saw another sensational edition of RejectConf take place. This was my second RejectConf, the first being at this year’s US RailsConf in Portland, and it was a resounding success.

The idea is essentially to provide a platform for anyone to speak about anything related to Ruby/Rails/etc in a forum where presentations are 5 minutes or less (at the US RejectConf the speakers even nominated the length of their talks upfront). In the past, prototypes, ideas, and demos of future applications, gems, plugins and frameworks have been presented that have later turned into valuable projects, and it’s also a fantastic social night with loads of cheering, heckling and fun had by all.

This year I presented a Camping application called GuitarZero that I wrote together with Lachlan Hardy at RailsCamp to handle our high scores for Guitar Hero, which was extremely popular during the camp.

I also recorded two presentations by fellow Australians, John Barton and Dr. Nic, on my MacBook Pro. The recordings turned out well considering they were made with the inbuilt camera/mic.

Australian presenters were particularly prevalent on the night, as you’ll be able to tell from the cheering in the videos (fantastic!), with Ian, Max, myself and John all demoing cool things we’ve been working on.

Enjoy :)