Queueing async_observer tasks from the command line

For one of our projects, we’re using beanstalkd in combination with the async_observer plugin for Rails to run slow tasks in the background.

This is a nice setup. Although we’re not using its more advanced features, beanstalkd is a very powerful and efficient queueing system, and async_observer provides a clean integration with Rails. (On other projects, I’ve happily used delayed_job for similar purposes, but it was not suitable for this project due to its heavy dependency on ActiveRecord and a SQL database. By contrast, async_observer’s support for ActiveRecord was quite easy to rip out.)

However, we would also like to be able to queue async_observer jobs from cron or the command line. Why launch a new process with a copy of the Rails stack when there’s already a long-lived one running? So, I wrote a Ruby script with minimal dependencies which can stick tasks in beanstalkd for our async_observer worker to run.

The Script

Here’s the script I wrote. Stick it in RAILS_ROOT/script/beanstalk-queue.

   1  #!/usr/bin/env ruby
   2  
   3  #
   4  # == Synopsis
   5  #
   6  # This is a replacement for script/runner which avoids loading up the
   7  # rails environment and instead queues things for async_observer to
   8  # do. Note that, unlike script/runner, a filename is not accepted; you
   9  # must provide the code to run on the command-line.
  10  #
  11  # == Usage
  12  #
  13  # beanstalk-queue [OPTION] ... 'CODE'
  14  #
  15  # -d DELAY:
  16  #   amount of delay before the job is run (default 0 seconds)
  17  #
  18  # -e ENVIRONMENT:
  19  #   set the rails environment (falls back to ENV['RAILS_ENV'] or
  20  #   'development' if unset)
  21  #
  22  # -p PRIORITY:
  23  #   set the job priority (default 65536)
  24  #
  25  # -t TTR:
  26  #   set the job's time to run (default 120 seconds)
  27  #
  28  # -v, --verbose:
  29  #   be verbose
  30  #
  31  # -h, --help:
  32  #   display this help
  33  #
  34  # CODE: the ruby code to be called asynchronously
  35  #
  36  
  37  require 'rubygems'
  38  
  39  require 'beanstalk-client'
  40  require 'getoptlong'
  41  require 'rdoc/usage'
  42  require 'yaml'
  43  
  44  RAILS_ROOT = File.expand_path(File.join(File.dirname(__FILE__), ".."))
  45  
  46  opts = GetoptLong.new(
  47    ['--help', '-h', GetoptLong::NO_ARGUMENT],
  48    ['-d', GetoptLong::REQUIRED_ARGUMENT],
  49    ['-e', GetoptLong::REQUIRED_ARGUMENT],
  50    ['-p', GetoptLong::REQUIRED_ARGUMENT],
  51    ['-t', GetoptLong::REQUIRED_ARGUMENT],
  52    ['--verbose', '-v', GetoptLong::NO_ARGUMENT]
  53  )
  54  
  55  pool = Beanstalk::Pool.new(%w(localhost:11300))
  56  
  57  tube = 'default'
  58  delay = 0
  59  environment = ENV['RAILS_ENV'] || 'development'
  60  priority = 65536
  61  ttr = 120
  62  verbose = false
  63  
  64  opts.each do |opt,arg|
  65    case opt
  66    when '--help'
  67      RDoc::usage
  68    when '-d'
  69      delay = arg.to_i
  70    when '-e'
  71      environment = arg
  72    when '-p'
  73      priority = arg.to_i
  74    when '-t'
  75      ttr = arg.to_i
  76    when '--verbose'
  77      verbose = true
  78    end
  79  end
  80  
  81  if ARGV.length != 1
  82    $stderr.puts "missing code to run (try #{File.basename(__FILE__)} --help)\n"
  83    exit 1
  84  end
  85  
  86  code = ARGV[0]
  87  
  88  case environment
  89  when 'production'
  90    if File.exist?(File.join(RAILS_ROOT, "REVISION"))
  91      tube = File.read(File.join(RAILS_ROOT, "REVISION")).strip
  92    end
  93  when 'development'
  94    tube = 'ourproj-development'
  95  end
  96  
  97  puts "using tube #{tube}" if verbose
  98  
  99  pool.connect
 100  pool.use(tube)
 101  job = YAML.dump({:type => :rails, :code => code})
 102  
 103  puts "job:", job if verbose
 104  
 105  job_id = pool.put(job, priority, delay, ttr)
 106  
 107  puts "job #{job_id} sent to #{pool.last_server}" if verbose

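To make the job format concrete, here is a minimal sketch of the payload the script builds on its line 101 and hands to beanstalkd. The task name is just the example used elsewhere in this post, and the read-back step is only a sanity check of the round trip, not a claim about how async_observer itself parses the job. It assumes a Ruby recent enough for YAML.safe_load to accept permitted_classes; on older Rubies, plain YAML.load does the same job.

```ruby
require 'yaml'

# Build the same payload the script creates on its line 101.
code = 'MyModel.some_long_task'
job  = YAML.dump({:type => :rails, :code => code})
puts job

# Read it back to check the round trip. Symbols have to be permitted
# explicitly when using safe_load (assumption: a reasonably recent Ruby).
payload = YAML.safe_load(job, permitted_classes: [Symbol])
puts payload[:code]
```

Running the script with -v prints this same job body, which is handy when checking what actually went into the tube.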

Usage

Because this code does not load Rails or your application, it is unfortunately unable to determine how you have configured your application. Therefore, you must configure the script itself.

First, if you’re not using the default beanstalkd configuration of one server running on localhost:11300, edit line 55 appropriately. Next, you’re probably not using the same beanstalkd tube naming convention as we are, so edit lines 88–95 to properly map Rails environment names to beanstalkd tube names.
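If your tube names are fixed per environment, one possible way to rewrite that mapping is a simple lookup table. This is only a sketch: the TUBES constant, the tube_for helper, and the myapp-* tube names are all made up for illustration and are not part of the script above.

```ruby
# Hypothetical mapping from Rails environment names to beanstalkd tube
# names. Substitute whatever naming convention your workers listen on.
TUBES = {
  'production'  => 'myapp-production',
  'staging'     => 'myapp-staging',
  'development' => 'myapp-development'
}

# Returns the tube for an environment, falling back to the 'default'
# tube when the environment is not listed.
def tube_for(environment)
  TUBES.fetch(environment, 'default')
end

puts tube_for('staging')
puts tube_for('test')
```

With something like this in place, the case statement on lines 88–95 could be reduced to tube = tube_for(environment).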

Full documentation for running the script is at the top of the file, and you can also display it by running script/beanstalk-queue --help. Basically, it works like script/runner with some extra optional parameters for controlling the queueing:

$ ./script/beanstalk-queue -e production 'MyModel.some_long_task'

cron and whenever integration

If you use whenever to manage your crontab file programmatically, here’s a bit of code to define a new beanstalk job type, analogous to the runner job type:

    module ::Whenever
      module Job
        class Beanstalk < Whenever::Job::Default
          def output
            path_required
            %Q(#{File.join(@path, 'script', 'beanstalk-queue')} -e #{@environment} #{task.inspect})
          end
        end
      end

      class JobList
        def beanstalk(task, options = {})
          options.reverse_merge!(:environment => @environment, :path => @path)
          options[:class] = Whenever::Job::Beanstalk
          command(task, options)
        end
      end
    end


I just stuck this code near the top of my config/schedule.rb file, though a possibly cleaner approach would be to put it in its own file and then require that file from schedule.rb.
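To see what the beanstalk job type ends up writing into the crontab, here is a standalone sketch that reproduces only the string-building step of its output method, without loading whenever. The beanstalk_command helper and the /var/www/ourproj path are hypothetical and exist only for this illustration.

```ruby
# Mirrors the string built by the job type's output method.
# (Hypothetical helper; not part of whenever or of the code above.)
def beanstalk_command(path, environment, task)
  %Q(#{File.join(path, 'script', 'beanstalk-queue')} -e #{environment} #{task.inspect})
end

puts beanstalk_command('/var/www/ourproj', 'production', 'MyModel.some_long_task')
```

Note that task.inspect applies Ruby quoting rather than shell quoting: the code is wrapped in double quotes, so a task containing characters the shell expands inside double quotes (such as $ or backticks) would likely need extra care.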

