
CouchFoo: ActiveRecord styled API for CouchDB


Written on February 4, 2009 by george

CouchDB is an excellent database, designed especially for distributed applications. To quote the official site:

Apache CouchDB is a distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API. Among other features, it provides robust, incremental replication with bi-directional conflict detection and resolution, and is queryable and indexable using a table-oriented view engine with JavaScript acting as the default view definition language.

Add the knowledge that it’s written in Erlang and you know it’s going to be a winner in the future.

For one of my current freelance projects I needed to store data in a document fashion, i.e. unstructured. This made CouchDB an ideal candidate. There were several Ruby gems available: CouchPotato, CouchREST, ActiveCouch and RelaxDB. Each offered its own benefits and its own challenges. After hacking with each I couldn’t find a library I was happy with. So I started with ActiveRecord and modified it to work with CouchDB. And so CouchFoo was born.

I ended up with a gem that mirrors ActiveRecord in all but a few minor places. In particular:

  • CouchDB is schema free so property definitions for the document are defined in the model (like DataMapper)
  • :select, :joins, :having, :group, :from and :lock are not available on find or associations as they don’t apply (locking is handled as conflict resolution at insertion time)
  • :conditions can only accept a hash and not an array or SQL. For example :conditions => {:user_name => "Georgio_1999"}
  • :offset is less efficient in CouchDB – there’s more on this in the rdoc
  • :order is applied after results are retrieved from the database. Therefore :order cannot be used with :limit without a new option :use_key. This is explained fully in the quick start guide and CouchFoo#find documentation
  • :include isn’t implemented yet but the finders and associations still accept the option so you won’t need to make any code changes
  • By default results are ordered by document key. The key uses a UUID scheme so these don’t auto-increment and are likely to come out in a different order to insertion. default_sort can be used on a model to sort by create date by default and overcome this
  • validates_uniqueness_of has had the :case_sensitive option removed
  • Because there’s no SQL there are no SQL finder methods
  • Timezones, aggregations and fixtures are not yet implemented
  • The price of index updating is paid when next accessing the index rather than the point of insertion. This can be more efficient or less depending on your application. It may make sense to use an external process to do the updating for you – see CouchFoo#find for more on this
  • On that note, occasional compacting of CouchDB is required to recover space from old versions of documents and keep performance high. This can be kicked off in several ways (see quick start guide)
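
The point in the list above about paying for index updates at read time can be pictured with a small plain-Ruby model. This is only a sketch of the idea, not CouchFoo or CouchDB code: LazyIndex and its methods are hypothetical names made up for the illustration.

```ruby
# Hypothetical model of a lazily updated index (not the CouchFoo API).
class LazyIndex
  attr_reader :updates

  def initialize(&key_for)
    @key_for = key_for # block that extracts the index key from a document
    @pending = []      # documents written since the last index update
    @entries = []      # [key, doc] pairs, kept sorted by key
    @updates = 0       # how many reads had to do index work first
  end

  # Writes are cheap: the document is only queued.
  def insert(doc)
    @pending << doc
    self
  end

  # Reads pay for any outstanding index work before returning results.
  def query
    update! unless @pending.empty?
    @entries.map { |_key, doc| doc }
  end

  private

  def update!
    @pending.each { |doc| @entries << [@key_for.call(doc), doc] }
    @pending.clear
    @entries.sort_by! { |key, _doc| key }
    @updates += 1
  end
end

index = LazyIndex.new { |doc| doc[:street] }
index.insert(street: "My Street").insert(street: "Another Street")
streets = index.query.map { |doc| doc[:street] }
```

The two inserts cost nothing on the index side; the first query does the sorting work for both, and a second query with no new writes does none. An external process that calls the query periodically would move that cost away from user requests.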

The RDoc for the gem contains more details on each of these differences, new features that I added, a quick start guide and additional areas of responsibility to think about when using CouchDB (in particular performance).

As a quick overview, basic operations are the same as ActiveRecord:

class Address < CouchFoo::Base
  property :number, Integer
  property :street, String
  property :postcode # Any generic type is fine as long as .to_json can be called on it
end
address1 = Address.create(:number => 3, :street => "My Street", :postcode => "secret") # Create address
address2 = Address.create(:number => 27, :street => "Another Street", :postcode => "secret")
Address.all # = [address1, address2] or maybe [address2, address1] depending on key generation
Address.first    # = address1 or address2 depending on keys so probably isn't as expected
Address.find_by_street("My Street") # = address1

As key generation is through a UUID scheme, the order can’t be predicted. However you can order the results by default:

class Address < CouchFoo::Base
  property :number, Integer
  property :street, String
  property :postcode # Any generic type is fine as long as .to_json can be called on it
  property :created_at, DateTime

  default_sort :created_at
end
Address.all # = [address1, address2]
Address.first    # = address1 or address2, sorting is applied after results
Address.first(:use_key => :created_at) # = address1 but at the price of creating a new index

Note that there’s an optimisation that orders results by created_at when there are no conditions, so in the above case the default_sort wasn’t required. However, it is required when conditions are used, so it makes sense to declare it at all times.
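
To see why sorting after retrieval doesn’t mix with a limit, here is a plain-Ruby sketch. It doesn’t touch CouchFoo at all: the hashes stand in for documents and the short key strings are made-up placeholders for CouchDB UUIDs.

```ruby
# Plain-Ruby illustration; the :key values stand in for CouchDB UUIDs.
docs = [
  { key: "9f2c", street: "My Street",      created_at: 1 },
  { key: "1a7b", street: "Another Street", created_at: 2 }
]

# Without :use_key the limit is applied while walking the default index
# (document key); sorting only happens afterwards on whatever came back.
by_key          = docs.sort_by { |doc| doc[:key] }
limited         = by_key.first(1)
sorted_too_late = limited.sort_by { |doc| doc[:created_at] }

# With an index keyed on created_at the limit is applied in the wanted order.
by_created_at = docs.sort_by { |doc| doc[:created_at] }
with_use_key  = by_created_at.first(1)
```

Here sorted_too_late holds the newer address, because the limit had already thrown the older one away, while with_use_key holds the address that was created first.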

Conditions work slightly differently:

Address.find(:all, :conditions => {:street => "My Street"}) # = [address1], creates index on :street
Address.find(:all, :conditions => {:created_at => "sometime"}) # Uses same index as :use_key => :created_at
Address.find(:all, :use_key => :street, :startkey => 'p') # All streets from p in alphabet, reuses the index created 2 lines up
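
A :startkey query is essentially "jump to a position in a sorted index and read forwards". The sketch below models that in plain Ruby with a sorted array and a binary search. The index contents and the from_startkey helper are hypothetical, and Ruby compares strings byte by byte, which is not the same as CouchDB’s own collation, so treat it as a picture of the idea only.

```ruby
# Hypothetical stand-in for an index on :street: [key, document id] pairs sorted by key.
street_index = [
  ["Another Street", "doc-2"],
  ["My Street",      "doc-1"],
  ["Park Lane",      "doc-3"],
  ["Queen Street",   "doc-4"]
].sort_by { |key, _id| key }

# Return the ids of all entries whose key is >= startkey, in key order.
def from_startkey(index, startkey)
  first = index.bsearch_index { |key, _id| key >= startkey }
  first ? index[first..-1].map { |_key, id| id } : []
end

from_p = from_startkey(street_index, "P")
none   = from_startkey(street_index, "Z")
```

Because the index is already sorted, finding the starting point is cheap and no document outside the range needs to be looked at.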

As well as providing support for people using relational databases, CouchFoo attempts to provide a library for those wanting to use CouchDB as a document-orientated database:

class Document < CouchFoo::Base
  property :number, Integer
  property :street, String

  view :number_ordered, "function(doc) {emit([doc.number , doc.street], doc); }", nil, :descending => true
end
Document.number_ordered(:limit => 75) # Will get the last 75 documents in the database ordered by number, street attributes
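
For anyone new to views, this is roughly what the map function, :descending and :limit do together, modelled in plain Ruby. The query_view helper and the sample documents are invented for the illustration, and Ruby’s array comparison only approximates how CouchDB orders composite keys.

```ruby
docs = [
  { number: 3,  street: "My Street" },
  { number: 27, street: "Another Street" },
  { number: 27, street: "Zebra Road" }
]

# Map step: one [key, value] row per document, like emit([doc.number, doc.street], doc).
rows = docs.map { |doc| [[doc[:number], doc[:street]], doc] }

# Rows are ordered by key; descending reverses that order and limit caps the result.
def query_view(rows, descending: false, limit: nil)
  ordered = rows.sort_by { |key, _value| key }
  ordered = ordered.reverse if descending
  ordered = ordered.first(limit) if limit
  ordered.map { |_key, value| value }
end

top_two = query_view(rows, descending: true, limit: 2)
```

The composite key means documents are ordered by number first and by street only when the numbers tie, so top_two contains the two documents with number 27.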

Associations work as expected but you must remember to add the properties required for an association (we’ll make this automatic soon):

class House < CouchFoo::Base
  has_many :windows
end

class Window < CouchFoo::Base
  property :house_id, String
  belongs_to :house
end
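
The reason the house_id property has to exist is that both sides of the association are lookups on it. A plain-Ruby sketch of that idea follows; the structs and helper methods are hypothetical and are not how CouchFoo implements associations internally.

```ruby
# Hypothetical plain-Ruby records; CouchFoo's real association code is not shown here.
House  = Struct.new(:id)
Window = Struct.new(:id, :house_id)

# has_many :windows boils down to "windows whose house_id matches my id".
def windows_for(house, windows)
  windows.select { |window| window.house_id == house.id }
end

# belongs_to :house boils down to "the house whose id matches my house_id".
def house_for(window, houses)
  houses.find { |house| house.id == window.house_id }
end

houses  = [House.new("h1"), House.new("h2")]
windows = [Window.new("w1", "h1"), Window.new("w2", "h1"), Window.new("w3", "h2")]

first_house_windows = windows_for(houses.first, windows).map(&:id)
owner_of_last       = house_for(windows.last, houses).id
```

Without a stored house_id there would be nothing for either lookup to match on, which is why the property needs declaring by hand for now.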

There are a few bits left to tidy up (as noted in the readme) but generally speaking it’s now ready for use by others. Grab it on github and feel free to fork and send me pull requests.

And now to do something I’ve not been doing a lot of lately: spend some more time on the Couch…


Rails with Datamapper


Written on January 8, 2009 by george

With the recent announcement that Rails and Merb will merge, and given my preference for DataMapper, I decided to plug DataMapper into Rails for my next freelance project. The theory goes that this should make the upgrade path to Rails 3 a lot simpler!

It’s currently possible to use DataMapper with Rails, heck even DHH himself commented so, but it’s not quite as easy as using ActiveRecord. After a quick Google I only ran into questions about how to do it and no how-to guide. So I set out to make my own, and it really was quite simple in the end:

sudo gem install addressable data_objects do_mysql
# do_mysql can be changed for do_postgres or do_sqlite3 as appropriate
sudo gem install dm-core dm-more

In the dm-more github repository there’s a folder called rails_datamapper, which is a Rails plugin that adds DataMapper support. It isn’t installed with the dm-more gem, so you need to clone the git repository and copy the folder into your Rails project:

git clone git://github.com/sam/dm-more.git
# Run from the root of your rails project so the plugin lands in its vendor/plugins folder
cp -R dm-more/rails_datamapper vendor/plugins/

Then edit your project environment.rb file and add the following lines:

# Load the required gems in the correct order
config.gem "addressable", :lib => "addressable/uri"
config.gem "data_objects"
config.gem "do_mysql"
config.gem "dm-core"

# Make datamapper load first as some plugins have dependencies on it
config.plugins = [ :rails_datamapper, :all ]

# Remove ActiveRecord if you no longer need it
config.frameworks -= [ :active_record ]

The connection to the database will be made by the rails_datamapper plugin using your database.yml configuration file. You’ll need to use a slightly different format for datamapper:

development:
:repositories:
:adapter: mysql
:database: opnli_dev

Alternatively you can write your own initializer and forgo the rails plugin:

hash = YAML.load(File.new(RAILS_ROOT + "/config/database.yml"))
DataMapper.setup(:default, hash[RAILS_ENV])
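
If it isn’t obvious what that initializer hands to DataMapper.setup, the sketch below walks through the YAML half on its own, without Rails or DataMapper loaded. The file contents, database names and environment value are made up, and I’ve used plain string keys here; whether your DataMapper version wants string or symbol keys is something to check rather than take from this example.

```ruby
require "yaml"

# Hypothetical database.yml contents; the names are invented for illustration.
database_yml = <<~CONFIG
  development:
    adapter: mysql
    database: app_dev
  production:
    adapter: mysql
    database: app_prod
CONFIG

rails_env = "development"                  # stands in for RAILS_ENV
settings  = YAML.safe_load(database_yml)   # Hash keyed by environment name
options   = settings.fetch(rails_env)      # raises KeyError if the environment is missing
```

The options hash for the current environment is what would be passed as the second argument to DataMapper.setup in the initializer above.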

The only real gotcha in using datamapper is some rails plugins assume you’re using ActiveRecord. Hopefully this won’t be the case in the future, but for now you’ll need to get forking!


Git


Written on December 21, 2008 by george

For those readers of my blog who don’t live in the rails world I highly recommend checking out Git, a distributed version control system. It has been big in the rails world since early this year for several good reasons:

  • It has distributed and offline functionality
  • Making and merging branches is a breeze – encouraging you to try experiments in branches
  • It uses much less space than alternatives, such as Subversion, and only has one .git folder at the base of your project
  • It’s in active development with constant releases of new features (but stable enough to be used for the linux kernel)

The terminology is slightly different from subversion and friends but once you’ve got used to it you never look back!

Merb was very quick to jump on the git bandwagon and rails followed not much later. Practically this made distributed development a hell of a lot easier, but it also had some nice knock-on effects. Patching is now a lot quicker too: you simply fork the project, make a fix and inform the admin, who can then choose to merge it back into master (if they see fit). It’s made the process for fixing bugs a hell of a lot quicker.

Soon after git came along the fantastic github.com followed, making it easy to host remote repositories. And so to the reason for me writing this post: github just launched GitHub Pages, where you can upload your own page to front your repositories. It’s a neat idea and naturally is all managed through a git repository. You simply create your site in a repository, push to github and the deployment is automatic. Although it’s only simple HTML pages, it’s a great proof of concept of other things that could be possible. My effort can be found here which, following the git ethos, I just forked from somebody else.


Ruby Manor


Written on November 23, 2008 by george

Yesterday I spoke at and attended Ruby Manor. It was a grassroots conference with the attendees determining the agenda and the organising duo aiming to keep costs to an absolute minimum. So successful were they in fact, as one twitterer aptly points out, that for an amazing £12 you got a series of excellent talks, no annoying sponsorship and £500 behind the bar at the end of the night. Given even the cheapest of conferences aimed at freelancers clocks in well over the £100 mark, this really was a fantastic achievement.

On a personal note I presented on nanite, which is a background processing solution for rails and merb. It was quite a technical talk and I was trying to get a lot into the 30-minute timeslot, so it felt a little rushed. Nevertheless most people seemed to grasp the concept and the live demo at the end went really well. For those that missed the event I’m sure the videos will be online soon, but in the meantime there’s excellent coverage on this blog and the slides from my presentation are available on slideshare.


New plugins


Written on September 26, 2008 by george

I’ve just pushed two plugins to github. The first is an improvement on the standard Defensio plugin that only checks the validity of your API key when posting articles or comments. This is better than checking each time a model that uses the plugin is instantiated as it doesn’t require contact with the Defensio API (so is faster) and also won’t bring your site to a standstill if someone is just viewing a page and the Defensio service is down.
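
The difference between checking the key on every instantiation and checking it only when posting can be sketched in a few lines of plain Ruby. None of this is the plugin’s real code or the Defensio API: FakeSpamService, Comment and their methods are hypothetical names used to show the shape of the change.

```ruby
# Hypothetical stand-in for a remote spam-filtering service; not the Defensio API.
class FakeSpamService
  attr_reader :validation_calls

  def initialize(valid_key)
    @valid_key = valid_key
    @validation_calls = 0
  end

  def valid_key?(key)
    @validation_calls += 1 # each call here would be a network round trip
    key == @valid_key
  end
end

class Comment
  def initialize(body, service:, api_key:)
    # No remote call here, so building or viewing a comment never touches the service.
    @body    = body
    @service = service
    @api_key = api_key
  end

  # The key is checked only at the point where the service is actually needed.
  def post
    raise ArgumentError, "invalid API key" unless @service.valid_key?(@api_key)
    :posted
  end
end

service  = FakeSpamService.new("secret")
comments = 3.times.map { |i| Comment.new("comment #{i}", service: service, api_key: "secret") }
calls_after_viewing = service.validation_calls
result = comments.first.post
```

Instantiating three comments makes no validation calls at all; the single call happens when one of them is posted, so an outage of the remote service only affects posting.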

The second updates the highly useful timed fragment cache plugin by Richard Livsey to support Rails 2.1.
