• Alex gave a good introductory talk on CouchDB at Scotland on Rails. Towards the end of the talk he gave an overview of the Ruby plugins and gems currently available for interfacing with CouchDB, one of which was my own CouchFoo. Alex’s opinion was that any ORM for CouchDB should be as thin as possible, wrapping just the Ruby to JSON object translation. I raised my opinion in the question section at the end, saying that I didn’t agree and thought the ORM should match the level of functionality available in ActiveRecord. This sparked a debate, both in the talk and via Twitter, about the best approach for a CouchDB ORM to take. As a result I agreed to write this blog post to outline my views.

    CouchDB is a document-orientated database with an HTTP interface, amongst other features. When I first started using it I played with the database a lot via simple interactions through curl. In the same way that I feel it is important to know SQL before using any higher level API to store and retrieve objects in a relational database, I feel it is important to understand how CouchDB works before using a library to interact with it. As with most areas of computing you will find a range of opinions over what level you should interact with the database at - there are the purists who like to hand-write the SQL for every query, and those who are willing to sacrifice a bit of performance (maybe not having the optimum query run each time) for the time saved whilst developing. I align quite well with the Rails mantra on this one - I’m willing to sacrifice perfect SQL each time for the efficiency gains made whilst developing. Part of Alex’s argument was that you should be as close to the database as possible because the Ruby to JSON conversion is a much smaller step than the Ruby to SQL conversion. Whilst I agree that it’s important to know how CouchDB works, I disagree on the level at which a Ruby library should sit. I’m happy to pay a small price in terms of extra Ruby code executed because I want as clean a DSL as possible.
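
To give a feel for how close to the wire those curl experiments are, here is a rough sketch of the same kind of interaction from plain Ruby. The server address, the `addresses` database and the `home` document id are all made up for illustration, and nothing is sent unless the (also made up) `COUCHDB_LIVE` environment variable is set.

```ruby
require "net/http"
require "json"
require "uri"

# Hypothetical local server, database and document id.
base = URI("http://127.0.0.1:5984")
doc  = { "type" => "address", "number" => 3, "street" => "My Street" }

# Storing a document is a PUT of its JSON body to /<database>/<id>.
put = Net::HTTP::Put.new("/addresses/home", "Content-Type" => "application/json")
put.body = JSON.generate(doc)

# Fetching it back is a GET on the same path.
get = Net::HTTP::Get.new("/addresses/home")

# Only talk to a server when COUCHDB_LIVE is set; otherwise just build the requests.
if ENV["COUCHDB_LIVE"]
  Net::HTTP.start(base.host, base.port) do |http|
    puts http.request(put).body
    puts http.request(get).body
  end
end
```

That really is all a "thin" library has to wrap, which is why the argument is about how much should sit on top of it.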

    Whilst developing with CouchDB I tried all the existing Ruby libraries, and as I worked through them I ran into several issues. After using ActiveRecord’s save and find methods it was particularly annoying to use a library that used different method names for the same conceptual operations (eg get instead of find). This wasn’t a major issue of course; I just forked the library and made changes. But as time went on there were features that I missed from ActiveRecord. Validations, callbacks, finders and associations were the prime contenders. Then dynamic finders and named scopes got added to the list. In the end, changing the existing libraries became so much work that I decided to start with ActiveRecord and work from there.

    Of the features in ActiveRecord, associations are perhaps the most controversial when it comes to whether they should apply to document-orientated databases. The argument goes that if you’re trying to use associations you don’t understand how CouchDB should be used. I disagree on this point - a simple counter-argument is a document that allows comments. Those comments could be stored inline in the document itself or in separate documents that have a reference to their parent. This is an association whichever way you look at it. Which approach you decide to use will depend on your application and its characteristics. Incidentally Alex’s gem did a great job of this, letting the user specify in the association whether they wanted the object stored inline or not. This has since been removed from his gem but is something that’s definitely on the TODO list for CouchFoo.
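
To make the comments example concrete, here is a small sketch of the two layouts as plain Ruby hashes (the shape they would have once converted to JSON). The field names, ids and comment text are invented for illustration and are not taken from any particular gem.

```ruby
require "json"

# Option 1: comments stored inline, inside the parent document.
inline_post = {
  "_id"      => "post-1",
  "title"    => "ORMs for CouchDB",
  "comments" => [
    { "author" => "alex",   "body" => "Keep it thin." },
    { "author" => "george", "body" => "I like associations." }
  ]
}

# Option 2: each comment is its own document pointing back at its parent.
post = { "_id" => "post-1", "title" => "ORMs for CouchDB" }
comment_docs = [
  { "_id" => "comment-1", "post_id" => "post-1", "author" => "alex",   "body" => "Keep it thin." },
  { "_id" => "comment-2", "post_id" => "post-1", "author" => "george", "body" => "I like associations." }
]

# Either way the application asks the same question:
# "which comments belong to this post?"
inline_comments   = inline_post["comments"]
separate_comments = comment_docs.select { |c| c["post_id"] == post["_id"] }

puts JSON.pretty_generate(inline_post)
puts inline_comments.map { |c| c["author"] } == separate_comments.map { |c| c["author"] }
```

The storage differs, but the relationship the application cares about is the same in both cases.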

    For me CouchDB lends itself well to two distinct domains. Firstly, domains where documents are used - that is, objects where the fields stored in the database change from object to object. Secondly, domains where you wish to take advantage of some of CouchDB’s features that are not present (or poorly implemented) in relational databases - an HTTP interface, fantastic scaling ability due to bi-directional replication, and its schema-free nature (see this excellent article on FriendFeed’s experience with MySQL) are just a few that spring to mind. People may use CouchDB for the second set of reasons even though their database design could be considered quite structured, and I fully expect this group of people to grow as CouchDB reaches 1.0. However that wasn’t why I wrote CouchFoo; my project fell into the first domain. Whilst I provided a way to use ActiveRecord’s higher level API, I also provided access to a database object that allows simple storage and retrieval of documents by id. If that is all the functionality you require then I would expect CouchREST to be a better choice. However I believe that in reality you will quickly find you need to add validations to a field, or maybe add an association or two. And as soon as you start on that slope I believe CouchFoo to be a better choice.

    Ultimately I created CouchFoo because I missed the richness of the ActiveRecord API. Whilst I don’t believe my library will be perfect for everyone, it has received a lot of good feedback. To paraphrase DHH, I didn’t create the perfect framework for everyone else, I created it for me. I only hope that other people find it useful.


  • Thanks to a lucky draw at dConstruct last year I bagged two free tickets to this year’s SXSW. I decided to invite Jim along for no other reason than that he was likely to be the closest to the event. I’d never been to Texas before and, despite hearing many bad reports, word was that Austin really wasn’t quite as bad.

    And what a surprise it was - a laid back city with fairly liberal attitudes. Once I got over the English-American language barrier (swap line for queue, register for till and give me for can I have) things seemed to go well. The line up of talks was amazing - Gary Vaynerchuk was awesome although sadly I only caught the last 20 minutes (good video of him here at FOWA), Brian Brushwood did an excellent talk based on his Scam School series, James Powderly gave a fascinating talk on his graffiti art and getting detained in China, and there was an extremely useful panel on how to give good presentations. That’s one of the parts I enjoyed the most - the sheer diversity of talks. In addition there were more informal talks where the presenter spoke for 10 minutes before opening up to the room - going freelance and becoming productive were two of my favourites in this format. Of course due to the sheer volume of talks many good ones were missed - Larry Lessig seems the prime candidate here. They’re going to make all the talks available for download so I’m looking forward to catching what I missed.

    The talks are only half of Southby though. The night life is great and there are loads of parties with free drink and beer flowing. These seemed quite hit and miss, with the Digg party being awful and the queue to get a signature from Kevin Rose a really quite distressing sight. But for every flop there were some good ones that provided entertainment as wide ranging as burlesque and live Photoshop drawing.

    I met plenty of interesting new people, and bumped into quite a few from England, although I know of at least two out there who I didn’t bump into all week. Other highlights included the weather, free wifi everywhere, a film called Burma VJ that we randomly caught, and England destroying France in the rugby. More random things included drive-through banks and a gig featuring a hip-hop group I strangely enjoyed. And the downers? Well I can’t finish without having a dig at just how awful the all-American diet is (surprisingly I didn’t want my meal in a sea of melted cheese but gee thanks). Overall though a great experience and well worth it if you’ve never made the trip.


  • ActiveRecord allows you to serialize objects into text columns through YAML. This seems useful but in my experience is under-used. One of the primary reasons for this is that it’s not possible to use the data the object encapsulates without the Ruby model. For example it’s not possible to find on the contents of that object or, for that matter, to modify the object from languages that lack YAML support. With CouchDB all data is stored as JSON so this is not an issue.

    The project I wrote CouchFoo for used complex ACLs and I wanted to encapsulate all of this in an object rather than use several many-to-many relationships and construct an ACL object based on their contents. So how do you do this with CouchFoo? Simple: any object can be assigned as a property in a CouchFoo model as long as it has a .to_json method and a class-level .from_json method. The methods do what you’d expect, for example:

    
    class DataObjectAttributeList
    
      attr_accessor :attributes
    
      # Constructs the object from JSON
      def self.from_json(json)
        DataObjectAttributeList.new(json)
      end
    
      # Converts the object to JSON
      def to_json
        @attributes
      end
    
      def initialize(initials = {}, *args)
        @attributes = initials
      end
    end
    

    This is just a simple example storing a hash but the structure could be as complex as you’d like. In the future I plan to add inline associations to CouchFoo, so rather than have a one-to-many association where the many are accessed via a second database query you could have the objects stored as part of the parent contents. Performance wise, this is normally much more efficient (although not in all situations - eg heavy write and low read).
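
As a slightly richer sketch of the same idea, here is a hypothetical ACL object going through a JSON round trip. `AccessControlList`, its rules and the document layout are invented for illustration, and the round trip uses Ruby's standard json library directly rather than CouchFoo, so treat it as a picture of the pattern and not of the gem's internals. Unlike the example above, `to_json` here returns a JSON string so that the standard library can serialise the object when it is nested in a document.

```ruby
require "json"

# Hypothetical ACL value object; not part of CouchFoo.
class AccessControlList
  attr_reader :rules

  def initialize(rules = {})
    @rules = rules
  end

  # Rebuilds the object from the already-parsed JSON structure.
  def self.from_json(json)
    new(json)
  end

  # Serialises the rules; called by JSON.generate when the object is nested.
  def to_json(*args)
    @rules.to_json(*args)
  end

  def allowed?(user, action)
    Array(@rules[user]).include?(action)
  end
end

acl = AccessControlList.new("george" => ["read", "write"], "alex" => ["read"])

# What would be written to the database: the ACL is just part of the document.
stored = JSON.generate("_id" => "doc-1", "acl" => acl)

# What any client (Ruby or not) gets back: plain JSON it can query or modify.
loaded   = JSON.parse(stored)
restored = AccessControlList.from_json(loaded["acl"])

puts restored.allowed?("george", "write") # => true
puts restored.allowed?("alex", "write")   # => false
```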

    Overall, this becomes a very addictive way of developing and in the same way you start to question whether you need a relational database, you start to question whether you should store associated objects inline or separately.


  • CouchDB is an excellent database, designed especially for distributed applications. To quote the official site:

    Apache CouchDB is a distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API. Among other features, it provides robust, incremental replication with bi-directional conflict detection and resolution, and is queryable and indexable using a table-oriented view engine with JavaScript acting as the default view definition language.

    Along with the knowledge that it’s written in Erlang, you know it’s going to be a winner in the future.

    For one of my current freelance projects I needed to store data in a document fashion - ie unstructured. This made CouchDB an ideal candidate. There were several Ruby gems available: CouchPotato, CouchREST, ActiveCouch and RelaxDB. Each offered its own benefits and its own challenges. After hacking with each I couldn’t find a library I was happy with. So I started with ActiveRecord and modified it to work with CouchDB. And so CouchFoo was born.

    I ended up with a gem that mirrors ActiveRecord in all but a few minor places. In particular:

    • CouchDB is schema free so property definitions for the document are defined in the model (like DataMapper)
    • :select, :joins, :having, :group, :from and :lock are not available on find or associations as they don’t apply (locking is handled as conflict resolution at insertion time)
    • :conditions can only accept a hash and not an array or SQL. For example :conditions => {:user_name => "Georgio_1999"}
    • :offset is less efficient in CouchDB - there’s more on this in the rdoc
    • :order is applied after results are retrieved from the database. Therefore :order cannot be used with :limit without a new option :use_key. This is explained fully in the quick start guide and CouchFoo#find documentation
    • :include isn’t implemented yet but the finders and associations still accept the option so you won’t need to make any code changes
    • By default results are ordered by document key. The key uses a UUID scheme so these don’t auto-increment and are likely to come out in a different order to insertion. default_sort can be used on a model to sort by create date by default and overcome this
    • validates_uniqueness_of has had the :case_sensitive option removed
    • Because there’s no SQL there are no SQL finder methods
    • Timezones, aggregations and fixtures are not yet implemented
    • The price of index updating is paid when next accessing the index rather than the point of insertion. This can be more efficient or less depending on your application. It may make sense to use an external process to do the updating for you - see CouchFoo#find for more on this
    • On that note, occasional compacting of CouchDB is required to recover space from old versions of documents and keep performance high. This can be kicked off in several ways (see quick start guide)

    The RDoc for the gem contains more details on each of these differences, new features that I added, a quick start guide and additional areas of responsibility to think about when using CouchDB (in particular performance).
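
The :order and :limit point is the one most likely to catch out people coming from SQL, so here is a tiny plain-Ruby illustration of why the two don't combine when ordering happens after retrieval. The rows and names are made up, and no CouchDB or CouchFoo code is involved; it only mimics "limit in the database, sort in Ruby".

```ruby
# Stand-in for documents as the database would return them: ordered by key, not by name.
rows = [
  { "id" => "a1", "name" => "Zoe" },
  { "id" => "b2", "name" => "Adam" },
  { "id" => "c3", "name" => "Mia" },
  { "id" => "d4", "name" => "Bob" }
]

# :limit applied by the database, :order applied afterwards in Ruby.
limit_then_order = rows.first(2).sort_by { |r| r["name"] }

# What a SQL user expects from ORDER BY name LIMIT 2.
order_then_limit = rows.sort_by { |r| r["name"] }.first(2)

p limit_then_order.map { |r| r["name"] } # => ["Adam", "Zoe"]
p order_then_limit.map { |r| r["name"] } # => ["Adam", "Bob"]
```

Getting the second result efficiently needs an index that is already sorted by the field in question, which is the gap the :use_key option is there to fill.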

    As a quick overview, basic operations are the same as ActiveRecord:

    
    class Address < CouchFoo::Base
      property :number, Integer
      property :street, String
      property :postcode # Any generic type is fine as long as .to_json can be called on it
    end


    address1 = Address.create(:number => 3, :street => "My Street", :postcode => "secret") # Create address
    address2 = Address.create(:number => 27, :street => "Another Street", :postcode => "secret")
    Address.all # = [address1, address2] or maybe [address2, address1] depending on key generation
    Address.first    # = address1 or address2 depending on keys so probably isn't as expected
    Address.find_by_street("My Street") # = address1
    

    As key generation is through a UUID scheme, the order can’t be predicted. However you can order the results by default:

    
    class Address < CouchFoo::Base
      property :number, Integer
      property :street, String
      property :postcode # Any generic type is fine as long as .to_json can be called on it
      property :created_at, DateTime

      default_sort :created_at
    end


    Address.all # = [address1, address2]
    Address.first    # = address1 or address2, sorting is applied after results
    Address.first(:use_key => :created_at) # = address1 but at the price of creating a new index
    

    Note that there’s an optimisation that will order results by created_at if there are no conditions so in the above case, the default_sort wasn’t required. However when using with conditions it will be required so it makes sense to use at all times.
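
If the key ordering issue seems abstract, this plain-Ruby sketch shows the effect using random UUIDs from the standard library. `Record` and the street names are illustrative only, and nothing here touches CouchDB; it just contrasts sorting by key with sorting by creation time.

```ruby
require "securerandom"

# Illustrative stand-in for a stored document.
Record = Struct.new(:id, :street, :created_at)

start = Time.at(0)
inserted = ["My Street", "Another Street", "Third Street"].each_with_index.map do |street, i|
  Record.new(SecureRandom.uuid, street, start + i)
end

# Ordering by key follows the random UUIDs, so it usually differs from insertion order.
by_key = inserted.sort_by(&:id)

# Ordering by created_at always recovers insertion order.
by_created_at = inserted.sort_by(&:created_at)

p by_key.map(&:street)
p by_created_at.map(&:street)
```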

    Conditions work slightly differently:

    
    Address.find(:all, :conditions => {:street => "My Street"}) # = [address1], creates index on :street
    Address.find(:all, :conditions => {:created_at => "sometime"}) # Uses same index as :use_key => :created_at
    Address.find(:all, :use_key => :street, :startkey => 'p') # All streets from p in alphabet, reuses the index created 2 lines up
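
To picture why an exact-match condition and a start-key query can share one index, here is a rough plain-Ruby model of an index as a sorted list of key and document id pairs. The data and the helper names `exact` and `from_key` are invented for illustration, and plain Ruby string comparison stands in for the database's own collation rules.

```ruby
# Stand-in for an index on :street: [key, document id] pairs kept sorted by key.
street_index = [
  ["Another Street", "doc-2"],
  ["My Street",      "doc-1"],
  ["Park Lane",      "doc-4"],
  ["Queen's Road",   "doc-3"]
].sort_by(&:first)

# Exact match on a key, the kind of lookup a :conditions hash needs.
def exact(index, key)
  index.select { |k, _| k == key }.map(&:last)
end

# Everything from a start key onwards, the kind of lookup :startkey needs.
def from_key(index, startkey)
  index.drop_while { |k, _| k < startkey }.map(&:last)
end

p exact(street_index, "My Street") # => ["doc-1"]
p from_key(street_index, "P")      # => ["doc-4", "doc-3"]
```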
    

    As well as providing support for people coming from relational databases, CouchFoo attempts to provide a library for those wanting to use CouchDB as a document-orientated database:

    
    class Document < CouchFoo::Base
      property :number, Integer
      property :street, String

      view :number_ordered, "function(doc) {emit([doc.number , doc.street], doc); }", nil, :descending => true
    end


    Document.number_ordered(:limit => 75) # Will get the last 75 documents in the database ordered by number, street attributes
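
For anyone new to views, here is a plain-Ruby approximation of what that map function and its options amount to. The documents and the `query` helper are made up for illustration; the real work happens inside CouchDB's JavaScript view engine, not in Ruby.

```ruby
docs = [
  { "number" => 3,  "street" => "My Street" },
  { "number" => 27, "street" => "Another Street" },
  { "number" => 3,  "street" => "Another Street" }
]

# Ruby stand-in for the map function: emit([doc.number, doc.street], doc)
rows = docs.map { |doc| [[doc["number"], doc["street"]], doc] }

# Rows are ordered by key; descending reverses them and limit caps the count.
def query(rows, descending: false, limit: nil)
  ordered = rows.sort_by(&:first)
  ordered = ordered.reverse if descending
  limit ? ordered.first(limit) : ordered
end

top = query(rows, descending: true, limit: 2)
p top.map(&:first) # => [[27, "Another Street"], [3, "My Street"]]
```

Because the key is an array, rows sort by number first and by street second, which is why emitting a compound key gives a compound ordering for free.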
    

    Associations work as expected but you must remember to add the properties required for an association (we’ll make this automatic soon):

    
    class House < CouchFoo::Base
      has_many :windows
    end

    class Window < CouchFoo::Base
      property :house_id, String
      belongs_to :house
    end
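
The reason the house_id property has to exist is easier to see with the association stripped down to plain Ruby. The structs, ids and lambdas below are illustrative only and say nothing about how CouchFoo implements associations; they just show the lookups that has_many and belongs_to conceptually boil down to.

```ruby
House  = Struct.new(:id)
Window = Struct.new(:id, :house_id)

houses  = { "house-1" => House.new("house-1"), "house-2" => House.new("house-2") }
windows = [
  Window.new("window-1", "house-1"),
  Window.new("window-2", "house-1"),
  Window.new("window-3", "house-2")
]

# has_many :windows boils down to "all windows whose house_id is mine".
windows_for = ->(house) { windows.select { |w| w.house_id == house.id } }

# belongs_to :house boils down to "the house whose id matches my house_id".
house_for = ->(window) { houses.fetch(window.house_id) }

p windows_for.call(houses["house-1"]).map(&:id) # => ["window-1", "window-2"]
p house_for.call(windows.last).id               # => "house-2"
```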
    

    There are a few bits left to tidy up (as noted in the readme) but generally speaking it’s now ready for use by others. Grab it on GitHub and feel free to fork and send me pull requests.

    And now to do something I’ve not been doing a lot of lately, spend some more time on the Couch…


  • With the recent announcement that Rails and Merb will merge, and my preference for DataMapper, I decided to plug DataMapper into Rails for my next freelance project. The theory goes that this should make the upgrade path to Rails 3 a lot simpler!

    It’s currently possible to use DataMapper with Rails, heck even DHH himself commented so, but it’s not quite as easy as using ActiveRecord. After a quick Google I only ran into questions about how to do it, not a howto guide. So I set out to make my own - it really was quite simple in the end:

    
    sudo gem install addressable data_objects do_mysql
    # do_mysql can be changed for do_postgres or do_sqlite3 as appropriate
    sudo gem install dm-core dm-more
    

    In the dm-more GitHub repo there’s a folder called rails_datamapper, which is a plugin for Rails that adds DataMapper support. This doesn’t install with the dm-more gem so it’s a case of cloning the git repository and copying the folder to your Rails project:

    
    git clone git://github.com/sam/dm-more.git
    # Copy the plugin into your Rails project (adjust the path to suit)
    cp -R dm-more/rails_datamapper path/to/your/rails_app/vendor/plugins/
    

    Then edit your project environment.rb file and add the following lines:

    
    # Load the required gems in the correct order
    config.gem "addressable", :lib => "addressable/uri"
    config.gem "data_objects"
    config.gem "do_mysql"
    config.gem "dm-core"
    
    # Make datamapper load first as some plugins have dependencies on it
    config.plugins = [ :rails_datamapper, :all ]
    
    # Remove ActiveRecord if you no longer need it
    config.frameworks -= [ :active_record ]
    

    The connection to the database will be made by the rails_datamapper plugin using your database.yml configuration file. You’ll need to use a slightly different format for DataMapper:

    
    development:
    :repositories:
    :adapter: mysql
    :database: opnli_dev
    

    Or alternatively you can write your own initializer and forgo the Rails plugin:

    
    # Read the per-environment settings and hand the current environment's to DataMapper
    hash = YAML.load_file(File.join(RAILS_ROOT, "config", "database.yml"))
    DataMapper.setup(:default, hash[RAILS_ENV])
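
To see what that initializer ends up passing along, here is a standalone sketch that parses a made-up database.yml and picks out one environment's settings. The YAML contents and the hard-coded `env` value are assumptions for illustration (plain string keys are used to keep the parsing simple), and the DataMapper call is left as a comment so the snippet runs without the gem installed.

```ruby
require "yaml"

# Hypothetical database.yml contents, inlined so the sketch runs without a Rails app.
yaml_text = <<~CONFIG
  development:
    adapter: mysql
    database: opnli_dev
  test:
    adapter: sqlite3
    database: ":memory:"
CONFIG

config = YAML.safe_load(yaml_text)

# Stands in for RAILS_ENV.
env = "development"
settings = config.fetch(env)

p settings # => {"adapter"=>"mysql", "database"=>"opnli_dev"}

# In the initializer this hash is what would be handed over:
# DataMapper.setup(:default, settings)
```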
    

    The only real gotcha in using DataMapper is that some Rails plugins assume you’re using ActiveRecord. Hopefully this won’t be the case in the future, but for now you’ll need to get forking!


About me

I’m George Palmer and rowtheboat.com is my personal blog. I’m a freelance developer living and working in London.

Hire me

I develop Rails and Merb applications with a special interest in scaling and cloud computing.
