AB Testing with Google Analytics

Written on January 12, 2010 by george

I love AB testing. I think it’s either because I did a year of Maths at university before switching to Computer Science, or because human psychology fascinates me. Either way, when I launched 5ft Shelf I was keen to test lots. First on the agenda was the default shelf view. For those not familiar with the site, you can view a shelf of books in one of two ways – cover view or spine view, as we call them on the site (see screenshots). Whilst my designer was sure that the cover view would be better, I knew the target audience wasn’t necessarily technical, and besides, the cool factor of a 3D book shelf would capture attention. It was going to be an interesting AB test.

Cover view preview

The spine view

Traditionally when doing AB testing I’ve added the logic to the Rails model (it doesn’t really matter if you’re not a Rails developer, the point is that it lived in the application database). This had a few disadvantages. Firstly, it didn’t keep the code or the database as clean as they could be. Secondly, tools then had to be built to analyse the data. Thirdly, the results couldn’t easily be plotted against relevant variables such as those stored in analytics software.

So when Google announced custom variable support in Analytics I sat up and paid attention. By passing through my AB test number to Google Analytics I can do all the reporting associated with AB testing outside my application database and report against more complex metrics that I don’t store such as bounce rate. Perfect, so let’s get started:

The first step is to assign each visitor to your site an AB test number. I went for a random number in the range 1 to 120. Why so large? Well, most AB tests I perform have just 2 or 3 options. Depending on traffic you can go much larger, but to be honest it becomes harder to understand the factors that lead to the user’s choice if you do that. Anyway, 120 is 2*3*4*5, which means you can have 2, 3, 4 or 5 options (or any other divisor of 120, eg 6, 8, 10…) and still split the numbers evenly between them. This gives plenty of options for the future. So every visitor that comes to 5ft Shelf gets a random number, and it remains with them (unless they clear out their cookies).
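
To make the arithmetic concrete, here is a minimal Ruby sketch of the idea. The helper names (`assign_ab_number`, `bucket_sizes`) are made up for illustration and are not from the 5ft Shelf code; only the 1 to 120 range comes from the description above.

```ruby
# Hypothetical helpers; only the 1..120 range comes from the post.
AB_RANGE = (1..120)

# Pick a random AB test number for a new visitor.
# In a real app this would be stored (e.g. in a cookie-backed preference).
def assign_ab_number(rng = Random.new)
  rng.rand(AB_RANGE)
end

# Count how many of the 120 numbers fall into each bucket
# when the number is taken modulo the option count.
def bucket_sizes(options)
  AB_RANGE.group_by { |n| n % options }.transform_values(&:size)
end

# Divisors of 120 give buckets of identical size.
[2, 3, 4, 5, 6, 8].each do |options|
  puts "#{options} options: bucket sizes #{bucket_sizes(options).values.uniq.inspect}"
end

# 7 does not divide 120, so the buckets come out uneven.
puts "7 options: bucket sizes #{bucket_sizes(7).values.uniq.sort.inspect}"
```

Any option count that divides 120 gives every option the same share of the number range, which is the reason for picking 120 in the first place.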

I then pass this number through to Google Analytics via a custom variable. For example:

var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));

try {
  var pageTracker = _gat._getTracker("UA-11028671-1");
  // Slot 1, test name, this visitor's AB value, scope 1 (visitor level)
  pageTracker._setCustomVar(1, "shelf_view", "USERS_AB_NUMBER", 1);
  pageTracker._trackPageview();
} catch(err) {}

The USERS_AB_NUMBER needs replacing with the user’s actual AB number. In Ruby I used:

@site_preferences.ab_test_number.modulo(2)

as I was only testing two options. You can substitute this code as appropriate for the number of AB choices you have (and to your language of choice if not using Ruby).

A few notable things:

  • The first argument is the custom variable index number. Google Analytics provides you with 5 of these, so you can feasibly track up to five AB tests at once.
  • The second argument is the name. I use this to hold the name of the AB test for easy identification in Analytics at a later date.
  • The third argument is the value we want to store against the name. I used the AB number modulo 2, as that is what the application’s view uses to decide which option to display. If you use more complex logic you should substitute it here. You’re not limited to numbers though – you can also store strings.
  • The last argument is the scope of the variable. For AB testing you’ll nearly always want 1 for visitor level. More on this can be found in the Analytics help centre if you need it.
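
If you generate the tracking snippet server-side, the four arguments can be wrapped in a small helper. This is only a sketch: `set_custom_var_js` and the `SCOPES` table are hypothetical names, and the scope codes assume the ga.js convention of 1 for visitor, 2 for session and 3 for page.

```ruby
require "json"

# Scope codes as used by ga.js custom variables (assumed: 1 visitor, 2 session, 3 page).
SCOPES = { visitor: 1, session: 2, page: 3 }.freeze

# Hypothetical helper: builds the _setCustomVar call for one AB test.
def set_custom_var_js(index, name, value, scope: :visitor)
  raise ArgumentError, "index must be between 1 and 5" unless (1..5).cover?(index)

  scope_code = SCOPES.fetch(scope) { raise ArgumentError, "unknown scope #{scope}" }
  # to_json quotes and escapes the name and value as JavaScript string literals.
  "pageTracker._setCustomVar(#{index}, #{name.to_s.to_json}, #{value.to_s.to_json}, #{scope_code});"
end

ab_number = 37 # example value; in the app this comes from the visitor's stored number
puts set_custom_var_js(1, "shelf_view", ab_number % 2)
```

Validating the index up front means a sixth concurrent test fails loudly in the application rather than silently going unrecorded.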

Note: you could pass the raw AB test number through instead, but you would then have to combine the entries for each number into sets yourself (two sets in this example). This is a real pain, as one set will be 1, 3, 5, … 119 and the other 2, 4, 6, … 120, so unless you like maths it’s best to let your application do the logic and keep the data in Analytics simple.

Now that we’ve passed the data through to Google Analytics, we need application-specific logic in the views to render the different AB test options (again, substitute for your test setup and language):

  if @site_preferences.ab_test_number.modulo(2) == 0
    # Render option 1
  else
    # Render option 2
  end
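
For tests with more than two options, the same modulo trick can pick a named variant. A rough sketch follows; `variant_for` and the variant names are made up for illustration.

```ruby
# Hypothetical variant names for the shelf view test.
SHELF_VIEWS = %w[spine cover].freeze

# Map a visitor's AB number (1..120) onto one of the named variants.
# The variant count must divide 120 so every variant gets an equal share.
def variant_for(ab_number, variants)
  unless (120 % variants.size).zero?
    raise ArgumentError, "variant count must divide 120 for an even split"
  end

  variants[ab_number % variants.size]
end

puts variant_for(42, SHELF_VIEWS)
```

Since custom variable values can be strings, the returned name could also be passed to Analytics in place of the raw modulo result, which makes the report easier to read.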

Custom report in Google Analytics

Finally we need to create a custom report in Google Analytics. Log in and click ‘Custom Reporting’ down the left hand side. Then click ‘Manage Custom Reports’ underneath and then ‘Create new report’. You can then drag the metrics you wish to measure along the top (think about what would constitute success in your test – longer page views, lower bounce rate etc) and the dimensions down below (you’ll probably only want the Custom Variables here). Make sure you pick the Custom Variable values, not the keys. The screenshot shows how it looks.

Bounce Rate by Shelf View

Then it’s just a case of waiting for however long you want to run your test and logging into Analytics to see the results. You’ll hopefully end up with a chart looking something like the one in the screenshot. Note that the real data you’re after is not in the graph but in the table below it.

So for those of you who asked why 5ft Shelf switched to spine view as the default, there’s your answer :D Happy AB testing folks.

5ft Shelf

Written on October 28, 2009 by george

For the last six months I’ve been alternating between freelancing and working on my own personal projects. It’s one of the main reasons I went freelance, so it was great to finally make the time to do it. In particular though, there was an itch I really wanted to scratch. I first learnt of the Harvard Classics a few years ago and since then the idea has really struck a chord with me – a sort of popular list of books that one should read to have a liberal education. As I read more about the collection of books I became more interested in what the modern equivalent would be – what are the quintessential books one should read 100 years later?

Today I can announce 5ft Shelf – a site to do just that:

In 1909 Dr Eliot, then President of Harvard University, claimed a liberal education could be achieved by reading a collection of books that would total no more than 5ft in width. A local publisher challenged him to name them and he responded with what became known as the Harvard Classics.

Shifting forward 100 years and into the age of the internet, we’re trying to find out what the modern equivalent would be.  A lot has happened in the last century and we felt it would be impossible to get a fair representation of people’s interests without introducing two quintessential formats for modern living—music albums and movies.

Rather than use a single authoritative source for the modern shelf, we’ve taken the opposite approach by allowing users to create their own shelf. We then combine all the users’ shelves to find the most popular items and create our top 5ft shelf (“the ultimate shelf”).

The site offers a little more than the introduction alludes to. Amongst other features, mini-shelves contain the most popular items in a given subject area and the recommendation algorithm makes personalised recommendations based on the items on your shelf. For more details read the Going Live post on the 5ft Shelf blog.

As with all projects there have been big highs and hard lows, but overall it has been great fun and scratching a personal itch was particularly satisfying. On a more technical / agile development note, I intend to publish two separate blog posts on the results of my endeavours. The first will be on Getting Real and my experiences using the 37signals development approach, along with some productivity experiments I performed myself during that time. The second will be on how long it takes to get a startup to market. I’m used to tracking time spent on projects for clients, so I did the same for this project and found some interesting results.

Look for both blog posts in the coming weeks and in the meantime please do check out 5ft Shelf and pass it along to anyone who you think might be interested.

CouchDB and ORMs

Written on March 29, 2009 by george

Alex gave a good introductory talk on CouchDB at Scotland on Rails. Towards the end of the talk he gave an overview of the current Ruby plugins/gems available for interfacing with CouchDB, one of which was my own CouchFoo. Alex’s opinion was that any ORM for CouchDB should be as thin as possible, just wrapping the Ruby to JSON object translation. I raised my opinion in the question section at the end, saying that I didn’t agree and thought the ORM should match the level of functionality available in ActiveRecord. This sparked a debate, both in the talk and via Twitter, about the best approach for a CouchDB ORM to take. As a result I agreed to write this blog post to outline my views.

CouchDB is a document orientated database with an HTTP interface, amongst other features. When I first started using it I played with the database a lot via simple interactions through curl. In the same way that I feel it is important to know SQL before using any higher level API to store and retrieve objects in a relational database, I feel it is important to understand how CouchDB works before using a library to interact with it. As with most areas of computing, you will find a range of opinions over what level you should interact with the database at – there are the purists who like to write the SQL for each database query performed, and those who are willing to sacrifice a bit of performance (maybe not having the optimum query run each time) for the time efficiencies realised whilst developing. I align quite well with the Rails mantra on this one – I’m willing to sacrifice perfect SQL each time for the efficiency gains made whilst developing. Part of Alex’s argument was that you should be as close to the database as possible because the Ruby to JSON conversion is much smaller than the Ruby to SQL conversion. Whilst I don’t disagree that it’s important to know how CouchDB works, I do disagree on the level at which any Ruby library should sit. I’m happy to pay a small price in terms of extra Ruby code executed because I want as clean a DSL as possible.

Whilst developing with CouchDB I tried all the existing Ruby libraries, and as I worked through them I ran into several issues. After using ActiveRecord’s save and find methods it was particularly annoying to use a library that used different method names for the same conceptual operations (eg get instead of find). This wasn’t a major issue of course; I just forked the library and made changes. But as time went on there were features that I missed from ActiveRecord. Validations, callbacks, finders and associations were the prime contenders. Then dynamic finders and named scopes got added to the list. In the end changing the existing libraries became so much work that I decided to start with ActiveRecord and work from there.

Of the features in ActiveRecord, associations are perhaps the most controversial when it comes to whether they should apply to document orientated databases or not. The argument goes that if you’re trying to use associations you don’t understand how CouchDB should be used. I disagree on this point – a simple counter argument is a document that allows comments. Those comments could be stored inline in the document itself or in separate documents that have a reference to their parent. This is an association whichever way you look at it. Which approach you decide to use will depend on your application and its characteristics. Incidentally, Alex’s gem did a great job of this, letting the user specify in the association whether they wanted the object stored inline or not. This has since been removed from his gem but is something that’s definitely on the TODO list for CouchFoo.
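
To illustrate the two layouts, here is a rough sketch using plain Ruby hashes in place of real CouchDB documents. The field names (`type`, `post_id`, `comments`) and the helper methods are assumptions for illustration, not CouchFoo’s API; in a real database the second lookup would typically be a view query rather than an in-memory filter.

```ruby
require "json"

# Layout 1: comments stored inline in the parent document.
inline_post = {
  "_id" => "post-1",
  "type" => "post",
  "title" => "CouchDB and ORMs",
  "comments" => [
    { "author" => "alice", "body" => "Nice post" },
    { "author" => "bob", "body" => "I disagree" }
  ]
}

# Layout 2: comments as separate documents that reference their parent.
separate_docs = [
  { "_id" => "post-1", "type" => "post", "title" => "CouchDB and ORMs" },
  { "_id" => "comment-1", "type" => "comment", "post_id" => "post-1",
    "author" => "alice", "body" => "Nice post" },
  { "_id" => "comment-2", "type" => "comment", "post_id" => "post-1",
    "author" => "bob", "body" => "I disagree" }
]

# Hypothetical accessors: either way, a post "has many" comments.
def comments_inline(post)
  post.fetch("comments", [])
end

def comments_separate(docs, post_id)
  docs.select { |doc| doc["type"] == "comment" && doc["post_id"] == post_id }
end

puts JSON.pretty_generate(comments_inline(inline_post))
puts JSON.pretty_generate(comments_separate(separate_docs, "post-1"))
```

Both accessors answer the same question ("which comments belong to this post?"), which is why it seems fair to call either layout an association.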

For me CouchDB lends itself well to two distinct domains. Firstly, domains where documents are used – that is, objects where the fields stored in the database change from one object to the next. Secondly, domains where you wish to take advantage of some of CouchDB’s features that are not present (or poorly implemented) in relational databases – an HTTP interface, fantastic scaling ability due to bi-directional replication, and its schema-free nature (see this excellent article on FriendFeed’s experience with MySQL) are just a few that spring to mind. People may use CouchDB for the second set of reasons even though their database design could be considered quite structured, and I fully expect this group of people to grow as CouchDB reaches 1.0. However, that wasn’t why I wrote CouchFoo; my project fell into the first domain. Whilst I provided a way to use ActiveRecord’s higher level API, I also provided access to a database object that allows simple storage and retrieval of documents by id. If that is all the functionality you require then I would expect CouchRest to be a better choice. However, I believe in reality you will quickly find you need to add validations to a field, or maybe add an association or two. And as soon as you start on that slope I believe CouchFoo to be a better choice.

Ultimately I created CouchFoo as I missed the richness of the ActiveRecord API. Whilst I don’t believe my library will be perfect for everyone it has received a lot of good feedback. To paraphrase DHH I didn’t create the perfect framework for everyone else, I created it for me. I only hope that other people find it useful.

SXSW

Written on March 21, 2009 by george

Thanks to a lucky draw at dConstruct last year I bagged two free tickets to this year’s SXSW. I decided to invite Jim along for no other reason than he was likely to be the closest to the event. I’d never been to Texas before and, despite hearing many bad reports, word was that Austin really wasn’t quite as bad.

And what a surprise it was – a laid back city with fairly liberal attitudes. Once I got over the English-American language barrier (swap line for queue, register for till and give me for can I have) things seemed to go well. The line up of talks was amazing – Gary Vaynerchuk was awesome although sadly I only caught the last 20 minutes (good video of him here at FOWA), Brian Brushwood did an excellent talk based on his Scam School series, James Powderly gave a fascinating talk on his graffiti art and getting detained in China, and there was an extremely useful panel on how to give good presentations. That’s one of the parts I enjoyed the most – the sheer diversity of talks. In addition there were more informal talks where the presenter spoke for 10 minutes before opening up to the room – going freelance and becoming productive were two of my favourites in this format. Of course, due to the sheer volume of talks many good ones were missed – Larry Lessig seems the prime candidate here. They’re going to make all the talks available for download so I’m looking forward to catching what I missed.

The talks are only half of Southby though. The night life is great and there are loads of parties with free drink and beer flowing. These seemed quite hit and miss, with the Digg party being awful and the queue to get a signature from Kevin Rose a really quite distressing sight. But for every flop there were some good ones that provided entertainment as wide ranging as burlesque and live Photoshop drawing.

I met plenty of new interesting people, and bumped into quite a few from England, although I know of at least two out there who I didn’t bump into all week. Other highlights included the weather, free wifi everywhere, a film called Burma VJ we randomly caught and watching England destroy France in the rugby. More random things included drive-through banks and a gig featuring a hip-hop group I strangely enjoyed. And the downers? Well, I can’t finish without digging at just how awful the all-American diet is (surprisingly I didn’t want my meal in a sea of melted cheese but gee thanks). Overall though a great experience and well worth it if you’ve never made the trip.

Using objects in models (with CouchFoo)

Written on March 21, 2009 by george

ActiveRecord allows you to serialize objects into text columns through YAML. This seems useful but in my experience is under-used. One of the primary reasons for this is it’s not possible to use the data that the object encapsulates without the ruby model. For example it’s not possible to find on the contents of that object or for that matter, modify the object with languages that lack YAML support. With CouchDB all data is stored in JSON so this is not an issue.

The project I wrote CouchFoo for used complex ACLs and I wanted to encapsulate this all in an object rather than use several many-to-many relationships and construct an ACL object based on their contents. So how do you do this with CouchFoo? Simple: any object can be assigned as a property in a CouchFoo model as long as it has a .to_json instance method and a .from_json class method. The methods do what you’d expect, for example:

class DataObjectAttributeList

  attr_accessor :attributes

  # Constructs the object from JSON
  def self.from_json(json)
    DataObjectAttributeList.new(json)
  end

  # Converts the object to JSON
  def to_json
    @attributes
  end

  def initialize(initials = {}, *args)
    @attributes = initials
  end

end

This is just a simple example storing a hash but the structure could be as complex as you’d like. In the future I plan to add inline associations to CouchFoo, so rather than have a one-to-many association where the many are accessed via a second database query you could have the objects stored as part of the parent contents. Performance wise, this is normally much more efficient (although not in all situations – eg heavy write and low read).
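
As a slightly richer illustration of the same contract, here is a sketch of an ACL-style object making a round trip through JSON. The `AccessList` class, its rule format and the `allowed?` method are all hypothetical and not part of CouchFoo; as in the example above, `to_json` is assumed to return a JSON-compatible hash, and the `JSON.generate` / `JSON.parse` calls merely stand in for what the database layer would do when saving and loading the document.

```ruby
require "json"

# Hypothetical ACL object: maps a user name to the actions they may perform.
class AccessList
  attr_reader :rules

  def initialize(rules = {})
    @rules = rules
  end

  # Constructs the object from parsed JSON (a hash of user => actions)
  def self.from_json(json)
    new(json)
  end

  # Returns the JSON-compatible structure to store in the document
  def to_json(*_args)
    @rules
  end

  # True when the given user is allowed to perform the action
  def allowed?(user, action)
    @rules.fetch(user, []).include?(action.to_s)
  end
end

acl = AccessList.new("george" => %w[read write], "guest" => %w[read])

# Simulate saving to and loading from the database.
stored = JSON.generate(acl.to_json)
restored = AccessList.from_json(JSON.parse(stored))

puts restored.allowed?("george", :write)
puts restored.allowed?("guest", :write)
```

Because the stored form is plain JSON, the rules stay readable and editable from any language that can talk to the database, not just from the Ruby model.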

Overall, this becomes a very addictive way of developing and in the same way you start to question whether you need a relational database, you start to question whether you should store associated objects inline or separately.
