Why Bother With Cucumber Testing?

The disadvantages of using Cucumber and its widespread use as a poor man's integration test.

I recently heard an account of a web development project from both the point of view of the consultancy doing the work and the client. A member of the consultancy told me how they educated a technologically backward client about agile processes. Although this might be tedious for the client initially, with time the client would appreciate their wisdom. Glowing business referrals would follow forever more.

A few days later I bumped into the client:

“[X] were a nightmare to deal with. They wouldn’t let me have what I wanted, and they wasted my time arguing over petty details. Although they did the job well, they had no business sense. I can’t see them lasting much longer.”

The consultancy screwed up: Amongst other things, they pushed the client to use a process that wasn’t appropriate. In particular, they gave the client Cucumber feature files to read and approve, even though the client didn’t give a damn. In the client’s words “all we wanted was a website”. (It was actually a web app but most non-technical people don’t draw a distinction). The client just wanted it “to work”.

Imagine You Are The Client

Step out of your programmer skin for a moment and pretend you are a busy business owner while you read the following:

Feature: In order to let customers organise their information across themes on various pages.  
         As an administrator of a micro-site  
         I want to be able to add subpages  
Scenario: Adding a subpage  
  Given I am logged in  
  Given[sic] a micro-site with a home page  
  When I press "Add subpage"  
  And I fill in "Title" with "Gallery"  
  And I press "Ok"  
  Then I should see a document called "Gallery"  

Scenario, but not feature description taken from You’re Cuking It Wrong. (Note that Jonas doesn’t give his clients Cukes to read.)

According to Jonas Nicklas at E-labs, the above is an example of acceptable Cucumber style, written in the language of stake-holders. As a past consumer of development services, I have to disagree with him. The appropriate level of detail here is this:

  I can add subpages to my micro-site.

Any more is superfluous. If it takes you ten lines to communicate the idea of adding subpages, then you’ve wasted my time. I’m not alone in thinking this. BDD expert Elizabeth Keogh tells us:

“If your scenario starts with ‘When the user enters ‘Smurf’ into ‘Search’ text box…’ then that’s far too low-level. However, even “When the user adds ‘Smurf’ to his basket, then goes to the checkout, then pays for the goods” is also too low-level. You’re looking for something like, ‘When the user buys a Smurf.’”

If you take this view, then the overwhelming majority of the Rails community have been using Cucumber incorrectly. Despite believing otherwise, most programmers never wrote a single acceptance test; instead they wrote integration tests using the Cucumber syntax.

This is a damning claim and so I offer evidence. In my experiences both as a Ruby contractor and as an employer of programmers, most feature files I’ve seen are composed of the web steps included with Cucumber by default. These web steps are integration tests in disguise. Cucumber steps such as “When I fill in ‘search’ with ‘Smurf’” are rampant in the Rails community, despite their position at the most fault-worthy level in Elizabeth Keogh’s schema.

My own anecotal experience isn’t enough so I did some research. I browsed through the feature files in six major open source Rails projects, including Spree, Radiant and Diaspora. All bar one, Tracks, wrote their feature files near exclusively in web steps. In effect they wrote integration tests using the Cucumber syntax.

The Make-Believe Analyst

Let’s consider the above feature file again, imagining that you are the client. You run a business and so are conscious of costs. You might reasonably ask what’s with all this ‘in order to’ stuff? Why is this developer playing make-believe analyst? Doesn’t he know that you’ve already communicated that you’ve determined the business value when you approved development? And, most importantly of all, is the developer charging you for this work?

The Cucumber way says you sit with your client and determine, feature by feature, the business value that each piece of functionality serves. That’s cool, but it isn’t a realistic job description for many programmers. High level analysts and consultants might sometimes do this, but the likelihood is that you aren’t acting as a high level consultant or analyst on your current project, and so Cucumber is inappropriate. Not only this, but using Cucumber is downright wasteful considering its cost.

The Cost of Using Cucumber

1. Cucumber breaks text editors

Text editors, like VIM, multiply productivity. Auto-completion eases the use and reuse of descriptive method names. Compilation checks prior to saving catch syntax errors before they cause harm. The editor’s ability to understand the signature of a method makes tasks such as finding method definitions and uses simple.

Cucumber’s steps are method names written using the Gherkin syntax, and this unique syntax breaks text editors. Steps contain white space, include their parameters at non-standard locations (When “john@gmail.com” has “4” unsent messages), and use regular expressions for pattern dispatch (“and or”). Text editors are not adapted to deal with this, and so auto-complete, search and many other features break, damaging productivity.

2. Cucumber Requires Maintenance of a Second Testing Environment

Anyone thinking of using Cucumber for acceptance tests most likely already unit tests using something like Test::Unit or Rspec. When a project’s complexity grows, we organise test suites by placing shared test code into helper methods and eventually into modules shared across many test files. We also categorise tests using tagging systems, use tools like Spork to speed up test startup time, and use watchr or autotest to run tests automatically. We add gems to the test environment to make use of advanced helper methods that freeze time, open up emails, or fake web requests.

Cucumber doesn’t like to share. It does not pick up existing test configurations or helper methods. Our taxonomy of tags carries no weight and so we must mirror our existing setup in our new Cucumber world. If we are undisciplined about refactoring we might duplicate code, denormalizing our code base. Even if we are disciplined, we will, at the very least, increase the complexity of our project and thus the scope for error. The parallel worlds of Cucumber and our unit tests then need to be maintained, and the creators of testing gems must now care for two masters. Cucumber takes another toll on the community’s productivity.

3. Cucumber’s Routing Causes Cognitive Strain

Cognitive strain refers to the total weight of facts and rules we must hold in our minds to be productive using a technology. Rails imposes a high cognitive strain. When I began using it I faced a period of difficulty as I untangled the labyrinthine naming patterns of routes, the pluralisation conventions, and the subtle distinctions between various ActiveRecord::Base persistence methods. I’m not saying that cognitive strain is a bad thing; it is often a necessity in powerful tools. I am saying that cognitive strain must be justified.

In Cucumber steps, the mention of a specific web page, for example “the login page”, requires us to map this stake-holder description to a url helper method our app understands.

    when /the login page/

The Rails routing system is complex: in The Rails Way Obie Fernandez suggested that you could squeeze everyone who understands it into one taxi. Cucumber adds an abstraction layer over this already complex system, and this leads to slower development and increased error on larger projects.

A problem I faced when adding new features was that I would unintentionally describe paths using differing natural language descriptions. Perhaps I’d call “the login page” the “the sign in page” one day, and “the log in page” another. When I did this Cucumber complained that it could not find the permuted path name, and so I had to look inside the paths.rb file to remind myself of my previous phrasing. That’s a lot of remembering, and a lot of room for error and slowdown.

I’ve experimented with removing the paths.rb file and writing logic that automatically determines the route based on humanized url helpers. This leads to unacceptable step definitions such as “And then I am on the new user session page”. This is far too close to implementation language to be understandable to the average stakeholder, and so this option must be avoided if we’d like to use Cucumber in the way it was intended.

4. Cucumber’s Organisation Defaults Are Impractical

My favorite feature of Rails is one of its simplest: everything has a place. Mailers go in one folder, models go in another and configurations go somewhere else. Presuming that a project sticks to convention, you can find the source code for any function effortlessly. This brings enormous productivity advantages, especially on projects you inherit from other programmers.

Cucumber asks you to create step definitions for any custom steps you use. The convention is to keep these steps in a file {feature_name}_steps.rb. This convention makes sense on tiny projects, and step definitions are easily found. Problems start once you need to reuse step definitions across many features and you want to keep code DRY. Two things typically happen:

  1. If you’ve got a good memory you find the old step definition and use that method again in your new feature. This can be a terrible move. The shared helper’s presence in promote_post_feature_steps.rb file suggests exclusive connection with that feature. If you later remove the promote_posts feature file you will probably remove its step definition file too, having forgotten that it contained global step definitions. Ideally you should have extracted these shared step definitions to a global step definition file, but realistically this doesn’t always happen. We’ve all got deadlines.

  2. If you don’t have a good memory, or it has been a long time since you last visited the project, you might have forgotten that you previously wrote a step definition for promoting a post. Alternatively you might never have created a step definition but another programmer had done so without you knowing. You write a fresh step definition in taxon_steps.rb, but now your code is no longer DRY. At a later date you might be refactoring the promotion logic and make the changes in promote_post_feature_steps.rb, and then assume you had finished refactoring. The taxon_steps.rb now contains an out of date version of the logic for this step, and this will lead to confusing test failures, since you have no good reason to suspect an issue with the recently refactored promotion logic. Cucumber’s default way of organising features sets you up for these difficulties, and it takes lots of discipline to avoid them.

5. Increased Wordiness

Do you consider one of Ruby’s advantages its brevity compared to other languages such as Java? Do you prefer Sass, Coffeescript and Slim to CSS, Javascript and ERB for the same reason? Then why do you persist in using Cucumber?

Method names in Cucumber need twice as many characters as their plainer Capybara equivalents. Compare:

Given I am on the home page #27 characters
visit root_url #14 characters

Increased wordiness reduces expressiveness and power, and introduces error since more characters means more places a typo can appear.

6. Cucumber Syntactically Discourages Code Reuse

Cucumber users tend to write their step definition in terms of other step definitions. The library creators have implicitly encouraged this by creating simple means for doing so. Here’s an example:

When /^I toggle the full sample on the “(^”.*)” product/ do |product_name|
  steps %Q{
    When I am on the last upload page for “#{product_name}”
    And I follow "full_samples"
    And I press "toggle_full_sample"
    When I refresh the page

This is an abomination, albeit one that is understandable. A developer wants their code to remain DRY but they don’t want the added hassle of encapsulating the step definitions in Ruby methods. So, they reuse their step definitions.

But consider this: When you are inside a step definition you are exclusively in the domain of the programmer. The stake-holder never reads this. With this in mind, wouldn’t it be advantageous to use the brevity, precision, composability, editor support and abstraction tools of the full Ruby programming language? Using Ruby, you will be able to do things like this:

When /^I toggle the full sample on the “(^”.*)” product/ do |product_name|
  admin do
    should_see(sample_path, "features")

The reuse of step definitions within other step definitions, and the composition of higher level steps using the web steps included with Cucumber, are, in my opinion, the main reasons why so many companies attempt integration tests only to abandon the effort within a few months. The tests become too difficult to maintain and get out of sync. The solution to this pain point isn’t simple, but the most sensible course involves use of more powerful tools of abstraction – such as the full Ruby programming language over awkward step definitions.


Cucumber has its uses, principally as a high level analysis tool on large, polyglot projects. That said, few programmers work in this kind of position, and acceptance tests beyond a list of the method names of regular integration tests seems wasteful. Cucumber, as used by the majority of Rails programmers, is no more than a clumsy wrapper over basic integration tests. The differences are not just cosmetic: Cucumber’s syntax is costly, both to the programmer and to the client, whose time and money are wasted. Furthermore, the use of Cucumber in open source software intended for technical users and its use in solopreneur efforts is downright ridiculous. Nevertheless, programmers continue to use Cucumber inappropriately.

Why not admit to yourself that you don’t do acceptance testing and that you do not need it in your projects? Swap Cucumber for pure integration tests using Capybara, and you’ll be surprised by how much more productive you can be.

More Articles: Click here for full archive

Debugging Rails with Operating System Tools

Unix/Command-Line level tools for making web-developers' lives easier– lsof, nslookup, gdb, tcpdump, ctags, ack, strace, and top

A Comprehensive Guide To Debugging Rails

Feedback systems for finding errors in your web application.

Debugging Rails with Logs

Everything you ought to be looking at when interpreting the Rails built-in logs. Also looks at other logs, such as Amazon S3 access logs and scheduler logs.