07.08.08
Accept The Facts: Your Test Suite is a Roach Motel
Adam @ Heroku recently wrote a blog post about his experience with a bug in rush. In it he writes:
“A bug is a situation where the code has been specified to behave one way, and instead it behaves another. It does not include things that users or developers may like the software to do, but that it is not specified to do currently.”
It’s not clear to me whether Adam is arguing that the bug he hit was in his product code or in his spec.
I would argue the bug was not in the product code. Using Adam’s words - since the failing line had no spec it had “no functionality that you can rely on.” This is what he observed.
The code was correct as per the spec - the bug was the missing spec.
On that specific line of code Adam was unintentionally practicing HDD (Hope Driven Development).
Adam makes good points about the benefits of BDD/TDD and the importance of using tools like rspec.
Reading Adam’s experience brought out a rant I let fly once in a while. This rant isn’t about Adam or his recent experience.
It’s about a trend in software development that has been grinding on me for years.
We distrust product code while blindly trusting test code.
That is a huge problem.
The “blindly trusting test code” part. Go ahead and distrust product code all day long. That’s what it’s there for.
Of the two code bases, product and test, only one of them has any test coverage - and yet it’s the other that we treat as golden.
Under BDD/TDD this makes no sense. We know (in the “deep in my heart I truly believe” sense) that BDD/TDD reduces the number of bugs in our product code. It’s a core belief. We wouldn’t do it otherwise.
I have a few other core beliefs:
- You can’t find bugs with missing specs.
  - Stumbling across a bug is dumb luck (see: Hope Driven Development).
  - What is the process for identifying missing specs? Does using BDD/TDD suddenly make you smarter or more creative? Are you able to divine user scenarios you didn’t know of before? Maybe a few - but generally we all still live within the same limitations of vision, insight and experience that we had before.
- You won’t find the right bugs with wrong or broken specs.
  - “Duh.”
- BDD/TDD practitioners have more test bugs than product bugs.
  - Assertion: BDD/TDD advocates write twice as much test code as product code.
  - Assertion: Everyone’s “defects per LOC” rate stays consistent for that individual, regardless of whether they are writing product or test code.
  - If both assertions hold, the test code carries roughly twice as many defects as the product code.
- 100% code coverage doesn’t mean that much.
  - Without full coverage you know you have missing tests. But code that succeeds in one scenario, with one set of data, in a single story proves nothing beyond exactly that. 100% coverage is not the goal - it’s the starting point.
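To make the coverage point concrete, here is a minimal sketch in plain Ruby (no RSpec, so it runs on its own). The `average` method and its data are hypothetical, invented purely for illustration. The single check executes every line of the method, so a line-coverage tool would report 100%, yet two perfectly ordinary scenarios still go wrong.

```ruby
# Hypothetical product code. Every line of this method is executed by the
# single check below, so line coverage for it would be 100%.
def average(values)
  total = values.sum
  total / values.size
end

# The one "spec": a single scenario with a single set of data.
raise "expected 4" unless average([2, 4, 6]) == 4

# Scenarios the fully covered code still gets wrong:

# Integer division silently drops the fraction (the true mean is 1.5).
truncated = average([1, 2])

# An empty list divides zero by zero and raises instead of returning a value.
empty_failed = begin
  average([])
  false
rescue ZeroDivisionError
  true
end

puts "average([1, 2]) returned #{truncated}"
puts "average([]) raised ZeroDivisionError: #{empty_failed}"
```

Nothing in the coverage report hints at either problem; only a spec someone thought to write would catch them.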
Your test suite is a roach motel. It’s filled with bugs and there isn’t much any of us can do about it until we change the way we approach specs.
Change is good!
When we move to BDD/TDD we are shifting a portion of our HDD from the product code to the test code. It turns out that in doing so we also shift a good portion of the bugs from the product to the tests as well - that is great! Keep on doing that.
But we need to stop putting so much faith in our specs and start thinking about how we can improve them.
We need to take a hard look at where specs come from and find a better way.
User stories and scenarios, customer requirements and our imaginations are good - but they are bound by our own personal (or pair, or team) limitations.
There’s Patterns in Them-Thar Hills
Take debit/credit as an example. This is a very common pattern used in almost every non-trivial system.
The RSpec homepage uses it as its Story example (banking domain). Sites like Digg use it for digging/burying. Shopping carts use it to add and remove items.
Any time you want to allow a balanced change (accounts stay balanced throughout the transfers, burying and digging have balanced effects and prevent duplication, etc.) you are in this pattern.
If 10 people sat down to spec it, they would come up with 10 different sets of closely related specs. Some parts would be domain specific to banking or social networking - but some would be straight-up debit/credit logic.
Some of the specs would be wrong - forget about those.
Some of the debit/credit specs would be good. Some would be very good. They would check not just happy-path balancing but also invalid sources and targets (transferring to an invalid account or voting on a non-public post). They would spec failure rollback scenarios (transactional behaviors) and check auditing. They would cover a single user in multiple sessions performing simultaneous operations.
The union of the good specs approaches the ideal debit/credit spec pattern.
In the process of approaching the ideal, the spec would become opinionated about your debit/credit design (plugins, anyone?).
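As a rough sketch of what a captured pattern might look like, here is a tiny reusable check in plain Ruby. Everything in it is an assumption made for illustration: the `debit_credit_violations` helper, the `balance`/`transfer` interface, the use of `ArgumentError` for refusals, and both toy ledgers. It covers only balancing, invalid targets and rollback; auditing and simultaneous sessions are left out. Note how the check is already opinionated: anything it is run against has to expose that interface.

```ruby
# Hypothetical shared checks for the debit/credit pattern. They assume (and so
# impose) a small interface: balance(account) and transfer(from, to, amount),
# where transfer raises ArgumentError when it refuses a request.
def debit_credit_violations(ledger)
  violations = []
  total_before = ledger.balance(:a) + ledger.balance(:b)

  # Happy path: what leaves the source must arrive at the target.
  ledger.transfer(:a, :b, 30)
  unless ledger.balance(:a) + ledger.balance(:b) == total_before
    violations << "transfer does not keep accounts balanced"
  end

  # Invalid target: the transfer must be refused...
  source_before = ledger.balance(:a)
  begin
    ledger.transfer(:a, :missing, 10)
    violations << "transfer to an unknown account was accepted"
  rescue ArgumentError
    # Refusal is the expected outcome.
  end
  # ...and must not leave a partial debit behind.
  unless ledger.balance(:a) == source_before
    violations << "failed transfer was not rolled back"
  end

  violations
end

# Toy implementation that validates before it changes anything.
class ToyLedger
  def initialize(balances)
    @balances = balances.dup
  end

  def balance(account)
    @balances.fetch(account)
  end

  def transfer(from, to, amount)
    unless @balances.key?(from) && @balances.key?(to)
      raise ArgumentError, "unknown account"
    end
    raise ArgumentError, "insufficient funds" if @balances[from] < amount
    @balances[from] -= amount
    @balances[to] += amount
  end
end

# Toy implementation with a bug: it debits first and validates afterwards.
class LeakyLedger < ToyLedger
  def transfer(from, to, amount)
    @balances[from] -= amount
    raise ArgumentError, "unknown account" unless @balances.key?(to)
    @balances[to] += amount
  end
end

good_result  = debit_credit_violations(ToyLedger.new({ a: 100, b: 50 }))
leaky_result = debit_credit_violations(LeakyLedger.new({ a: 100, b: 50 }))

puts "ToyLedger violations: #{good_result.inspect}"
puts "LeakyLedger violations: #{leaky_result.inspect}"
```

The leaky ledger passes a happy-path-only spec without complaint. It is the rollback check, the kind of thing only some of those 10 people would think to write, that flags it.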
Let’s capture that ideal.
Let’s tell people about it.
Let’s make it easy to discover and use.
Or not.