L'etat, c'est moi
Mere Complexities sells the consulting and development services of me, Paul Wilson.
Mocks considered harmful
Some time ago I tired of using JMock. I’d come to realise that it was badly affecting my production and test code. I have since seen the same problems repeated consistently in heavily mocked code-bases: incorrect production code with passing unit tests, hard-to-change production code, functional object decomposition, too many classes, hard-to-read tests and mammoth test set-ups.
Concerning mocks, stubs, fakes, test doubles
Martin Fowler’s Mocks aren’t stubs essay clearly explains the distinction between Mocks and Stubs, and the associated “Interaction-based” versus “State-based” testing. He now uses Meszaros’ terminology, in which a “Test Double” is any object that a test substitutes for a production object and a “Fake” is a completely functional but unfit-for-production Test Double. I am not going to use Meszaros’ terminology: I’m uncomfortable with “Double” as an unobvious term, preferring “dummy object” (I would use “fake” if it hadn’t been taken), and I don’t think the distinction between stubs and fakes is important or, in practice, clearly defined.
Mocks and stubs are both “dummy objects”. The distinction is important as they tend to encourage different styles of testing and production code. With mocks you set up your expectations of their interaction with the production code, then run your test; typically the framework’s integration with xUnit then verifies that the expectations were met. Stubs are simple drop-in replacements for production objects, supplying canned answers to various operations. Mocks invariably come from mocking frameworks, while stubs are typically hand-rolled.
Mock objects come packaged with their own ideology: the plausible belief that classes should be tested in isolation and in terms of the way they interact with their collaborators. This is Interaction Based Testing.
Classically we would use production objects where possible, only stubbing out objects which live on the edge of the system, such as database access code. Tests would be written in terms of what is different after running the code under test, hence the retronym ‘State Based Testing’. Any interaction testing, such as checking parameters sent to the database, is done by recording method calls and arguments and checking the results by old-fashioned xUnit asserts.
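As a rough illustration of that last point, here is a minimal sketch of a hand-rolled recording stub in plain Java. The names `CustomerStore`, `RecordingCustomerStore` and `Registration` are invented for this example and do not come from any real project; the "database" is just an interface on the edge of the system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical collaborator living on the edge of the system.
interface CustomerStore {
    void insert(String name, String email);
}

// Hand-rolled stub: records every call so the test can inspect it afterwards.
class RecordingCustomerStore implements CustomerStore {
    final List<String> inserted = new ArrayList<>();

    @Override
    public void insert(String name, String email) {
        inserted.add(name + "|" + email);
    }
}

// Hypothetical production class under test.
class Registration {
    private final CustomerStore store;

    Registration(CustomerStore store) {
        this.store = store;
    }

    void register(String name, String email) {
        store.insert(name.trim(), email.toLowerCase(Locale.ROOT));
    }
}

class RecordingStubExample {
    public static void main(String[] args) {
        // Set up the environment.
        RecordingCustomerStore store = new RecordingCustomerStore();
        Registration registration = new Registration(store);

        // Run the code under test.
        registration.register(" Ada ", "Ada@Example.COM");

        // Check what was sent to the "database" with a plain assertion.
        if (store.inserted.size() != 1
                || !store.inserted.get(0).equals("Ada|ada@example.com")) {
            throw new AssertionError("unexpected inserts: " + store.inserted);
        }
        System.out.println("recorded: " + store.inserted);
    }
}
```

In a real suite the final check would be an ordinary xUnit assert; the explicit `AssertionError` only keeps the sketch free of dependencies.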
Mocks make hard-to-follow tests
Back in my mockist phase I would often return to some tests and be surprised at how much difficulty I had following my own test code, despite the (specious) elegance of JMock’s fluent interface. The unnatural ordering of actions in a mock test doesn’t help: a mock test inverts the order of cause and effect. In a classic test the order is
- set up the environment
- run the test
- check what happened was what we expected to happen
In a mock test it goes
- set up the environment
- set up the checks for what will happen (expectations)
- run the test
Putting the “assertions” before running the test violates the way my brain expects the universe to operate: I’d normally expect to check what happened after it occurred. Setting up the environment also often merges into the test expectations, blurring the purpose of the test. Mock tests tend to come with a large amount of setup code, the sheer volume of which hampers readability.
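The two orderings can be put side by side in a small sketch. This deliberately does not use JMock’s API: `ExpectingMailer` is a tiny hand-rolled stand-in for what a mocking framework does, and `Mailer`, `RecordingMailer` and `Welcomer` are likewise hypothetical names made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical collaborator.
interface Mailer {
    void send(String to, String subject);
}

// Hand-rolled stand-in for a framework mock: expectations first, verify last.
class ExpectingMailer implements Mailer {
    private final List<String> expected = new ArrayList<>();
    private final List<String> actual = new ArrayList<>();

    void expectSend(String to, String subject) {
        expected.add(to + "|" + subject);
    }

    @Override
    public void send(String to, String subject) {
        actual.add(to + "|" + subject);
    }

    void verify() {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }
}

// Classic recording stub: no expectations, just a log to assert against.
class RecordingMailer implements Mailer {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(String to, String subject) {
        sent.add(to + "|" + subject);
    }
}

// Hypothetical production class under test.
class Welcomer {
    private final Mailer mailer;

    Welcomer(Mailer mailer) {
        this.mailer = mailer;
    }

    void welcome(String address) {
        mailer.send(address, "Welcome");
    }
}

class TestOrderingExample {
    static void classicTest() {
        RecordingMailer mailer = new RecordingMailer();       // set up the environment
        new Welcomer(mailer).welcome("a@example.com");        // run the test
        List<String> want = Collections.singletonList("a@example.com|Welcome");
        if (!mailer.sent.equals(want)) {                      // check what happened
            throw new AssertionError("unexpected mail: " + mailer.sent);
        }
    }

    static void mockStyleTest() {
        ExpectingMailer mailer = new ExpectingMailer();       // set up the environment
        mailer.expectSend("a@example.com", "Welcome");        // set up the checks
        new Welcomer(mailer).welcome("a@example.com");        // run the test
        mailer.verify();  // a framework would normally trigger this for you
    }

    public static void main(String[] args) {
        classicTest();
        mockStyleTest();
        System.out.println("both styles passed");
    }
}
```

Both tests check the same thing; the only difference is where the statement of what should happen sits relative to the line that makes it happen.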
While mock-code is hard to read, it is not correspondingly hard to write, and so abstruse code is encouraged.
Wrong code that passes the tests
Mock test writers use more mocks than classicists use stubs. This is partly because of the mockist philosophy of testing each class in isolation, and partly because the existence of a framework attracts its use (while stubs tend to be hand-rolled). The risk with dummy objects is that they can embed incorrect assumptions about how the real code operates. An easy example is production code assuming one-based indices while the dummy assumes zero-based.
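The index example can be made concrete with a small sketch. Everything here is invented for illustration: `Shelf`, `RealShelf`, `DummyShelf` and `Librarian` are hypothetical names, and the real collaborator is assumed to be the one-based side.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical collaborator interface.
interface Shelf {
    String bookAt(int position);
}

// Production implementation: positions are one-based.
class RealShelf implements Shelf {
    private final List<String> books;

    RealShelf(String... books) {
        this.books = Arrays.asList(books);
    }

    @Override
    public String bookAt(int position) {
        return books.get(position - 1);
    }
}

// Dummy written for the unit test: quietly assumes zero-based positions.
class DummyShelf implements Shelf {
    private final List<String> books;

    DummyShelf(String... books) {
        this.books = Arrays.asList(books);
    }

    @Override
    public String bookAt(int position) {
        return books.get(position);
    }
}

// Class under test, developed against the dummy's assumption.
class Librarian {
    private final Shelf shelf;

    Librarian(Shelf shelf) {
        this.shelf = shelf;
    }

    String firstBook() {
        return shelf.bookAt(0);
    }
}

class WrongAssumptionExample {
    public static void main(String[] args) {
        // The isolated unit test passes: the dummy agrees with the caller.
        String fromDummy = new Librarian(new DummyShelf("Emma", "Ulysses")).firstBook();
        if (!fromDummy.equals("Emma")) {
            throw new AssertionError("unit test failed: " + fromDummy);
        }
        System.out.println("unit test against the dummy passed");

        // Wired to the real collaborator, the same call blows up.
        try {
            new Librarian(new RealShelf("Emma", "Ulysses")).firstBook();
            throw new AssertionError("expected the real shelf to reject position 0");
        } catch (IndexOutOfBoundsException e) {
            System.out.println("with the real shelf: " + e);
        }
    }
}
```

Nothing in the unit test can reveal the mismatch, because the test and the dummy were written from the same mistaken belief.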
While test-driving heavily mocked code we often needed to fire up the debugger – we had a bunch of passing unit tests, but the functional tests were failing.
Mocks hamper refactoring
Mock tests concern themselves with the behaviour of the class under test – how its objects interact with its collaborators. A mock test is all about what methods are called, and with what parameters on the mocks. The tests are concerned with implementation rather than results. Common refactorings such as ‘move method’ break these tests, rather than being supported by them.
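A small sketch of the effect, under some loud assumptions: `TaxRules`, `InvoiceBefore` and `InvoiceAfter` are hypothetical, the tax rate is an arbitrary 20%, and the change shown (taxing the subtotal once instead of each line) is only a simplified stand-in for the way refactorings such as ‘move method’ alter which calls reach a collaborator. The result is unchanged, but the interactions are not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical collaborator.
interface TaxRules {
    int taxOn(int netPence);
}

// Records the arguments it was called with, and applies a flat 20% tax.
class RecordingTaxRules implements TaxRules {
    final List<Integer> calls = new ArrayList<>();

    @Override
    public int taxOn(int netPence) {
        calls.add(netPence);
        return netPence / 5;
    }
}

// Before the refactoring: tax is requested once per line.
class InvoiceBefore {
    static int total(List<Integer> lines, TaxRules rules) {
        int total = 0;
        for (int line : lines) {
            total += line + rules.taxOn(line);
        }
        return total;
    }
}

// After the refactoring: tax is requested once, on the subtotal.
class InvoiceAfter {
    static int total(List<Integer> lines, TaxRules rules) {
        int net = 0;
        for (int line : lines) {
            net += line;
        }
        return net + rules.taxOn(net);
    }
}

class RefactoringExample {
    public static void main(String[] args) {
        List<Integer> lines = Arrays.asList(100, 200);
        RecordingTaxRules before = new RecordingTaxRules();
        RecordingTaxRules after = new RecordingTaxRules();

        // A test written in terms of the result survives the change.
        int totalBefore = InvoiceBefore.total(lines, before);
        int totalAfter = InvoiceAfter.total(lines, after);
        if (totalBefore != totalAfter) {
            throw new AssertionError(totalBefore + " != " + totalAfter);
        }

        // A test written in terms of the calls made does not.
        List<Integer> expectedCalls = Arrays.asList(100, 200);
        System.out.println("before matches expected calls: " + before.calls.equals(expectedCalls));
        System.out.println("after matches expected calls: " + after.calls.equals(expectedCalls));
    }
}
```

An interaction test pinned to the per-line calls would fail after the change even though the invoice total is exactly the same.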
On a recent project the first stage of a refactoring was often a purge of the test’s mocks so that it could support the changing implementation; as a side-effect we invariably rendered the test smaller and more readable. Having to do this raises the cost of many desirable refactorings beyond what can be justified while also adding functionality. Tests should lower the cost of change, not increase it.
Mocks encourage too many classes
There is something about mock tests, often aggravated by dependency injection frameworks and a knowledge of GoF design patterns, that tends to encourage Functional Object Oriented Programming, aka Treating Your Verbs as Nouns. At the height of my mockism I was once writing some code to convert TIFFs to PDFs by shelling out to tiff2pdf. As far as I remember, the process involved a ‘LocateTiffToConvert’ object, a ‘CreateUniqueFileName’ object, a ‘ShellOutToSystemCommand’ object, a factory for configuring the objects, and I still needed something to coordinate the process¹. Each was thoroughly tested, but it’s way too much code for a simple action.
It’s not just me: I frequently see this violation of simplicity in mockist code-bases. It can make things horribly difficult to understand, both through the sheer volume of classes involved and through its tendency to obscure the domain with technical concerns.
Disclaimer
Nearly all my experience with mock-objects has been with Java and JMock. I have used Ruby’s Flexmock a bit and find it more palatable: I suspect that this is because Flexmock is really at least as much a stubbing framework as a mocking one; also coding in Ruby is always more palatable than Java.
1 I also wasn’t pairing, being the only programmer on the project. Pair programming – may help prevent you coding like a tit.



