Improve Your Tests by Breaking the Rules

Posted by Mike Pennisi

Jul 14 2014

For many developers, writing tests is a hassle that would be best put off till tomorrow. For one, nothing can compete with the direct impact of writing great application logic. No user ever shared feedback like, “The UI was really pleasant and the functional tests were well-organized and readable.” There’s not much I can say to change that.

Another common hurdle is that maintaining tests can just be a slog. Whether it’s deciphering the crazy code written by another developer or just trying to avoid writing unmaintainable logic yourself, the open-ended nature of testing has given pause to plenty of programmers.

I’m always looking for ways to streamline testing. Most of the hard work in this area is being done for me; the ecosystem of JavaScript testing tools is getting better on a daily basis. But there are certainly things we can do on an individual basis to make things easier on ourselves. I’ve come to think that certain patterns considered to be “best practices” by many JavaScript developers are irrelevant in test environments. Moreover, they can actually have a negative impact on the experience of writing tests.

In this post, I’d like to make a case for breaking these rules in your test environments. Some of this may seem a little controversial, but stick with me: I have your best interest at heart.

Use Globals

It’s commonly accepted that defining variables on the global scope is a bad practice. Doing so makes tracing dependencies difficult, which complicates module loading, and none of this helps when trying to reason about module interactions. When it comes to writing client-side applications, the browser environment is notoriously hostile. Third-party scripts come and go, and any one of them might modify global state in a way your application isn’t prepared to handle. To avoid all these hazards, common practice tells us that any coupling between modules should be coded explicitly.

One less widely recognized aspect of this approach is that explicit dependency management requires a lot of additional structure. In JavaScript, you might use AMD modules or you might use CommonJS modules, but in either case, making dependencies explicit calls for a lot of boilerplate code. Over time, I got tired of writing var assert = require('assert'); in every test file.

One day, I woke up and said, “I’m not gonna take it anymore.” I started declaring testing utilities globally left and right. expect, sinon, webdriver, and yes, even my own testRequire function all found a new home in the global scope.

The variables I’m talking about aren’t dependencies according to the traditional understanding. Sure, your tests “depend” on them in the plain-English sense of the word. But this dependence is “flat”: all your test files depend on these globals directly. This means load order is not something you really have to worry about: the test globals should always be defined before the tests execute. Because of this, I tend to view test utilities as extensions to the environment (rather than test script dependencies). In this sense, a global like assert “feels” a lot like the document object. Neither is part of the ECMAScript standard (not yet, anyway), but both expose a well-defined interface in the contexts where they are available. You’re never going to need to optimize your test files for network delivery, and collisions with third-party libraries are also unlikely (no one from another department is likely to throw a <script> tag in a file like test/unit/index.html).
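As a rough sketch of what such a setup could look like in Node.js, the file below defines the utilities once, before any test runs. The file name, the src directory, and the behavior of testRequire are assumptions made for illustration; the post does not say how the original helper works.

```javascript
// test/setup.js (hypothetical file name): load this once, before any test file.
var path = require('path');

// Node's built-in assertion module becomes part of the test environment.
global.assert = require('assert');

// Hypothetical helper: resolve modules relative to an assumed `src` directory
// so that test files do not need long relative paths.
global.testRequire = function (modulePath) {
  return require(path.join(process.cwd(), 'src', modulePath));
};

// In any test file loaded afterwards, no require() call is needed:
assert.strictEqual(typeof testRequire, 'function');
```

With Mocha, a file like this can be loaded ahead of the tests through the --require option; other runners have their own ways to load a setup script first.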

Prefer Synchronous Code

You don’t have to be writing JavaScript for long before you start using callbacks. At first, it’s probably just because the library you’ve chosen insists on providing results that way. Over time though, you come to appreciate how JavaScript’s single-threaded nature makes synchronous operations a real hassle. Whether you’re writing server-side code in Node.js or a GUI frontend for the browser, blocking operations of any sort have a way of making themselves known–everything in your program comes to a standstill until they’ve completed.

So working with callbacks (and/or Promises) makes a lot of sense: your program can continue to operate (be it serving responses or reacting to user interaction) while the long-running work is off-loaded to another process. Unfortunately, the changes required to make synchronous code asynchronous are somewhat awkward. Whether you are creating, binding, and invoking functions as callbacks, or chaining Promise objects, you are always dealing with some additional abstraction that wouldn’t be necessary in purely synchronous code. In most cases, this is a small price to pay for a snappy UI or a scalable server. But in test environments, it’s downright unacceptable.

I’ll level with you: I have not been invited to my family’s Thanksgiving dinner since making this claim. If you’re still reading, then I’m grateful for your open-mindedness. I have good reason to reject asynchronous programming in this case: it all has to do with the behavior of JavaScript test runners.

For the most part*, test runners execute tests in series. Even when you tell one of these runners that the current test is asynchronous, it will wait until the test completes before continuing with the next. The motivation here is that asynchronous operations may generate uncaught exceptions, and if the runner is to have any hope of deciding which test triggered the error, it had better be running just one at a time. This also allows your tests to temporarily “stub” global variables without worrying about interfering with other tests. (Sinon.JS is a great tool for accomplishing this, by the way.)

Test runners that operate in this fashion severely undermine the benefits of asynchronous code. This isn’t to say that you can avoid writing asynchronous tests–your application logic (i.e. the code under test) should never block, and you will be forced to deal with that. But when I’m writing test-specific code, I have no desire to keep things asynchronous. I’ll use fs.readFileSync over fs.readFile if I need to load a fixture file, and I’ll prefer testing tools with a synchronous API over asynchronous alternatives whenever they’re available.
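To illustrate the fixture case, here is a minimal sketch of a synchronous loader. The loadFixture helper and the fixture contents are hypothetical; the temporary file exists only so the example can run on its own.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Create a throwaway fixture so this sketch is self-contained. In a real
// suite the file would already live somewhere like test/fixtures/.
var fixtureDir = fs.mkdtempSync(path.join(os.tmpdir(), 'fixtures-'));
var fixturePath = path.join(fixtureDir, 'user.json');
fs.writeFileSync(fixturePath, JSON.stringify({ name: 'Ada' }));

// Hypothetical helper: read and parse a JSON fixture synchronously.
// No callback and no Promise; the value is simply returned.
function loadFixture(filePath) {
  return JSON.parse(fs.readFileSync(filePath, 'utf8'));
}

var user = loadFixture(fixturePath);
```

The asynchronous equivalent, fs.readFile, would need a callback and a way to tell the test runner when setup has finished, with no gain in speed when the tests run in series anyway.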

Thanks to ECMAScript 6 Generators, the cognitive overhead of asynchronous logic is going to be decreased still further. When that feature becomes widely available, I will likely adopt it in my tests, but until then, I’m going to stick with sync.

I get the sense you’re not impressed. You started reading this post expecting some really earth-shattering advice. Well, I’ve saved the most heretical suggestion for the very end; I just hope we can still be friends once I’m done.

* – As of this writing most popular testing frameworks (including QUnit, Buster.JS, Mocha, and Jasmine) execute tests serially. A notable exception is Vows.js, which is capable of parallel test execution.

Repeat Yourself

As programmers, we pride ourselves on finding patterns and expressing them concisely. We even have an acronym (DRY – Don’t Repeat Yourself) that espouses the ideal state of code: that it should contain as little duplication as possible.

This is all in good sense. By codifying repeated structures, developers make it easier to re-use patterns that appear across their application. It also helps avoid human error. Every time we have to re-type some tedious piece of logic, we risk introducing a typo that may cause problems later on. Much has been written on this subject; I personally recommend The Pragmatic Programmer for a great (and language-agnostic) discussion of “the evils of duplication.”

The benefits of abstraction apply to test code as well, but it’s possible to have too much of a good thing. Test suites that abstract away the tests/assertions themselves can’t really be considered “declarative”. When pushed far enough to yield procedurally-generated tests, the practice of abstraction loses its value. Here’s why:

Test code is code, so it’s tempting to design, write, and maintain it in the same way we do with application code. But test code differs in one critical regard: it does not perform a service to the end user. Instead, it is a tool intended to assist developers in writing production code. This distinction ought to have some bearing on how we approach testing–our priorities should shift slightly.

While maintainability is still important, our primary responsibility is delivering a product or service. Tests are a critical tool in achieving that goal, but they are not the goal itself. Given the choice between optimizing some aspect of my application code and some aspect of my test code, I’ll opt for the application code every time.

Readability, on the other hand, is much more important for test code. I personally believe tests should serve as living documentation of your source code. Even if you disagree, other developers will reference your tests when they are investigating bugs. In these scenarios, unraveling the conceptual overhead introduced by test code abstraction is a burden that does not benefit the developer. I would even go so far as to call it “inconsiderate” on the part of the test maintainer, but I may be over-sympathizing with the bug fixer. In any event, procedurally-generated tests impede developers looking to modify the source code.

The way you achieve maintainability is different in a test environment. We don’t (or at least, shouldn’t) spend our days optimizing and refactoring our tests as we do with our application logic. We want to write tests and move on–the best test code is the code we write once and don’t have to touch again. This contradicts how many programmers internalize the notion of “maintainability”. Instead of meaning, “easy to change”, in test contexts, the word “maintainable” ought to mean, “resilient to change”. The best way to achieve this is to express your assertions at as high a level as possible–don’t use unit tests to assert integration concerns, and never write tests against internal state or private APIs.

Alternatively, you could frame the decision of declarative-vs-generated tests as a choice between two classes of bugs: “transcription bugs” (introduced by typos/faulty copy-and-paste operations) and “systemic bugs” (introduced by software design flaws). Code with repeated declarative structures is more likely to suffer from transcription bugs while abstracted code has more room for systemic bugs. Though neither is desirable, transcription bugs are preferable because they tend to make themselves known immediately through syntax errors or reference errors. Systemic bugs, on the other hand, are more likely to yield false positives (either from faulty assertions or outright test omission). Because false positives diminish the authority of tests (tests are meant to be a project’s source of truth, after all), it only seems appropriate to reject the practice of procedural test generation.

Breaking Tradition

The accepted paradigms in JavaScript development have been hard-won over many years of experimentation and debate. I don’t mean to be dismissive of consensus; every one of these best-practices makes a lot of sense for application code. But as with any solution, once you are familiar enough with the problem it solves, you should be willing to re-interpret it in light of your own experience. Even the ideas I’ve presented above should be applied with some nuance, and I’m sure I’ll find completely different ways of thinking in the future. Although we may lapse into complacency from time to time, the process of questioning established practices should never end.
