Your ultimate guide to Unit and Integration Testing

Unit testing is great for verifying small pieces of code. But what happens when your module interacts with others, and you need to test that interaction?

Enter Integration Testing.

Integration tests come in medium and large sizes. One might exercise only a couple of classes, while another might spin up a live web server and execute REST calls against your web APIs.

Today, I’m going to talk about the latter.

Integration Testing

Full integration tests hold an allure for any application.

Most likely, your team has a set of manual tests they execute on every change, which can be tedious and time-consuming. As a developer, your job is to automate everything, so it would be nice to automate these as well.

My words of warning: be careful what you try to automate.

My Experience

I began my career at Google as a Software Engineer in Test. As part of that job, I built large scale integration test frameworks.

My first team was Webmaster Tools, a largish web property with a considerable amount of traffic. Essentially, webmasters use Webmaster Tools to control how their site appears in Google. You can use the product to remove web pages from search results and to get basic analytics about which search terms your site ranks for.

On the technology side, Webmaster Tools was a Java server talking to a BigTable database, with the Google Closure libraries on the JavaScript front end. We also had a bunch of back-end jobs to generate data and populate the database.

We were on a monthly release cycle. Code freeze would happen a week or two before launch, and QA would begin.

We had one manual tester who would play with the whole site before every single release. He would file bugs, the developers would fix them, and we’d integrate the fixes into the release branch. This process repeated until we signed off on the release, at which point we’d set the new code live, doing a standard rolling release until all of our production jobs were updated.

As a software engineer in test, my first inclination was to try to reduce the amount of manual testing going on, so I began work on a regression test framework using WebDriver (now Selenium 2.0).

Essentially, the framework would spin up a live web server with real data, and then we’d hit the server using a scripted web browser. We’d repeat the test in all the popular browsers, and the whole process was automated and run in parallel using Google’s cool test infrastructure.
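To make that concrete, here is a minimal sketch of what one of those browser-level tests might look like, using the Selenium WebDriver Java bindings and JUnit. The URL and element ID are placeholders rather than the real Webmaster Tools test code, and the actual framework handled starting the server and fanning the test out across browsers.

```java
import static org.junit.Assert.assertTrue;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

public class HomepageRegressionTest {

    private WebDriver driver;

    @Before
    public void startBrowser() {
        // The real framework started a test server and chose the browser;
        // here we just hard-code Firefox for illustration.
        driver = new FirefoxDriver();
    }

    @Test
    public void homepageShowsSiteList() {
        // Placeholder URL and element ID.
        driver.get("https://test-server.example.com/webmasters/");
        assertTrue(driver.findElement(By.id("site-list")).isDisplayed());
    }

    @After
    public void quitBrowser() {
        driver.quit();
    }
}
```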

The whole thing worked, and I was able to automate a dozen or so of the manual test plans.

Problems

Flakiness

As your test size increases, the number of failure points increases exponentially. The result is increased flakiness, or tests failing for reasons other than a legitimate failure. While our system worked, it was extremely flaky. We were using live servers with a live database, live user accounts, everything.

We had to solve all sorts of issues.

First, each test needed a unique user account to prevent corrupted data when tests ran in parallel. Fortunately, Google had a system for checking out and locking accounts.
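Google's account-locking service was internal, so the sketch below is only a hypothetical stand-in for the idea: a small pool of dedicated test accounts, handed out one at a time so parallel tests never share data.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical stand-in for an account check-out service: a fixed set of
// test accounts handed out one at a time, so two tests running in parallel
// can never write to the same account's data.
public class TestAccountPool {

    private final BlockingQueue<String> accounts = new ArrayBlockingQueue<>(3);

    public TestAccountPool() {
        accounts.add("wmt-test-1@example.com");
        accounts.add("wmt-test-2@example.com");
        accounts.add("wmt-test-3@example.com");
    }

    /** Blocks until an account is free, then locks it for the calling test. */
    public String checkOut() throws InterruptedException {
        return accounts.take();
    }

    /** Releases the account so another test can use it. */
    public void release(String account) {
        accounts.add(account);
    }
}
```

A test would call checkOut() in its setup and release() in its teardown, so the lock lasts exactly as long as the test does.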

Next, we were dealing with our QA database, which didn’t have nearly the capacity of our production systems. If the database went down, it was no big deal because the public would not see it, but this was problematic when writing tests. The database went down or experienced high latency several times a week, causing test timeouts and test failures.

Difficulty locating failures

When a test failed, we had trouble locating the cause of the failure. Being integration tests, they're exercising your entire code base, so the failure could potentially be anywhere. Granted, if you're testing the homepage, your failure will probably be isolated to the code that runs the homepage, but it could also be anywhere in the 5-10 layers that support that page.

This made debugging tests difficult. Finding a bug in a unit test is relatively easy, but finding a bug in an integration test took 2-3x as long.

Originally we ran the integration tests along with all the other unit tests in the presubmit queue. Anytime a developer tried to submit code, they had to run this entire test suite.

But the difficulty in debugging the integration tests, plus their flakiness, eventually forced us to remove them from the submission flow. They were taking too much of our developers’ time and not providing enough legitimate failures.

When can Integration Tests Work?

OK, so integration tests have problems. But in what situations will they actually work?

1) Using a mock database

One of the issues with our tests on Webmaster Tools was the unreliable link between our front ends and the testing BigTable instance.

If you can, use a fake database. Unfortunately for us, standing one up and populating it with fake data would have been too much work, so we never did, but it would have drastically reduced the flakiness of our tests.

If you can’t use a mock database, at least try to run it on the same machine as your web server. Then you won’t have the flaky network link between the two.
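As a rough sketch of the fake-database idea (using an in-memory H2 database as a generic example; Webmaster Tools was actually backed by BigTable), the test process owns the database, seeds it with known data, and never crosses a network link:

```java
import static org.junit.Assert.assertEquals;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

import org.junit.Test;

public class FakeDatabaseTest {

    @Test
    public void seededSiteDataIsReadable() throws Exception {
        // The database lives inside the test process, so there is no flaky
        // network hop and no shared QA instance that can go down mid-run.
        try (Connection db = DriverManager.getConnection("jdbc:h2:mem:webmasters");
             Statement stmt = db.createStatement()) {

            // Seed exactly the data the test expects the UI to show.
            stmt.execute("CREATE TABLE sites (url VARCHAR(255), verified BOOLEAN)");
            stmt.execute("INSERT INTO sites VALUES ('https://example.com', TRUE)");

            // A full integration test would start the web server against this
            // database; here we only confirm the seeded row is there.
            ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM sites WHERE verified");
            rs.next();
            assertEquals(1, rs.getInt(1));
        }
    }
}
```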

2) Avoiding Browser Level Tests

I would avoid browser tests whenever possible, unless they're as simple as, say, opening the homepage and verifying your logo shows up.

Browser based DOM tests are just too complicated. Timing is a huge part of them – waiting for elements to appear or text to show up after you press a button. If anything takes too long, your test will fail.
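For example, with Selenium you end up wrapping nearly every interaction in an explicit wait, and the timeout is a guess: too short and the test flakes, too long and real failures take forever to report. A small sketch (the URL and element IDs are made up):

```java
import java.time.Duration;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class DomTimingExample {

    public static void main(String[] args) {
        WebDriver driver = new FirefoxDriver();
        try {
            driver.get("https://test-server.example.com/webmasters/");
            driver.findElement(By.id("remove-page-button")).click();

            // The confirmation dialog renders asynchronously, so the test has
            // to wait for it. If the server is slow, ten seconds may not be
            // enough and the test fails even though nothing is actually broken.
            new WebDriverWait(driver, Duration.ofSeconds(10))
                .until(ExpectedConditions.visibilityOfElementLocated(By.id("confirm-dialog")));
        } finally {
            driver.quit();
        }
    }
}
```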

Differences between browsers add another layer of complexity.

If you're going to do DOM testing, it's best to keep it to JavaScript unit tests.