
TinT: Metaphors for Test Types

Testing in the Trenches (TinT) is an occasional series recounting some of the experiences I have had as a Unit Testing evangelist on various projects. Where appropriate, any references to actual details have been sanitized to protect the participants. This post is adapted from a regular email that I sent to the team, to promote and educate about testing. I secretly called these my Unit Test Propaganda Messages.

For a while now, I have been promoting among the team the idea of automated testing, especially Unit Testing, as a means to improve both our lives as developers and our software for our clients.

Recently, I was listening to a Software Engineering Radio podcast, and the speaker mentioned seven different kinds of automated testing that they do in his organization. The speaker was Mike Barker, talking about his work on the LMAX architecture, and the seven kinds of automated testing he named (around the 50-minute mark) were:

  1. Unit Tests, 
  2. Integration Tests, 
  3. Acceptance Tests, 
  4. Performance Tests (End-to-End), 
  5. Performance Tests (Micro-Benchmarks), 
  6. Static Analysis, and 
  7. Database Tests.

It got me thinking about our own environment, practices, and needs. What kinds of testing do we do? What are the benefits of automating them or not? Some of these are the responsibility of developers, others may be more for the QA team. Sometimes there is confusion about what each kind of test is and does, so let's unpack Barker's list. Since he did not go into detail about the definitions of the different types, I will add some context.

Of course, other lists of essential automated tests include different types, such as GUI Tests, Regression Tests, and Input/Output Tests. We will leave those for another day.

To help understand the differences, let's use this metaphor: imagine the system we are building is the human body. We have to assemble all of our code to create the internal structures and sub-systems, and wrap the whole thing in the publicly visible skin, hair, etc.

Unit Testing - this is the kind of test that I promote most vigorously in our team. These are developer-written and developer-maintained tests that validate the functionality at the code level. They test small units of code - a method, a function, a component, a process, object-level behavior and state. They are so-called White-Box tests, meaning the developer can use their knowledge of the inner workings to shape their tests.

Unit Tests do add a little time to the programming task, but they save time in several other ways, which speeds up the overall development process:

  1. they give quick feedback if your code is not working as expected; 
  2. they create localized regression tests that guard against broken or changed functionality; 
  3. they document existing and expected behavior;
  4. they provide confidence when refactoring toward a better design. 

A good unit-test suite gives developers confidence that their changes work and do not break or change other behavior. Because the tests are fast and targeted, developers should run them many times a day.

We use JUnit as the framework to build and run our automated unit tests. It is not the only unit-testing framework, but it is well suited to the task and is widely used in the industry.

In our human-body metaphor, unit tests would verify all of the little activities, such as: Does the index finger curl up when you contract the outer two knuckles? What are the limits of the range of motion of the thumb? Does the wrist move according to all expected degrees of freedom?
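To make the thumb example concrete, here is a minimal sketch of what such unit tests could look like. The `ThumbJoint` class, its `bendTo` method and the 0 to 90 degree range are all invented for illustration; none of them come from our code base. To keep the sketch runnable with nothing but the JDK, it uses a small hand-rolled `check` helper; in JUnit, each test method would instead carry an `@Test` annotation and use an assertion such as `assertEquals`.

```java
// Sketch only: ThumbJoint and its limits are hypothetical, invented for this example.
class UnitTestSketch {
    public static void main(String[] args) {
        bendInsideRangeIsKept();
        bendBeyondLimitsIsClamped();
        System.out.println("all unit checks passed");
    }

    // One small behavior per test: a legal angle comes back unchanged.
    static void bendInsideRangeIsKept() {
        ThumbJoint thumb = new ThumbJoint(0, 90);
        check(thumb.bendTo(45) == 45, "angle inside the range is kept");
    }

    // Edge cases: requests past either limit stop at that limit.
    static void bendBeyondLimitsIsClamped() {
        ThumbJoint thumb = new ThumbJoint(0, 90);
        check(thumb.bendTo(120) == 90, "angle above the maximum is clamped");
        check(thumb.bendTo(-10) == 0, "angle below the minimum is clamped");
    }

    // Stand-in for a framework assertion such as JUnit's assertEquals.
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("failed: " + description);
        }
    }
}

// The unit under test (hypothetical): a joint with a fixed range of motion.
class ThumbJoint {
    private final int minDegrees;
    private final int maxDegrees;

    ThumbJoint(int minDegrees, int maxDegrees) {
        this.minDegrees = minDegrees;
        this.maxDegrees = maxDegrees;
    }

    // Returns the angle actually reached, never outside [minDegrees, maxDegrees].
    int bendTo(int requestedDegrees) {
        return Math.max(minDegrees, Math.min(maxDegrees, requestedDegrees));
    }
}
```

Each test touches one object, needs no database or GUI, and runs in a fraction of a second, which is what makes running the suite many times a day practical.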

Integration Testing - these tests gather several units and test that they function together through their interfaces as expected and designed. They usually skip the user-interface portion, and directly test the integration of the components that do the work requested by the user. In our code base, too often the different tasks and units are so tangled together that it can be hard to isolate a unit to test it by itself. Some blocks of code depend too closely on the database or the GUI or other classes. As a result, we often quake fearfully in the face of the effort to create unit tests, throw up our hands, and move to a higher level of testing.

Now here is where the lines start to blur. Where I learned about testing, we used the term Functional Tests to mean more or less the same thing as Integration Tests. Somewhat more, actually: their goal was to verify not just that the components integrate, but also the end result of their collaboration.

In our human-body metaphor, integration / functional tests would verify that the fingers and thumb can work together to hold a pencil; or that lungs, jaw, tongue and lips coordinate their activity to form a certain sound.
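As a rough software counterpart to "fingers and thumb hold a pencil", the sketch below wires two real units together and checks the result of their collaboration, with no user interface involved. Every name in it (`NewsletterService`, `TemplateRenderer`, `Outbox`, `InMemoryOutbox`, the `{name}` placeholder) is hypothetical and chosen only for illustration; it is not how our newsletter code is actually structured. The in-memory outbox is an assumption that stands in for a real mail gateway so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: all classes here are hypothetical, invented for this example.
class IntegrationTestSketch {
    public static void main(String[] args) {
        // Wire real collaborators together; only the mail gateway is replaced.
        InMemoryOutbox outbox = new InMemoryOutbox();
        NewsletterService service = new NewsletterService(new TemplateRenderer(), outbox);

        service.sendTo(Arrays.asList("Ann", "Raj"), "Hello, {name}!");

        // Verify the end result of the collaboration, not each unit's internals.
        check(outbox.sent().size() == 2, "one message per recipient");
        check(outbox.sent().get(0).equals("Hello, Ann!"), "first message is personalized");
        check(outbox.sent().get(1).equals("Hello, Raj!"), "second message is personalized");
        System.out.println("integration check passed");
    }

    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("failed: " + description);
        }
    }
}

// Unit 1: fills a placeholder in a template.
class TemplateRenderer {
    String render(String template, String name) {
        return template.replace("{name}", name);
    }
}

// The interface through which the service hands off finished messages.
interface Outbox {
    void deliver(String message);
}

// Test double for the delivery side: records messages instead of emailing them.
class InMemoryOutbox implements Outbox {
    private final List<String> sent = new ArrayList<>();

    @Override
    public void deliver(String message) {
        sent.add(message);
    }

    List<String> sent() {
        return sent;
    }
}

// Unit 2: coordinates rendering and delivery for a list of recipients.
class NewsletterService {
    private final TemplateRenderer renderer;
    private final Outbox outbox;

    NewsletterService(TemplateRenderer renderer, Outbox outbox) {
        this.renderer = renderer;
        this.outbox = outbox;
    }

    void sendTo(List<String> names, String template) {
        for (String name : names) {
            outbox.deliver(renderer.render(template, name));
        }
    }
}
```

Note that the service depends on the `Outbox` interface rather than on a concrete mail class. That kind of seam is what lets the tangled dependencies described above be pulled apart enough to test.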

Acceptance Testing - these tests are based on the Requirements specified for a given sub-system. They are the checks that an end-user does to sign off that the fix or feature is what they requested or paid for. Since we developers are notorious for interpreting the requirements through our techno-centric lens, these tests double-check that the user is getting what they expect, want or need.

I have been on past projects where automated Acceptance Tests were part of the Developer's responsibility, to prove that the requirements were met, although there were usually other people who did further verification before letting the system out the door into the wild.

Also blurring the lines are what some call System Tests. System tests use a full running version of the system, including the UI (unlike Integration or Unit Tests). They may overlap Acceptance tests, but may or may not be tied as tightly to the specified requirements. Or they may test some of the more technical requirements of the system that do not show up as user-level requirements.

The original team for whom I wrote this material was scheduled to move, over the final half of the year, toward using Selenium and WebDriver to drive automated testing of the Web side of the system. The QA team and developers were to work together to define, create and maintain these tests, which would fall somewhere in the spectrum of Functional / Integration / Acceptance tests, exactly where still to be determined.

In our human-body metaphor, Acceptance tests would verify that the assembled body with all of its moving parts can work together to throw a baseball; or stand up from a sitting position.

Performance Testing (both End-to-End and Micro-Benchmarks) - these tests verify that the system meets its performance requirements. Automating them lets us monitor the state of our system easily, with less manual effort, and more regularity. However, setting up and automating a Performance test is much more complex than writing smaller, simpler unit tests.

An End-to-End test might measure a complete process, such as our current investigation into the speed of the weekly newsletter emails in the upcoming release.

A Micro-benchmark might measure performance against expectations for smaller sub-systems, such as the template processing part of the newsletter emails, or the time to display a data-intensive screen.

In our human-body metaphor, an end-to-end performance test might verify that the baseball from the Acceptance test can be thrown at a speed of at least 80 mph. A Micro-benchmark performance test might verify that the hand can scoop up and hold a minimum number of peanuts - smaller in scale than an end-to-end test, but still measured against some defined criteria.
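For a sense of what a Micro-benchmark involves, here is a deliberately naive sketch that times a small template-rendering step and compares the average against a budget. The `render` method, the template text and the 50-microsecond budget are all assumptions made up for this example. Timing with `System.nanoTime()` in a loop is only a rough approximation on the JVM, since just-in-time compilation and garbage collection skew the numbers; a dedicated harness such as JMH is the more trustworthy route for real measurements.

```java
// Sketch only: the render step and the time budget are hypothetical.
class MicroBenchmarkSketch {
    public static void main(String[] args) {
        averageNanos(50_000);                  // warm-up run, result discarded
        long average = averageNanos(200_000);  // measured run
        long budgetNanos = 50_000;             // assumed budget: 50 microseconds per render

        System.out.println("average per render: " + average + " ns");
        if (average > budgetNanos) {
            throw new AssertionError("render exceeded its budget of " + budgetNanos + " ns");
        }
    }

    // The small piece of work being measured.
    static String render(String template, String name) {
        return template.replace("{name}", name);
    }

    // Runs the work repeatedly and returns the average time per call in nanoseconds.
    static long averageNanos(int iterations) {
        String template = "Hello, {name}! Here is this week's news.";
        int sink = 0; // consume each result so the work cannot be optimized away

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += render(template, "reader" + i).length();
        }
        long elapsed = System.nanoTime() - start;

        if (sink == 0) {
            throw new IllegalStateException("render produced no output");
        }
        return elapsed / iterations;
    }
}
```

Even this toy version hints at why performance tests are harder to automate than unit tests: the result depends on the machine, the warm-up and the load at the time, so the pass/fail threshold needs far more care than a simple equality check.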

Static Analysis - these are not tests in the same way as the other categories, in that they do not run the code against some requirement. Rather, these are automated runs of code review tools. They can check against team coding standards, or against industry best practices. Tools such as PMD and CPD scan for things like empty try/catch/finally blocks or empty switch statements (possible bugs), unused variables and methods, over-complicated sections, excessively long classes or methods, duplicated code, and more. They can automatically run overnight on the day's changes, and identify places that could be improved, corrected or refactored.
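To illustrate the kind of code these tools are meant to catch, the sketch below shows a hypothetical method with two of the smells mentioned above, an unused variable and an empty catch block, next to a cleaned-up version. The method names and the "quantity" scenario are invented for this example. Both versions compile and run, which is the point: the compiler is happy with the first one, and it takes a review or an analysis tool to notice the problem.

```java
// Sketch only: a made-up parsing helper, before and after cleanup.
class StaticAnalysisSketch {
    public static void main(String[] args) {
        System.out.println(parseQuantitySmelly("abc")); // prints 0: bad input looks like a real zero
        System.out.println(parseQuantity(" 7 "));       // prints 7
    }

    // Before: works for valid input, but hides two smells.
    static int parseQuantitySmelly(String text) {
        int unused = 42; // smell: variable is assigned but never read
        int quantity = 0;
        try {
            quantity = Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            // smell: empty catch block, the failure vanishes silently
        }
        return quantity;
    }

    // After: no dead variable, and bad input is reported instead of swallowed.
    static int parseQuantity(String text) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("not a quantity: " + text, e);
        }
    }
}
```

The smelly version returns 0 for `"abc"`, so a caller cannot tell a typo from a genuine order of zero. That is the sort of latent bug an empty catch block tends to hide.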

For a lark, I ran PMD to find dead code and CPD to find duplicated code in one core package of our team's code base. The tools found literally hundreds of cases of dead code, unused variables, and copy-and-pasted code in that one package. These so-called "code smells" are places where our system is unnecessarily hard to read, understand and maintain, and are places where bugs could slip in, if they do not exist already.

Static Analysis is a little harder to fit into our human-body metaphor. It looks at the code that constitutes the system, the materials we use in building it. If our system were a house, that would be the nails and board sizes. In our human-body system, that might mean verifying that the Finger Nail component meets our system's standards of length, growth rate and position, and does not borrow excessively from the cell structures of other sub-systems, such as cells in the tongue.

Database Tests - Barker does not say anything in the podcast about his automated database tests. Too bad, because I would love to learn more. In our current projects, we do some query analysis toward improving the efficiency of some slower parts of the system, but these are generally manual one-off investigations based on customer complaints. The prospect of automating this process or some database validation checks is intriguing.

Since I can only speculate what Barker meant, and we do not have anything that I would consider an automated Database test, I am not sure how to map it to my human-body metaphor. Possibly it would verify that sensory input from the nose gets stored in the correct format and location in the brain.

In conclusion, we currently automate Unit tests, have played around with automating Performance tests, and are working toward something closer to Acceptance tests. But measured against Mr. Barker's inspiring list, there is plenty of room to grow our automated testing, with the goals of increasing the reliability, performance and features of our application, and reducing its bugs.
