Gerard Meszaros
http://testautomationpatterns.com/TestingIndirectIO.html
Automated unit tests (a.k.a. "developer tests") and functional tests (a.k.a. "customer tests") are a cornerstone of many agile development methods (such as eXtreme Programming). The availability of automated, self-checking tests allows developers to be much bolder in how they modify existing software. All software components interact with the software around them in two basic ways. The software can be a "server" to other software; that is, it can provide services to other software components through a service interface. It can also be a client, in that it uses the service interfaces of other software components. When we test a piece of software as a server, we are testing its "direct" inputs and outputs. But when we want to verify that the piece of software interacts correctly with the other software components of which it is a client, we are testing its "indirect" inputs and outputs. This pattern language describes techniques for testing these indirect inputs and outputs.
The submitted document is an introductory narrative that refers (via live hyperlinks) to detailed writeups of about 12 patterns specific to Test Doubles (in scope) as well as some other patterns and test smells (out of scope). The author proposes to use a live wiki as the primary means of communicating detailed comments between the shepherd and the author. The author is looking for both format and content validation through the shepherding process. Ideally, the shepherd will have at least some experience writing automated tests using XUnit (i.e., JUnit, NUnit, VbUnit, RubyUnit, etc.).
Underlined phrases in normal font beginning with capital letters are patterns; lowercase ones are definitions, while hyperlinks in italics refer to "test smells". Hyperlinks beginning and ending with ?-marks are links to items that have not yet been written.
This text is the introductory narrative to the Test Doubles part of a pattern language book on patterns of XUnit test automation. The actual material to be reviewed and shepherded is available on our website at http://testautomationpatterns.com/TestingIndirectIO.html; only patterns in the category http://testautomationpatterns.com/Test Double Patterns.html are in scope; all others can be ignored. The focus of this submission is only the patterns (not the definitions or smells).
Revision: 1.32 Date: 2004/04/28 03:35:18
As described in TestAutomationOverview, the SUT interacts with components through both the "front door" and the "back door". That is, software components have both an API (their front door) and make calls to the APIs of other components (their back door). Testing the interactions through the front door is the easier of the two; the test simply acts as though it were the client of the SUT and interacts through the "front door" interface.
Verifying SUT behaviour through the front door appears (at least on the surface) to be straightforward, but how do we verify that the interactions through the back door are correct?
The first question we must answer is "Why do we care?" Assuming that we do care, the next question is "How do we verify it?"
Calls to depended-on components often return objects or values, or even throw exceptions. Many of the execution paths within the SUT exist to deal with these different return values and to handle the various possible exceptions. Leaving these paths untested is an example of Untested Code. These paths can be the hardest to test effectively, but they are also among the most likely to lead to failures. In the following example, how can we test that an exception thrown by the timeProvider is handled correctly?
Modified from Ex7 solution of Testing For Developers (Java). Need to replace with code insertion from source:
public String getCurrentTimeAsString() {
    Calendar currentTime;
    try {
        currentTime = timeProvider.getTime();
    } catch (TimeProviderException e) {
        return e.getMessage();
    }
    return currentTime.toString();
}
We certainly would rather not have the exception-handling code executed for the first time in production. What if it were coded incorrectly? Clearly, it would be highly desirable to have automated tests for such code. The testing challenge is to somehow cause the timeProvider to throw a TimeProviderException so that the error path can be tested. A TimeProviderException is an example of an Indirect Input.
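One way to induce this indirect input is to hand a throwing stand-in to the SUT. The following is a minimal sketch, not code from the submission; the `TimeProvider` interface, `TimeDisplay` class, and constructor injection are all assumptions made for illustration:

```java
// Hypothetical interface assumed from the example above.
interface TimeProvider {
    java.util.Calendar getTime() throws TimeProviderException;
}

class TimeProviderException extends Exception {
    TimeProviderException(String message) { super(message); }
}

// A hand-coded Test Double that always throws, letting the test
// exercise the SUT's error-handling path on demand.
class ExceptionThrowingTimeProvider implements TimeProvider {
    public java.util.Calendar getTime() throws TimeProviderException {
        throw new TimeProviderException("clock unavailable");
    }
}

// Hypothetical SUT that holds the depended-on component.
class TimeDisplay {
    private final TimeProvider timeProvider;
    TimeDisplay(TimeProvider timeProvider) { this.timeProvider = timeProvider; }

    public String getCurrentTimeAsString() {
        java.util.Calendar currentTime;
        try {
            currentTime = timeProvider.getTime();
        } catch (TimeProviderException e) {
            return e.getMessage();
        }
        return currentTime.toString();
    }
}

public class ExceptionPathTest {
    public static void main(String[] args) {
        // Install the throwing double, then exercise the SUT.
        TimeDisplay sut = new TimeDisplay(new ExceptionThrowingTimeProvider());
        String result = sut.getCurrentTimeAsString();
        if (!result.equals("clock unavailable"))
            throw new AssertionError("expected exception message, got: " + result);
        System.out.println("error path verified: " + result);
    }
}
```

Because the double throws on every call, the catch block is guaranteed to run, and the test can assert on the result without depending on a real clock failing.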
Need a box describing the AT&T network outage caused by the backwards "if" statement in the overload handling code.
The concept of encapsulation often directs us to not care about how something is implemented. After all, that is the whole purpose of encapsulation--to alleviate the need for clients of our interface to care about our implementation. When testing, we are trying to verify the implementation precisely so our clients don't have to care about it.
But consider a component that has an API but returns nothing, or at least nothing that can be used to determine whether it has performed its function correctly. This is a situation in which we have no choice but to test through the back door. A canonical example of this is a message logging system. Calls to the API of a logger rarely return anything that indicates it did its job correctly. The only way to determine whether it is working as expected is to interact with it through its back door.
The text shifts to make the logger the SUT instead of the logger client--confusing.
In the case of the logger, it may be sufficient to check that it makes the necessary calls to the file system with the right values. Calls to components the logger depends on and the values passed to those calls are called indirect outputs. They are outputs coming from the SUT but not back to the test where they can be easily verified.
In other cases, the SUT does have visible behaviour that can be verified through the front door but also has some expected "side-effects". In the case of the logger's client, we want to make sure that it calls the logger at the appropriate times with the expected arguments. These calls are an important side-effect that certain stakeholders (the maintenance programmers) will depend on. Leaving these calls untested is an example of Untested Requirement.
Modified from Ex8 of Testing For Developers (C++). Need to replace with code insertion from source:
AirportDto* FlightManagementFacade::createAirport(const char* airportCode,
                                                  const char* airportName,
                                                  const char* nearbyCity) {
    Airport* airport = NULL;
    try {
        airport = dataAccess->createAirport(airportCode, airportName, nearbyCity);
        char* buf = new char[255];
        itoa(airport->getId(), buf, 10);
        logMessage("CreateFlight", buf); // BUG? Wrong action code -- should this be "CreateAirport"?
        return new AirportDto(airport);
    } catch (FlightBookingException e) {
        throw e;
    }
}
If we plan to depend on the information captured by logMessage when maintaining the application in production, how can we ensure that it is correct? Clearly, it is desirable to have automated tests for this functionality.
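One way to verify this indirect output is to substitute a recording double for the logger and inspect what it captured. The following re-sketches the C++ facade above in Java; the `MessageLogger` interface, `AirportFacade` class, and the hard-wired id are illustrative assumptions, not the submission's actual code:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical logger interface assumed for illustration.
interface MessageLogger {
    void logMessage(String actionCode, String detail);
}

// A recording Test Double: it captures each call so the test can
// inspect the indirect outputs after exercising the SUT.
class RecordingLogger implements MessageLogger {
    final List<String> calls = new ArrayList<>();
    public void logMessage(String actionCode, String detail) {
        calls.add(actionCode + ":" + detail);
    }
}

// Simplified stand-in for the facade; the data-access call is elided.
class AirportFacade {
    private final MessageLogger logger;
    AirportFacade(MessageLogger logger) { this.logger = logger; }

    public int createAirport(String code, String name, String city) {
        int airportId = 42;  // stand-in for the real data-access call
        logger.logMessage("CreateAirport", Integer.toString(airportId));
        return airportId;
    }
}

public class IndirectOutputTest {
    public static void main(String[] args) {
        RecordingLogger logger = new RecordingLogger();
        AirportFacade sut = new AirportFacade(logger);
        sut.createAirport("YYC", "Calgary Intl", "Calgary");
        // The test asserts on the captured indirect output, including
        // the action code -- which would expose the bug flagged above.
        if (!logger.calls.equals(List.of("CreateAirport:42")))
            throw new AssertionError("unexpected log calls: " + logger.calls);
        System.out.println("indirect output verified: " + logger.calls);
    }
}
```

A test like this would fail against the C++ code above, because the recorded action code would be "CreateFlight" rather than "CreateAirport".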
Testing indirect inputs is a bit simpler than testing indirect outputs, because the techniques for outputs build on the techniques for inputs. So let's delve into indirect inputs first.
To test the SUT with indirect inputs, we must be able to control the depended-on component well enough to cause it to return every possible kind of return value.
Examples of the kinds of indirect inputs we want to be able to induce include valid return values, invalid return values, and thrown exceptions.
In many cases, the test can interact with the depended-on component to set up how it will respond to requests. For example, if a component provides data access, then it is possible to use Back Door Fixture Setup to insert specific values into a database to cause the component to respond in the desired ways (no items found, one item found, many items found, etc.). In this specific case it is possible to obtain reasonable control of the depended-on component; in many other cases, however, it is not practical or even possible. Reasons why we might not be able to use the real component include that it is not available yet, that it is too slow, or that it is not available in the test environment.
So if the real component can't be used, we have to replace it with one we can control. This replacement can be done in a number of different ways, which is the topic of discussion in Using Test Doubles.
In normal usage, as the SUT is exercised, it interacts naturally with the component(s) upon which it depends. To test the indirect outputs, we must be able to observe the calls that the SUT makes to the API of the depended-on component. And if we need the test to progress beyond that point, we also need to be able to control the values returned (as was discussed for indirect inputs).
In many cases, the test can interact with the depended-on component to find out how it has been used. Examples include:
But in many cases, and as we've seen with indirect inputs, it is not practical to use the real component to verify the indirect outputs.
When all else fails, we may need to replace the real component with a test-specific alternative. Reasons why we might need to do this include those already described for indirect inputs, as well as the fact that many real components provide no way to observe the calls made to them.
The replacement of the real component can be done in a number of different ways, which will be covered in Using Test Doubles.
There are two basic styles of indirect output verification: the test can inspect the calls recorded by the double after the SUT has been exercised, or the double itself can verify each call as it receives it.
By now you are probably wondering about how to replace those inflexible and uncooperative real components with something that makes it easier to control indirect inputs and to verify indirect outputs.
As we've seen, to test the indirect inputs, we must be able to control the depended-on component well enough to cause it to return every possible kind of return value (valid, invalid, and exception). To test indirect outputs, we need to be able to track the calls the SUT makes to other components.
A Test Double is a type of object that is much more co-operative and lets us write tests the way we want to.
We came up with the name Test Double as the generic name for Dummies, Stubs, and Mock Objects. Feedback requested!
A Test Double is any object or component that we install in place of the real component specifically so that we can run a test. Depending on our reason for using it, it can behave in one of three basic ways: as a Dummy, as a Stub, or as a Mock Object.
A Dummy Object is an object that replaces the functionality of the real depended-on component in a test for reasons other than verification of indirect inputs and outputs. Typically, it will implement the same functionality as the real depended-on component, or a subset of it, but in a much simpler way. It may be installed by, but is typically not programmed by, the test. The most common reason for using one is that the real depended-on component is not available yet, is too slow, or is not available in the test environment. If the functionality is required to carry out the test, a Dummy Object is a good candidate. ?In-Memory Database Emulation? describes how we dummied out the entire database with hash tables and made our tests run 50 times faster.
A ?Hard-coded Test Double? has all its behavior hard-coded. That is, it would return a hard-coded value when a certain function is called. This is the simplest form of test double.
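To make this concrete, here is a minimal sketch of a Hard-coded Test Double; the `ExchangeRateProvider` interface and the canned value are illustrative assumptions, not from the submission:

```java
// Hypothetical interface for a depended-on component.
interface ExchangeRateProvider {
    double rateFor(String currency);
}

// Hard-coded Test Double: the canned answer is fixed when the class
// is written, so this double serves exactly the test scenarios that
// need a rate of 1.25 -- and nothing else.
class HardCodedRateProvider implements ExchangeRateProvider {
    public double rateFor(String currency) {
        return 1.25;
    }
}
```

A test simply installs a `HardCodedRateProvider` in place of the real provider; any SUT logic that consumes the rate can then be exercised against the known value.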
On the other hand, a Programmable Test Double provides either a Programming Interface or a Programming Mode that the test uses to program the Test Double with the values to use. This makes the Test Double reusable across many tests. It also makes the test more understandable by making the values used by the Test Double visible within the test, thus preventing a ?Mystery Guest?.
Programmable Test Doubles come in two basic flavors: Passive Mock Objects and Active Mock Objects.
Like a Passive Mock Object, an Active Mock Object is often programmed with any ?indirect inputs? required to allow the SUT to advance to the point where it would generate the ?indirect outputs?. They make it possible to reuse the logic used to verify the ?indirect outputs?.
Some test doubles need to be programmed as part of setting up the test while others do not. In particular, Passive Mock Objects and Active Mock Objects almost always need programming. By their very nature, Dummy Objects do not need programming because they are not used to verify behaviour, and Hard-Coded Test Doubles do not because their behaviour is fixed at the time they are coded. A Passive Mock Object only needs to be programmed with the values to be returned by the methods we expect the SUT to invoke. An Active Mock Object needs to be programmed with the expected names and arguments of all the methods we expect the SUT to invoke on it.
So where should all this programming be done? In FixtureSetup we discuss several alternatives including Inline Setup, Implicit Setup and Delegated Setup. The installation of the test double should be treated just like any other part of fixture setup. We encourage you to choose the fixture setup pattern that leads to the most understandable tests!
Before we exercise the SUT, we need to install any Test Doubles on which our test depends. The normal sequence is to instantiate the double, program it if necessary and then install it into the SUT. We can install the test-specific replacement for the real DOC in any one of the following ways:
This next section probably warrants being turned into a separate chapter.
The test automation business is an equal opportunity employer and is willing to take on all sorts of objects as Test Doubles. Here's a short list of the kinds of things that can play the role:
The Test Double implementation can be coded a number of different ways. In order of decreasing effort and increasing complexity, the common ways are:
When building ?Hand-coded Test Double? objects, we have a number of code reuse mechanisms available to reduce the effort of creating the mock class. Some ?Static Test Double Generation? tools also use these techniques.
The mock class may inherit useful utility functions for storing expected inputs or values to return from an ?Abstract Test Double?. This option is not available if you are creating a ?Subclassed Test Double? in a single-inheritance language like Java or Smalltalk.
The mock class may reuse the utility functions by delegating them to a ?Test Double Helper Class?. This is especially common in languages that don't have inheritance (such as VB 6 or earlier) or when you are creating a ?Subclassed Test Double? in a single-inheritance language like Java.
Through support for mixins, languages like Ruby enable the definition of a variant of ?Test Double Helper Class? called a ?Test Double Helper Mixin?. Mixins allow utility functions to be added from a number of ?Test Double Helper Classes? to a ?Hand-coded Test Double? so that they may be referenced as if they were methods of the mock itself, greatly simplifying test code.
The mock class may hold the expected results and values to return in instances of ?Expectation Classes?. These are special-purpose collection classes that hold expected method calls (with all their arguments) and later compare them with the actual calls using assertions. (Cite MockObjects.com.)
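An Expectation Class can be sketched along these lines; the class and method names are assumptions for illustration, not the MockObjects.com API:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of an Expectation Class: a special-purpose collection that
// stores expected method calls and later compares them with the
// actual calls using an assertion.
class MethodCallExpectations {
    private final List<String> expected = new ArrayList<>();
    private final List<String> actual = new ArrayList<>();

    void addExpected(String methodName, Object... args) {
        expected.add(format(methodName, args));
    }
    void addActual(String methodName, Object... args) {
        actual.add(format(methodName, args));
    }

    // Canonical string form so calls compare by name plus arguments.
    private String format(String methodName, Object[] args) {
        return methodName + java.util.Arrays.toString(args);
    }

    void verify() {
        if (!expected.equals(actual))
            throw new AssertionError("expected " + expected + " but was " + actual);
    }
}
```

A mock's methods simply call `addActual(...)`, the test programs the expectations via `addExpected(...)`, and `verify()` runs at the end of the test; several mock classes can share one expectation object.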
The mock component class may be either hard-coded or "programmable". When hand-coding a mock object for a specific test, the ?STTCPW? is to build a Hard-Coded Test Double with the specific values to be returned hard-coded. Indirect outputs are verified through assertions with hard-coded expected arguments inside each of the methods that could be called. The main disadvantage of a Hard-Coded Test Double is that it hides the expected results of the test in the mock object class, making the test harder to understand. This is less of a problem when it is combined with an ?Anonymous Inner Test Double? when programming in Java, since the Test Double is then inside the test.
Generalizing the Hard-Coded Test Double to reuse it in other tests typically leads to creating a Programmable Test Double. After instantiating the mock object, the test goes through a "mock programming" phase (much like writing a script of the dialog expected to take place between the SUT and the mock) in which it tells the mock how to behave. This also makes the test easier to understand since the behaviour of the mock is visible inside the test.
It goes without saying that the mock object needs to be able to stand in for the real depended-on component. But other than that, is there any other relationship between them? Depending on how the real depended-on component is defined and the features of the programming language, the answer may vary greatly.
MacKinnon et al introduced the concept of "Endoscopic Testing" in their initial Mock Objects paper. Endo-testing involves testing the SUT from the inside by passing in an Active Mock Object as an argument to the method under test. This allows verification of certain internal behaviours of the SUT that may not be at all visible from the outside.
The classic example they describe is the use of a mock collection class pre-loaded with all the expected members of the collection. When the SUT tries to add an unexpected member, the mock collection's assertion fails. This allows the full stack trace of the internal call stack to be visible in the JUnit failure report. If your IDE supports breaking on specified exceptions, you can also inspect the local variables at the point of failure.
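The mock collection idea can be sketched as follows; the class name and String-based membership are illustrative assumptions rather than the code from the Mock Objects paper:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of a mock collection pre-loaded with the expected members.
// add() asserts immediately, so an unexpected insertion fails deep
// inside the SUT, with the full internal call stack visible in the
// failure report.
class MockMemberSet {
    private final Set<String> allowed = new HashSet<>();
    private final Set<String> added = new HashSet<>();

    MockMemberSet(String... expectedMembers) {
        for (String m : expectedMembers) allowed.add(m);
    }

    void add(String member) {
        if (!allowed.contains(member))
            throw new AssertionError("unexpected member added: " + member);
        added.add(member);
    }
}
```

The test constructs the mock with the full expected membership, passes it into the method under test, and any attempt by the SUT to add something else fails at the exact point of the erroneous call.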
Another use of Test Doubles is to reduce the runtime cost of Clean Slate test fixture setup. When the SUT needs to interact with other objects that are difficult to create because they have many dependencies, a single Test Double can be created instead of the complex network of objects. When applied to networks of entity objects, this technique is called Entity Chain Snipping.
Another use of Test Doubles is to improve the speed of tests by replacing slow components with faster ones ?Stub Out Slow Component?. Replacing a relational database with an in-memory Dummy Object can reduce test execution times by an order of magnitude! The extra effort of coding the dummy database is more than offset by the reduced waiting time and the quality improvement due to the more timely feedback that comes from running the tests more frequently.
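The hash-table approach can be sketched minimally as follows; the `CustomerStore` interface and its methods are assumptions made for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical data-access interface normally backed by a database.
interface CustomerStore {
    void save(String id, String name);
    String findName(String id);  // returns null when no row matches
}

// Dummy database: the same lookup contract implemented with a hash
// table -- no connections, no I/O, so tests run dramatically faster.
class InMemoryCustomerStore implements CustomerStore {
    private final Map<String, String> rows = new HashMap<>();
    public void save(String id, String name) { rows.put(id, name); }
    public String findName(String id) { return rows.get(id); }
}
```

Tests populate the in-memory store directly during fixture setup, and the SUT neither knows nor cares that no real database is present.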
Cite or include material from XP2001 paper (and XP Perspectives chapter) on "Improving the Efficiency of Automated Tests".
When the SUT delegates to several depended-on components, we may want to test that the methods on the depended-on components are called in the right order, not just within a single depended-on component, but also that they are interleaved properly.
To do this, the test should create an instance of a ?Test Double Helper Class? and pass it to each Active Mock Object as it is instantiated. Their methods should be implemented by delegating to the ?Shared Test Double Helper?. Since the assertions are done by the ?Shared Test Double Helper?, which sees all the calls made to all the mocked depended-on components, this ensures that the calls are interleaved properly. (Note: if there is any chance that several depended-on components will have methods with the same signature, the mock helper should also be passed the name of the mock component that is delegating to it.)
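The shared-helper recording can be sketched like this; the class and method names are illustrative assumptions:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a shared helper: every mock delegates its recording to
// this one object, which therefore sees the calls to all the mocked
// depended-on components as a single interleaved sequence.
class SharedCallRecorder {
    private final List<String> calls = new ArrayList<>();

    // The component name disambiguates identical signatures on
    // different mocks, as the note above recommends.
    void record(String componentName, String methodName) {
        calls.add(componentName + "." + methodName);
    }

    void verifyOrder(String... expectedCalls) {
        List<String> expected = java.util.Arrays.asList(expectedCalls);
        if (!calls.equals(expected))
            throw new AssertionError("expected " + expected + " but was " + calls);
    }
}
```

Each mock's methods call `record("itsName", "methodName")` before doing anything else; at the end of the test, a single `verifyOrder(...)` assertion checks the interleaving across all the depended-on components at once.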
Since most of our tests will involve replacing the real depended-on component with a mock object, how do we know that it works properly when the real depended-on component is used? Of course, we would expect our functional tests to verify behaviour with the real depended-on components in place (except, of course, when the real depended-on components are interfaces to other systems that need to be stubbed out during single-system testing). We should have a unit test, a ?Stubbable Initialization Test?, to verify that the real depended-on component is installed properly. The trigger for writing this test is the first test that replaces the depended-on component with a mock, since that is often when the mockable depended-on component mechanism is first built.
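Such a test can be as small as the following sketch; the `Facade`, `Logger`, and `RealLogger` names and the setter-based substitution point are assumptions for illustration:

```java
// Hypothetical depended-on component and its real implementation.
interface Logger { void log(String msg); }
class RealLogger implements Logger {
    public void log(String msg) { /* would write to the file system */ }
}

// Hypothetical SUT with a substitution point for Test Doubles.
class Facade {
    Logger logger = new RealLogger();       // the default wiring under test
    void setLogger(Logger l) { this.logger = l; }  // where mocks are installed
}

public class InitializationTest {
    public static void main(String[] args) {
        // Verify that, when nothing is substituted, the real
        // depended-on component is installed by default.
        Facade sut = new Facade();
        if (!(sut.logger instanceof RealLogger))
            throw new AssertionError("real logger not installed by default");
        System.out.println("default wiring verified");
    }
}
```

This one test protects all the mock-based tests from the scenario where the substitution mechanism accidentally leaves a Test Double, or nothing at all, wired in by default.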
Finally, we want to be careful that we don't fall into the "new hammer trap". ("When you have a new hammer, everything looks like a nail"). Overuse of mock objects or stubs can lead to ?Overspecified Software?.
Copyright © 2003 Gerard Meszaros & Shaun Smith, all rights reserved