Building a Comprehensive Functional Testing Strategy

Software testing is important. Unit tests are important. However, there are so many different layers and scopes of software testing in an enterprise that many companies miss great opportunities not only to make their testing efforts more comprehensive, but also to better leverage the work already done. Enterprises mostly struggle with inconsistencies between teams in what constitutes “done” in terms of testing, which then requires additional layers of management to reconcile what constitutes readiness for deployment.

Let’s Start Local

We are going to be building and deploying a “simple” REST API. The API is going to do a few things. It’s going to expose an OpenAPI-compliant REST interface so clients can interact with the associated data. It’s going to persist data into a relational database (PostgreSQL). It’s also going to produce change events in a canonical data structure to a Kafka topic using Debezium. Here’s a basic look at the individual components and the flow of data.

Example data flow for an HTTP Post to the “simple” REST API

We need to start by defining some vocabulary, beginning with the types of tests that might be performed. The easiest one to start with is the unit test. The unit test is traditionally defined by its scope: each test validates a single unit of application code. Some define the scope of a unit test as always being confined to a single method in a single class. Others apply the term more liberally and define a unit test as any test that doesn’t hit the network (mocked databases, mocked external API calls, etc.) but that might exercise multiple classes and their functionality together. Some developers call these multi-class tests functional tests rather than unit tests.

Is this beginning to bore you? Good. The truth is the names don’t matter. When it comes to testing, it’s all about the process and what capabilities the developer has in their local environment. It’s about defining the transitional points in the development process where the tests are executed and for each of these points, the testing scope to be performed. The first transitional point in the development process is when the developer is “dev complete” and they are ready to validate their local branch in their local development environment in order to commit and push the branch.

Developers’ local environments vary widely in their capabilities depending on the company and the project. Some developers have the latest MacBook Pros with 32 GB of memory and the ability to run containers locally. These developers can work on a project where they can mimic a large portion of the environment right on their laptop. Others have a Windows box from 6 years ago with 4 GB of RAM and an enterprise security team that blocks anyone from running a container on their laptop. The difference in the scope of local testing between these two developer environments is dramatic.

However, hardware is only a part of the equation. The application’s characteristics have a lot to do with the process as well. Much like our example, some developers have the luxury of working on applications designed as microservices using some of the latest and greatest technology available. But others may be working on a legacy monolith requiring a huge application server which connects to a monster legacy database with hundreds of tables and no good way to run a local copy of it. Again, these variations are going to dictate much of the process and strategy around testing.

Now Let’s Think Global

Application testing rarely stops at the application itself. Our example application has a fairly clearly defined local scope of testing. We will want to test all the API endpoints and ensure that they all produce the correct canonical events in the Kafka topic. However, the above graphic is a gross oversimplification. The real-world runtime environment is much more complicated.

Example deployment for our “simple” REST API

The testing scope for this application just grew quite a bit. Of course, we can hope our organization’s platform engineering team is making all of these supporting service integrations seamless for the development team. However, these big complicated deployment diagrams can sometimes get us to lose focus on testing goals. To get laser-focused, we need to go back to simplifying what we are trying to accomplish. We are trying to push changes to production. We want validation that the changes fix what we wanted to fix, add the functionality we wanted to add, and don’t break anything else along the way.

Project Setup and Structures

There are going to be two major scopes of functional testing: local testing and acceptance testing. Our local testing is going to handle all the low-level details related to the application. It will test the individual classes and the functional parts, and might even include a version of the API and integration tests if we are one of those lucky developers with the right hardware and applications. The local tests should be thought of as a scope of testing to be performed not only in our local development environment, but also as part of the CICD pipeline. Local testing will be a part of the core project itself. To use a Maven example, these will be all the tests associated with running “mvn test” and “mvn verify”, where the verify phase also runs any integration tests configured for the build (commonly through the Failsafe plugin). When a developer thinks they are ready to push their changes and create a pull request, the last step should be to run the local tests in their entirety and validate the results.

A Few Words About Local Test Organization

If you are building simple microservices with limited functionality, the run time of the tests is rarely an issue. But if you are working on a complex application, the test runs may become a burden on productivity, because the time it takes to complete the suite reduces the developer’s desire to run it often. If this becomes an issue, you can always break the test suite into fast and slow tests. Examples of fast tests are unit tests with everything mocked. A slow test might be an integration test that spins up 5 test containers as part of the initialization process. If the team starts to see local test failures in the CICD pipeline, it’s usually a result of a developer’s reluctance to execute the tests locally.
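One way to picture the fast/slow split is sketched below in Python with the standard unittest module. A Maven project would more likely rely on its own test framework's tagging or on separate unit and integration test phases; the RUN_SLOW_TESTS variable and the test names here are assumptions made purely for illustration.

```python
import functools
import os
import unittest


def slow(test_method):
    """Mark a test as slow: it is skipped unless RUN_SLOW_TESTS=1 is set."""
    @functools.wraps(test_method)
    def wrapper(self, *args, **kwargs):
        # Checked at run time, so the same code serves both kinds of run.
        if os.environ.get("RUN_SLOW_TESTS") != "1":
            self.skipTest("slow test; set RUN_SLOW_TESTS=1 to run it")
        return test_method(self, *args, **kwargs)
    return wrapper


class ItemServiceTests(unittest.TestCase):
    def test_price_rounding(self):
        # Fast: pure logic, nothing external involved.
        self.assertEqual(round(19.999, 2), 20.0)

    @slow
    def test_round_trip_through_database(self):
        # Slow: stands in for a test that would start containers first.
        self.assertTrue(True)


def run_suite():
    """Run the suite and return (tests_run, tests_skipped)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ItemServiceTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.testsRun, len(result.skipped)
```

With a switch like this, a developer can run only the fast tests while iterating and the full suite before pushing, while the pipeline always sets the variable and runs everything.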

Code coverage is also a term and metric that draws the ire of developers. Many times development managers and technical leaders will overemphasize the metric as proof that the tests adequately validate the application. While the overemphasis is a mistake, the metric itself can be helpful when it’s used in the correct context. When building a REST API, a high code coverage metric across all the service and repository layer classes may not be valuable. But a very high metric (100% in most cases) across the classes representing the API endpoints would be very valuable to obtain and maintain, as long as the tests themselves are valuable to the validation process.
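As a rough sketch of using the metric in context, the check below gates coverage only on the endpoint package and leaves the other layers ungated. The package names, the numbers, and the idea of feeding it a per-package summary are all hypothetical; a real build would read these values from whatever report its coverage tool produces.

```python
def coverage_failures(coverage_by_package, minimums):
    """Return messages for packages whose line coverage is under its minimum.

    coverage_by_package: package name -> covered fraction (0.0 to 1.0)
    minimums: package-name prefix -> required fraction; packages that match
    no prefix are not gated at all.
    """
    failures = []
    for package, covered in sorted(coverage_by_package.items()):
        for prefix, required in minimums.items():
            if package.startswith(prefix) and covered < required:
                failures.append(
                    f"{package}: {covered:.0%} is below the required {required:.0%}"
                )
    return failures


# Hypothetical package names and numbers, for illustration only.
REPORT = {
    "com.example.items.api": 0.96,
    "com.example.items.service": 0.71,
    "com.example.items.repository": 0.64,
}
MINIMUMS = {"com.example.items.api": 1.0}  # gate only the endpoint classes
```

Here only the endpoint package is flagged, even though the service and repository layers report lower numbers, which matches the idea that the threshold belongs where it tells us something.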

Building an Acceptance Test Project

The traditional unit and integration test frameworks are fantastic for application validation but they don’t work very well outside the traditional build tool’s lifecycle. As we move away from local testing and on to acceptance testing, we will want to test from the outside as if we are a real API client using the application in real-life scenarios with all the infrastructure associated.

For acceptance tests, we want to focus on the external effects of the application. This is more akin to black box testing, where we ignore the internal implementation details. For example, one test may be focused on the API’s HTTP POST functionality, which will create a new item for us. The test will want to act exactly like a REST client application would. The test will get an auth token from the appropriate SSO instance and manage its refresh lifecycle. It will generate test data, which should be consistent with the test data associated with the target environment, and will submit the request to a URL representing the API Gateway, not the application itself. Again, these tests are meant to act exactly as we would expect a production client to act. We are also going to want to test the effects of the request. Continuing the POST example, we will also want to use the application’s GET functionality to retrieve the data we just generated and validate it against expected values.
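The POST-then-GET flow described above might look something like the following sketch. Everything concrete in it is an assumption: the /items path, the name and quantity fields, the 201 status, the token response shape, and the 30-second refresh margin. The HTTP transport is passed in as a callable so the same test logic can run against a real gateway or a fake.

```python
import time
import uuid


class GatewayClient:
    """Acts like an external API client: authenticates, then calls the gateway.

    `send` is any callable (method, url, headers, body) -> (status, body_dict);
    a real run would back it with an HTTP library, a dry run with a fake.
    """

    def __init__(self, send, gateway_url, token_url, clock=time.monotonic):
        self._send = send
        self._gateway_url = gateway_url.rstrip("/")
        self._token_url = token_url
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def _bearer(self):
        # Refresh shortly before expiry so no request carries a stale token.
        if self._token is None or self._clock() >= self._expires_at - 30:
            status, body = self._send(
                "POST", self._token_url, {}, {"grant_type": "client_credentials"}
            )
            if status != 200:
                raise RuntimeError(f"token request failed with HTTP {status}")
            self._token = body["access_token"]
            self._expires_at = self._clock() + body["expires_in"]
        return {"Authorization": f"Bearer {self._token}"}

    def request(self, method, path, body=None):
        return self._send(method, self._gateway_url + path, self._bearer(), body)


def check_create_then_read(client):
    """POST a new item, then GET it back and compare against what was sent."""
    item = {"name": f"acceptance-{uuid.uuid4().hex[:8]}", "quantity": 3}
    status, created = client.request("POST", "/items", item)
    assert status == 201, f"expected 201 Created, got {status}"
    status, fetched = client.request("GET", f"/items/{created['id']}")
    assert status == 200, f"expected 200 OK, got {status}"
    assert fetched["name"] == item["name"]
    assert fetched["quantity"] == item["quantity"]
    return created["id"]
```

Note that the test only knows the gateway URL and the token URL. It never addresses the application directly, which is the point of testing from the outside.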

As part of the process, we will also want to validate the creation of the canonical event on the Kafka topic. This is where it may get complicated because a real-world production API client might not have access to the topic, especially if we are attempting to mimic an external API client from a customer. In these cases, we might be able to create a reasonable facsimile for validation, such as a separate administrator-level API that requires heightened credentials and exposes some access point for validating the event. Perhaps the API endpoint just provides confirmation that an event was created and possibly a key value to check against. The key here is to cover as much as possible within reason. If the local tests, which validate the creation of the event, are robust enough, then the API calls themselves may be enough.
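Because the change event reaches the topic some time after the API call returns, an event check usually has to poll rather than assert immediately. The helper below is a minimal sketch of that idea. How fetch_event actually looks the event up (an administrator-level endpoint, a topic consumer, or something else) is deliberately left open, and the default timeout and interval are arbitrary assumptions.

```python
import time


def wait_for_event(fetch_event, key, timeout_s=10.0, interval_s=0.5,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll until an event for `key` is visible, or fail after `timeout_s`.

    `fetch_event(key)` returns the event as a dict, or None while the event
    has not arrived yet. How it looks the event up is left to the caller.
    """
    deadline = clock() + timeout_s
    while True:
        event = fetch_event(key)
        if event is not None:
            return event
        if clock() >= deadline:
            raise TimeoutError(f"no event for key {key!r} within {timeout_s}s")
        sleep(interval_s)
```

An acceptance test would call this with the key of the item it just created and then compare the returned event against the expected canonical structure.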

The acceptance test doesn’t care about the internals. It just cares that it works.

There is another subset of acceptance tests available to the organization which can be very valuable: production tests. These are tests that generally consist of read-only activities and can be executed against the production environment. The acceptance test suite for the application should be aware of what environment it’s running against and run only the non-mutating tests if the environment is production. These tests are valuable for validating newly released versions of an application, and they can also be executed at different times throughout the day to keep an eye on things in periods of low traffic. Obviously, if you are very cost-conscious, then continuously running production tests throughout the day may negate gains from serverless components.
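A small sketch of an environment-aware suite follows. Each test declares whether it mutates data, and the selection step drops the mutating ones when the target is production. The decorator, the test names, and the environment labels are hypothetical.

```python
ACCEPTANCE_TESTS = []  # entries are (name, mutates_data, function)


def acceptance_test(mutates):
    """Register a test and record whether it writes to the target system."""
    def register(func):
        ACCEPTANCE_TESTS.append((func.__name__, mutates, func))
        return func
    return register


@acceptance_test(mutates=False)
def list_items_returns_ok():
    ...  # read-only check, safe to run anywhere


@acceptance_test(mutates=True)
def create_item_then_read_it_back():
    ...  # writes data, so it must never run against production


def select_tests(environment):
    """Return the names of the tests allowed to run against `environment`."""
    read_only = environment.strip().lower() in {"prod", "production"}
    return [name for name, mutates, _ in ACCEPTANCE_TESTS
            if not (read_only and mutates)]
```

A stricter variant could treat any unrecognized environment name as read-only, so that a typo in the configuration cannot accidentally enable writes.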

Don’t be afraid to go low level on these acceptance test projects as well. Too many times I see organizations get enthralled with buying expensive API testing products in the hope that their non-technical QA staff will be able to create and maintain the tests. What ends up happening is that the organization forgets that these non-technical people don’t have the same priorities around code reuse or parameterization of functionality. They also might not understand the use of source control or how to keep the process stable. You can literally build the acceptance tests with the same traditional testing frameworks developers use for their local tests, put all the tests in the main source path rather than the test source path, and execute them from the command line. This will also be a great opportunity for those non-technical testing folks to get started in development while doing their day job.

Extra bonus points to those who can make their acceptance tests executable in the developer’s local environment as well. Then the pre-push developer branch could theoretically be validated even prior to the CICD process.

The CICD Pipeline

While there are plenty of companies out there doing actual continuous deployment, a large majority of us live in a world where this is downright impossible due to organizational limitations (mostly non-technical). However, we can still do some pretty good validation just by deploying to a development environment. The CICD pipeline is going to validate the application itself, but it can also leverage the deployment automation to deploy a new version of the application and, in most cases, execute the full acceptance testing suite.

An example CICD pipeline executing both local and acceptance tests

The cool thing about this flow is that we are not only testing the application, but we are testing that all the automations around deployment work as expected. Remember, the acceptance test is not that the application works. The acceptance test is that the application works within the context of the overall enterprise environment. When you combine the acceptance tests with the CICD pipeline and the existing deployment automations, the process becomes pretty bulletproof in identifying issues prior to production deployment.
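The ordering of the flow can be sketched in a few lines. This is only an illustration of the sequencing, not a real pipeline definition: the stage names are assumptions, and in practice each action would invoke the build tool, the deployment automation, or the acceptance test project.

```python
def run_pipeline(stages):
    """Run (name, action) stages in order and stop at the first failure.

    Each action is a zero-argument callable that returns True on success.
    Returns the list of (name, passed) results for the stages that ran.
    """
    results = []
    for name, action in stages:
        passed = bool(action())
        results.append((name, passed))
        if not passed:
            break  # later stages never run on top of a failed one
    return results


# Placeholder actions; each one stands in for a real command.
EXAMPLE_STAGES = [
    ("local tests", lambda: True),
    ("deploy to development", lambda: True),
    ("acceptance tests", lambda: True),
]
```

Because the acceptance stage only runs after a successful deployment, a failure there points either at the application or at the deployment automation, and both are worth knowing about before production.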

Summary

Testing is a very important part of building good software. Your testing strategy should not be myopic in its focus. It should take into account not only the application’s intended functionality but also the entire environment.