July 5, 2023 · 8 min read

Testing Environment Concepts and Best Practices

Learn how to design and configure a testing environment to decrease development time and improve quality.

A testing environment consists of service and infrastructure components that aim to simulate the behavior of components operating in a production environment. When properly designed and configured, components in a testing environment can enable a development team to leverage a wide variety of pre-deployment validation processes, with the ultimate goal of decreasing development time while improving quality.

This article focuses on the various design decisions for a system under test and on the goals of a testing environment. These considerations are paramount to ensuring that the desired value of the testing environment is realized by the entire development team.

Key testing environment concepts

The summary below covers some of the many considerations, discussed throughout this article, to take into account when designing and leveraging a testing environment, including design and development principles that enable the greatest value to be realized from it.

The many roles of a testing environment: Testing environments are used in most testing and validation scenarios, with the exceptions of unit testing and testing in production. Functional tests may require front-end deployments and only a few back-end services, whereas performance or disaster recovery testing might require a much more accurate representation of the production environment as a whole.
Design considerations for the system under test: The design and implementation decisions for the system under test have significant effects on the capabilities and characteristics of a testing environment. This includes how quickly the test environment may be created, manipulated, and torn down and what sorts of development processes are enabled or improved by its existence.
Persistent test environments: A persistent environment is intended to exist for an extended duration and is frequently intended to be very close to the layout and configuration of the production environment. These environments frequently have high maintenance costs (as do production environments) and do not easily enable rapid deployment and validation of new features.
Ephemeral test environments: An ephemeral test environment is created to be short-lived until a particular need is met. These environments are most commonly used to solicit rapid feedback from key stakeholders or to perform validation of new features before their integration into a release or main branch. Maintenance costs are low because their duration is short, and deployment is usually rapid due to their limited scope.
General process and recommendations for setting up and managing a testing environment: Define clear goals for how the testing environment is intended to be used and the value that it will deliver. Apply design and implementation decisions to components of the system that ensure that components can be leveraged within the testing environment to achieve the desired goals. Regularly measure and assess the testing environment against the desired goals, address challenges and shortcomings, and apply learnings to new features.

Testing environment roles

When designing a testing environment, it’s important to first consider which types of testing a team is looking to enable within the environment. The various testing types leverage testing environments differently and impose varying requirements.

Unit testing

This type of testing does not require the use of a testing environment. All dependencies should be mocked or stubbed out, allowing for the validation of a specific block of code.
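As a minimal sketch (using Node's built-in test runner; the ExchangeRateClient and convert names are hypothetical), a unit test can stub out its one dependency and run entirely in-process, with no testing environment at all:

```typescript
// Hypothetical example: a unit test that stubs out an external dependency
// so no testing environment is required. Uses Node's built-in test runner.
import { test } from "node:test";
import assert from "node:assert/strict";

// The interface our code depends on; in production this would call a real service.
interface ExchangeRateClient {
  getRate(from: string, to: string): Promise<number>;
}

// The unit under test: converts an amount using whatever client it is given.
async function convert(amount: number, from: string, to: string, client: ExchangeRateClient): Promise<number> {
  const rate = await client.getRate(from, to);
  return amount * rate;
}

test("convert applies the exchange rate from the client", async () => {
  // The stub replaces the real service; the test never leaves the process.
  const stubClient: ExchangeRateClient = { getRate: async () => 1.25 };
  assert.equal(await convert(100, "USD", "EUR", stubClient), 125);
});
```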

Functional integration testing

This testing involves validation of the interaction among various components within a system. It varies widely in its scope and testing environment requirements.

A feature with a front-end component driven by data from back-end services will require the front end to be deployed within the test environment and the back-end services to be deployed to serve that data. Databases with seeded data may or may not be required, depending on the goal of the test and the ability of the back-end services to mock out these dependencies. Each of the required components within the testing environment will need the appropriate infrastructure and configuration to ensure that the various components are able to route data to each other.
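For illustration, a hedged sketch of such an integration check is below; the FRONTEND_URL and BACKEND_URL variables and the /api/products path are assumptions about how a provisioning tool might expose the environment, not part of any particular system:

```typescript
// Hypothetical integration check: verifies that the deployed front end is
// reachable and that a deployed back-end service inside the testing
// environment serves data. Requires Node 18+ for the global fetch API.
import { test } from "node:test";
import assert from "node:assert/strict";

const frontendUrl = process.env.FRONTEND_URL ?? "http://localhost:3000";
const backendUrl = process.env.BACKEND_URL ?? "http://localhost:8080";

test("front end is up and back end serves data", async () => {
  // The front end should at least render its entry page.
  const page = await fetch(frontendUrl);
  assert.equal(page.status, 200);

  // The back-end endpoint that drives the feature should return data;
  // the /api/products path is only an illustration.
  const data = await fetch(`${backendUrl}/api/products`);
  assert.equal(data.status, 200);
  assert.ok(Array.isArray(await data.json()));
});
```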

User acceptance testing

This type of testing can leverage testing environments by validating those components of a system that impact the result or experience that is to be delivered to the end user. Not all components need to be present or functional within a testing environment to meet the needs of user acceptance testing.

For example, validating a new sign-in user experience may not need those services or components responsible for new-user creation but will need the front end to be available (to showcase the new experience) and the back-end components responsible for providing data and responses to the front end. The testing environment will need to exist in a specified state for long enough to allow for the user acceptance to be completed, but no longer. Once the testing is completed, the need for the environment no longer exists.

Performance testing

A testing environment for performance testing will need to be representative enough of a true production environment for the gathered performance metrics to be an appropriate representation of those that the team can expect to observe in production. This environment will need all components of the system that impact the specific metrics being measured and assessed.

Testing in production

Production testing won’t involve the use of a testing environment. By definition, this type of testing takes place against code that is running and available in a production environment.

System under test design considerations

There are several design considerations and implementation decisions for the system under test that will have a significant impact on the amount of value that may be gained from a testing environment.

Containerized services

All services should be containerized wherever it is possible and appropriate for the project. Containerized services can be deployed far more rapidly than traditional, non-containerized services, and the benefits frequently outweigh the associated development costs, particularly as a team becomes proficient in creating and using containers over time.

If a specific service is needed within a testing environment and that service is not containerized, the deployment of that service requires careful orchestration of the installation and configuration of not only the target service itself but also its dependencies. If that same service is containerized, its deployment requires instead only the tooling to automatically orchestrate the service and its required dependencies.

Containerized services also make ephemeral environments far more practical than they would be otherwise. The use of ephemeral environments maximizes the benefits of developing within feature branches and increases the overall quality of integration branches.
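As a rough illustration of why containerization speeds things up, an ephemeral environment can start a service with a single Docker CLI invocation. The image name, ports, environment variables, and naming scheme below are placeholders; in practice this would usually be driven by a CI pipeline or an orchestration tool:

```typescript
// Sketch: standing up one containerized service for an ephemeral environment
// by shelling out to the Docker CLI. All names and ports are illustrative.
import { execFileSync } from "node:child_process";

function startService(envName: string, image: string, hostPort: number, containerPort: number): void {
  execFileSync("docker", [
    "run",
    "--rm",                                    // clean up the container when it stops
    "-d",                                      // run in the background
    "--name", `${envName}-users-service`,      // unique name per environment
    "-p", `${hostPort}:${containerPort}`,      // expose the service on a host port
    "-e", `ENVIRONMENT=${envName}`,            // pass environment-specific config
    image,
  ]);
}

// One command per service: the same image used in production, started in seconds.
startService("pr-1234", "registry.example.com/users-service:latest", 8081, 8080);
```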

Configurable front-ends

All front-end services should be as configurable as possible so that they can be deployed and leveraged in as many different testing environment configurations as possible.

Back-end service URLs should all be driven by configuration settings, which will allow for the front-end deployment to pull data from a persistent or ephemeral testing environment.

Specific features or operating modes should be driven by local configuration settings if at all possible. A front end with this level of configuration will allow for more rapid validation of in-progress or under-test features than if the exposure of specific features or operation modes needs to be driven by some external data source (a user configuration or settings service, for example).
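A hedged sketch of what such configuration might look like is below; the API_BASE_URL and FEATURE_NEW_SIGN_IN settings are illustrative names, not a prescribed convention:

```typescript
// Sketch of a front-end configuration module: back-end URLs and feature
// flags are read from build-time or runtime settings rather than hard-coded,
// so the same bundle can point at a persistent or an ephemeral environment.
export interface AppConfig {
  apiBaseUrl: string;
  featureFlags: {
    newSignInFlow: boolean;
  };
}

export function loadConfig(env: Record<string, string | undefined> = process.env): AppConfig {
  return {
    // Defaults point at a local or ephemeral deployment; other environments override them.
    apiBaseUrl: env.API_BASE_URL ?? "http://localhost:8080",
    featureFlags: {
      // A local flag, so in-progress features can be exercised without an
      // external settings service.
      newSignInFlow: env.FEATURE_NEW_SIGN_IN === "true",
    },
  };
}
```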

Ability to mock all external services

There are frequently multiple external or third-party services that a system leverages in a production environment. In testing environments, many of the external services that a system depends on may not be necessary for the goal of the environment. It’s also possible that the external service itself will not provide for easy (or cost-effective) usage within a testing environment.

Mocking external services so that their responses can be simulated or bypassed completely will allow for the system under test within a testing environment to be deployed without having to worry about its impact on or interaction with those external services.

Common examples of services of this type are monitoring, logging, and external user authentication services. A feature team rarely needs monitoring or alerting for a testing environment, so interactions with those services should be configurable or mockable. Logging may still be desirable, but it should be configurable so that test logs are not colocated with production logs.

The need for real user accounts or authentication sessions with an external authentication provider (e.g., Okta) will drastically increase the dependencies and maintenance costs of testing within a testing environment. These services should be mocked out or bypassed unless their proper integration is a specific validation goal of the testing environment.
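One common approach, sketched below, is to stand in for the external provider with a tiny HTTP service that returns canned responses; the endpoint path and payload shape are illustrative only:

```typescript
// Sketch of a stand-in for an external authentication provider: a tiny HTTP
// server that returns a canned token so the system under test never calls
// the real provider.
import { createServer } from "node:http";

const mockAuth = createServer((req, res) => {
  if (req.method === "POST" && req.url === "/oauth/token") {
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ access_token: "test-token", expires_in: 3600 }));
    return;
  }
  res.writeHead(404);
  res.end();
});

// Services in the testing environment are configured to use this URL instead
// of the real provider's.
mockAuth.listen(9000, () => console.log("Mock auth provider on http://localhost:9000"));
```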

Minimal (and fully understood) data requirements

Most services require a set of data to start and operate in a desirable “base” state. This data frequently includes not only configuration settings but also data that is necessary for completing user flows through the system (such as image assets for a front end). It’s critically important that the data required for a service to both start and function in a desired mode of operation be fully understood, so that deploying a service to a testing environment for a given purpose is straightforward.
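One way to keep those requirements explicit is to capture them in a versioned seed script; the sketch below is illustrative, and the table and field names are assumptions rather than a recommendation for any particular schema:

```typescript
// Sketch of a seed script that captures, in code, the minimal data a service
// needs to start in a known "base" state, so the requirements are explicit
// and versioned alongside the service.
interface SeedData {
  configSettings: Record<string, string>;
  users: Array<{ id: string; email: string; role: string }>;
}

export const minimalSeed: SeedData = {
  configSettings: {
    "checkout.currency": "USD",
    "assets.base_url": "https://cdn.example.com/test",
  },
  users: [
    // One account per role needed to walk the core user flows.
    { id: "u-admin", email: "admin@example.com", role: "admin" },
    { id: "u-member", email: "member@example.com", role: "member" },
  ],
};

// The insert callback is left abstract so the seed can run against whatever
// database client the project actually uses.
export async function seed(insert: (table: string, rows: object[]) => Promise<void>): Promise<void> {
  await insert("config_settings", Object.entries(minimalSeed.configSettings).map(([key, value]) => ({ key, value })));
  await insert("users", minimalSeed.users);
}
```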

A common shortcoming of a persistent test environment is that the system accumulates a significant amount of data over time, while the knowledge of what data is truly required to start or deploy the service or system is lost or forgotten. This lack of knowledge leads to significant maintenance and deployment costs, as the following questions become more costly to answer:

  • “We need to redeploy service X in a new cluster; what data do we need to get it up and running?”
  • “The data within database table Y was corrupted; what is the impact on the system and how do we return to normal operation?”
  • “We want to spin up a small test environment with services A and B only to validate feature C; what data do we need to get these services going?”

An exercise to measure a team’s knowledge of the data requirements of a system is to attempt to stand up an entirely new testing environment. If ephemeral environments are regularly used, this exercise will be trivial. However, if only persistent environments have been used without documented data requirements, this exercise could prove very challenging.

It’s also critical that knowledge about data requirements not be concentrated in a single individual or a small group. This can be achieved in a variety of ways: persistent Google Docs, readme files, self-documenting coding styles, or even rotating the responsibilities for test data and testing environment maintenance.

Persistent test environments

Persistent test environments are intended to exist and be maintained indefinitely. They often have nearly all services deployed within them, so the entire system may be validated, ranging from front-end user experiences to back-end data flows and processing. Data within these test environments is often semi-representative of data in production, as it is intended to appear in a similar format/structure as real data in production. However, it is often curated or created via processes that are not always identical to real users or services creating that data in a production environment.

Persistent test environments are frequently shared test environments accessed by multiple cross-disciplinary team members throughout the software development process. As such, one of the most significant constraints of a persistent test environment is the number of services, components, or features that can be under test at any given moment.

As a practical example, consider a scenario in which multiple components within a system are under development. Validation of one of these components will usually include regression testing to ensure that the changes operate as expected within a configuration of the system that is representative of the eventual production configuration. This may require all other interacting components within the persistent testing environment to be locked until the regression testing is complete, increasing the waiting time required before new changes to those dependent services might be validated themselves.

These bottleneck and coordination challenges are not unsolvable within a persistent test environment, but dealing with them will require added infrastructure and maintenance costs as well as close coordination of and communication regarding the current state of the shared test environment. An example might be multiple versions of a service running within the persistent testing environment, with appropriate routing configuration ensuring that services are able to talk to the desired versions of their dependent services or components.
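As a rough sketch of that routing idea (in practice it would typically live in an ingress controller or service mesh rather than application code), a minimal proxy could select a service version based on a request header; the header name and ports below are assumptions:

```typescript
// Minimal sketch of version-aware routing in a shared, persistent environment:
// a proxy forwards requests to one of several deployed versions of a service
// based on a header.
import { createServer, request } from "node:http";

const versions: Record<string, number> = {
  stable: 8080,      // currently released version
  "feature-x": 8081, // version under test
};

createServer((req, res) => {
  const wanted = (req.headers["x-service-version"] as string) ?? "stable";
  const port = versions[wanted] ?? versions["stable"];

  // Forward the request unchanged to the selected version.
  const upstream = request(
    { host: "localhost", port, path: req.url, method: req.method, headers: req.headers },
    (upstreamRes) => {
      res.writeHead(upstreamRes.statusCode ?? 502, upstreamRes.headers);
      upstreamRes.pipe(res);
    }
  );
  req.pipe(upstream);
}).listen(3000);
```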

Ephemeral test environments

Ephemeral testing environments are ones intended to live for only a short amount of time: until their specific goals have been achieved. They are frequently used by development teams to validate features under development before the approval and eventual integration of those features into a target integration or release branch.

Multiple team members can access an ephemeral environment just as they would a persistent test environment, but the access points and methods are unique to each ephemeral environment. Examples include a unique URL for the front end deployed within the environment, or a common URL domain with a unique port that routes to that environment's particular set of services.
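A hedged sketch of how such unique access points might be derived is shown below; the preview.example.com domain, naming scheme, and port arithmetic are purely illustrative:

```typescript
// Sketch: deriving a unique, predictable access point for each ephemeral
// environment, for example one subdomain per pull request. CI would publish
// the resulting URL on the pull request for reviewers.
function ephemeralEnvironmentUrl(prNumber: number, service: "web" | "api"): string {
  // e.g. https://pr-1234-web.preview.example.com
  return `https://pr-${prNumber}-${service}.preview.example.com`;
}

// Alternative: a shared host with a unique port range per environment.
function ephemeralEnvironmentPort(prNumber: number, serviceIndex: number): number {
  const basePort = 20000;
  const portsPerEnvironment = 10;
  return basePort + (prNumber % 1000) * portsPerEnvironment + serviceIndex;
}

console.log(ephemeralEnvironmentUrl(1234, "web")); // https://pr-1234-web.preview.example.com
console.log(ephemeralEnvironmentPort(1234, 0));    // 22340
```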

[Figure: persistent test environment vs. ephemeral test environment]

A core concept of an ephemeral test environment is that only the bare number of services and minimal infrastructure necessary should be deployed and configured to allow for the specific goal of the environment. For example, a feature improvement to a web user creation flow would make use of an ephemeral environment with the web front end and the back-end services associated with user creation. Back-end services dedicated to processing monthly subscriptions, for example, would not need to be deployed within the ephemeral test environment if the system architecture allows for user creation flows to operate independently from subscription processing.

Needing only specific components is a significant advantage of ephemeral test environments over persistent ones: an ephemeral environment can be stood up more quickly and has significantly fewer moving pieces and constraints.

A single ephemeral environment deliberately does not attempt to support concurrent development of similar components and services. Only the versions of the components under test are deployed, along with the already-deployed versions of their dependencies. If there are multiple options or flavors of a feature under development, each option is deployed to its own ephemeral environment so that they can be accessed in parallel without any impact or dependency on each other.

[Figure: ephemeral environments for parallel pull requests]

If an unrelated feature under development were to require the version of a sign-in service currently deployed in production, that version of the service would simply be deployed into that feature's ephemeral test environment. That environment would operate in complete isolation from any other ephemeral environments showcasing iterations to the sign-in components currently under development.

Setting up and managing a testing environment

Follow these process steps and recommendations to ensure success in creating and maintaining a test environment.

Clearly define the goals and intentions of the testing environment

What sort of testing types are to be enabled? Are all services needed within the testing environment? What sort of data requirements will be needed to fulfill the testing goals?

Identify whether one or multiple testing environments are needed and how each should be created, used, and maintained

It’s not necessary to have a single testing environment to fulfill all goals. Ephemeral testing environments with only necessary services can be created for the validation of feature branches; larger numbers of services could be deployed within a testing environment more representative of production to enable performance testing.

Be sure that it is understood how services would be deployed and configured within environments, how updates and maintenance would be performed on the environment (if necessary), and how necessary data would be curated for use within the testing environment.

Identify what system design or architectural decisions are needed to enable the desired testing environments

Do any changes need to be made to an individual service or multiple services to enable their ability to be deployed and exercised within the desired testing environment? Are all necessary services containerized for rapid deployment and configuration? Are all front-end components configurable so that they can be directed to interact with the back-end services deployed within a testing environment? Implementing these changes will take time up front but will provide returns in terms of value gained from being able to perform testing within the desired testing environment.

Leverage external vendors and resources wherever feasible

External vendors and services may have costs associated with them, but their use offloads work that would otherwise need to be performed by the team, saving time and money in the long run compared to doing the work internally. Time spent deploying, configuring, and maintaining services or data sets within a persistent testing environment could instead be spent validating features within an ephemeral environment that is automatically deployed and configured.

Decommission test environments that are no longer necessary

Test environments require maintenance over time, just like production environments do. The more test environments that are kept up and running, the greater the cost and effort that is required to ensure that those environments are running, accessible, and in a state that allows for their intended purposes to be fulfilled.

When the goals of a testing environment have been satisfied, seriously consider tearing down that environment. If a feature branch has been merged, the ephemeral environment that was deployed to validate it should be destroyed. Destroying an environment is not always the best decision after a goal is complete, but it should be considered.

Regularly assess the value and challenges associated with your testing environments

Just as with all software development processes, challenges and shortcomings should be regularly reflected on, and the team should decide on what changes might be considered to alleviate these challenges.

Perhaps a persistent test environment has been in place for a long time but is greatly hindering or even preventing integration testing against feature branches. In that case, experimenting with ephemeral environments might be worthwhile.

It could be that the available testing environments are all bottlenecked by a dependency on the presence of a functional sign-in back-end service. Development work to enable dependent services to mock out this back-end service might greatly increase the scope or performance of testing scenarios that may be performed within these testing environments.

Or, perhaps, testing environments have historically been managed by a single team or person. Are any services available that might offload some of this work and enable the self-service of developers and teams to configure and construct their own testing environments on demand?

Conclusion

There are a wide variety of testing environments that are leveraged across the tech industry. Some clearly fall into specific categories, while others may be hybrids that possess characteristics of various types.

For maximum value to be gained from the use of testing environments, ensure that the usage of a testing environment is deliberate and that the goals of using that environment are enumerated. Determine the various testing environment options and system design and implementation requirements that would best enable the desired goals to be met within a desired testing environment. Regularly assess the ability of in-use testing environments to meet the team’s goals. Finally, don’t forget that the best approach may be leveraging multiple testing environments operating in parallel as opposed to a single shared, persistent environment.
