Updated February 21, 2023

Ephemeral Environments:
How Spotify's Backstage Team Packs 20% More Into Every Release

Spotify's Backstage team adopted the Uffizzi Developer Platform for its ability to scale and reduce the merge time for hundreds of pull requests opened by their exploding community of contributors.

Ephemeral Environment Kubernetes Results

Backstage's merges per week increased by 20% over a two-month period following implementation. Raw data provided by CNCF DevStats.

Introduction to Backstage's Kubernetes Ephemeral Environments

Hundreds of virtual cluster ephemeral environments that serve as temporary staging environments have empowered 200+ Backstage maintainers and contributors to improve the project's overall development velocity by 20%.

Backstage is a powerful framework for building developer portals that was created at Spotify in 2016. Open-sourced in early 2020, it has now been adopted by over 500 companies, ranging from start-ups like Hopin to tech giants like Netflix and LinkedIn and Fortune 500 companies like American Airlines and Siemens. Backstage has seen explosive growth as it helps teams simplify their ever-increasing service complexity: the project now boasts 20k+ GitHub stars, nearly 1000 contributors, and 3500+ forks. Backstage is used by more than 1M end-user developers across the globe.

The Need:  Better DevX and Improved Efficiency to meet growing demand for new features

Backstage's explosive growth has put tremendous pressure on maintainers, who must review and accept hundreds of new git contributions every month. Backstage is one of the most active open-source projects in the world, with an average of 400 active pull requests and an average of 340 pull requests merged in any given month.

To give an idea of its scale, over the course of one month the project had 399 active pull requests and merged 336 of them (and that month included a holiday period).

Figure 1: Snapshot of Backstage repo metrics (active pull requests) over a one-month period from Dec 27, 2022 to Jan 27, 2023, after ephemeral environments were integrated for acceptance testing ahead of pushing to the production environment

Using a Developer Platform for Ephemeral Environments (aka Preview Environments) to Gain Efficiencies in Acceptance Testing

To meet growing demand, the Backstage team needed to pack more features into every release, so they turned to Ephemeral Environments to save time and improve the quality of the hundreds of contributions that are reviewed every month.

Note: The terms preview environment, on-demand environment, and dynamic environment are often used synonymously with ephemeral environment. Ephemeral environments are often used as a staging environment, QA environment, release environment, proof-of-concept environment, or demo environment, and they are not typically used as a production environment. The opposite of an ephemeral or dynamic environment is a static environment, which persists indefinitely.

What are Ephemeral Environments?

Ephemeral environments are temporary environments that mimic a production environment in functionality; broadly, they are used as staging environments. In the case of Backstage ephemeral environments, the service(s) that comprise the application (defined in docker-compose, helm, or kustomize) run on a Kubernetes virtual cluster in a multi-tenant fashion within a Kubernetes namespace. Each ephemeral environment is created on a trigger (in this case a pull request) to test a specific feature branch, and is then deleted either by a time-out (delete after X hours) or by another trigger (the pull request is closed).

Kubernetes ephemeral environments have many use cases and are often used as temporary QA or staging environments. Ephemeral environments are a functional clone of the production environment, except that they represent code from a feature branch rather than your main or default branch. Ephemeral environments enable teams to test in isolation, compared to a traditional shared environment model where several branches often create conflicts that bring testing to a halt and cause release delays. Ephemeral environments can be used for production, but this is rare given the need for data consistency in a production environment.
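The trigger-based lifecycle described above can be sketched in a few lines of Python. This is only an illustrative model, not Uffizzi's implementation: the `Environment` and `EnvironmentManager` names and the default time-out are assumptions, and the action strings simply mirror the names GitHub uses for pull request events.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class Environment:
    """One ephemeral environment tied to a single pull request (hypothetical model)."""
    pr_number: int
    created_at: datetime
    ttl: timedelta


@dataclass
class EnvironmentManager:
    """Hypothetical in-memory manager; a real platform would create and delete
    cluster resources where this sketch only updates a dictionary."""
    default_ttl: timedelta = timedelta(hours=8)  # assumed value, not from the article
    environments: Dict[int, Environment] = field(default_factory=dict)

    def handle_event(self, action: str, pr_number: int, now: datetime) -> None:
        if action in ("opened", "reopened", "synchronize"):
            # Trigger: a PR was opened or received a push, so create or refresh its environment.
            self.environments[pr_number] = Environment(pr_number, now, self.default_ttl)
        elif action == "closed":
            # Trigger: the PR was closed, so its environment is deleted.
            self.environments.pop(pr_number, None)

    def expire(self, now: datetime) -> List[int]:
        """Delete environments whose time-out has elapsed; return their PR numbers."""
        expired = [n for n, env in self.environments.items()
                   if now - env.created_at >= env.ttl]
        for n in expired:
            del self.environments[n]
        return expired
```

In this sketch every environment is keyed by its pull request number, so a push to an existing pull request replaces the environment rather than adding a second one, and both deletion paths (close and time-out) remove the same record.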

“We'd been wanting ephemeral preview environments for a while - our first attempt at a solution did not work out and then when Uffizzi came into the picture it just made sense. Maintainers check all the PRs - mostly a visual check - and this saves us a lot of time. The temporary staging environment is just there. Really nice. We have several hundred contributions a month so it's quite a lot.” -Ben Lambert, Sr. Developer on Spotify’s Backstage Team

What are the Benefits of Ephemeral Environments?

How Ephemeral Environments Save Time for Contributors and Team Leaders

Ephemeral Environments provide a robust opportunity to save time for everyone involved in the project. Contributors have access to a live virtual cluster environment where they can confirm functionality and, in that same staging environment, maintainers can quickly “preview” the proposed changes. All of this happens before merging and without maintainers having to deconflict a shared staging environment that is in extremely high demand.

Ephemeral environments enable teams to match capacity to the demand for acceptance testing in real time: as many testing environments as you need, for as long as you need them. When an environment is needed it is rapidly created, and when testing is completed the ephemeral cluster environment (and all its components) is deleted.

Ephemeral environments are critical to making pull request reviews, feedback, and iterations all happen faster. If you take those time savings and apply them to an average of 400 active GitHub pull requests in any given month, the time-saving impact is transformational. All of this leads to a quicker time to merge and more functionality packed into each production-ready release!

How Ephemeral Environments Save Money for Software Producing Organizations

With so many developers contributing to Backstage, it would be cost prohibitive to have a static host cluster for every contributor or contribution. If you had to manually create and delete these, it would leave a mushroom farm of Kubernetes clusters abandoned long after their period of usefulness. But Ephemeral Environments, as the name suggests, are "short-lived", unlike a persistent production environment. Ephemeral environments have a purpose-driven, trigger-based life-cycle. They are created to review and test a specific feature branch, and then all environment components (virtual cluster, its control plane, containers, namespace, networking, ingress, dynamic URL, and so on) are deleted. This is cost effective, particularly when you add in the development velocity advantages of testing in isolation and the faster iterations that ephemeral virtual clusters provide.

A slow start with implementing Ephemeral Environments

For every pull request, Backstage needed a successful ephemeral environment workflow where contributors and maintainers could rapidly test the running code in the git feature branch.  They needed a self-managing, automated ephemeral environment solution - no manual steps for reviewers or maintainers and something that had little maintenance overhead.  And they needed something that worked in conjunction with their existing GitHub Actions workflows that would also support outside contributors who needed access to the ephemeral environments. Lastly, they needed something fast enough to be effective for rapid review.

An initial third-party ephemeral environment solution was implemented, but it failed to live up to its potential. The solution could not scale effectively because of a limit on concurrent staging environments, provided a poor user experience (particularly for contributors), and had exploding costs. The build system was outside of the Backstage team’s control, and the size of the images created storage and cost issues.

There were also technical limitations that restricted the team's ability to create ephemeral environments when they needed them. At any given time, Backstage needs 100+ ephemeral environments, one for every open pull request. Lastly, outside contributors had no access to the ephemeral environments or the associated logs to review their own contributions.

“Ephemeral environments are not an easy problem.  It’s a lot more than spinning up some containers.  You have to deal with the git branch builds, the build artifacts, a container registry, permissions, secrets, OAuth, and a dynamic life-cycle management that comes with all of it,” notes Lambert.

Getting closer to an Ephemeral Environments Solution for all

The Uffizzi team linked up with Backstage maintainers at KubeCon North America and started working with them to implement a solution that met all the project's goals for an ephemeral environment solution.  

Out of the box Uffizzi met most of the requirements:

  • Continuous Integration/Continuous Deployment Integration. Uffizzi works as a job within GitHub Actions, so the Backstage team has optimal control and visibility over the CI steps.
  • Fully Automated. Uffizzi + Preview Environment Action Workflow (or alternatively Virtual Cluster Action) handles creating, updating, and deleting environments, so there are no manual steps required. Everything happens in the background to generate a successful ephemeral environment workflow for every pull request.
  • Isolated Environments. Each feature branch can be tested in its own isolated environment without dealing with potential negative impacts from other branches.
  • Shareability. Uffizzi posts an ephemeral environment URL as a comment in every pull request so contributors and maintainers can easily access the same environment.
  • Scalability. Built on Kubernetes, Uffizzi can easily handle the scale of a project the size of Backstage, with 130+ ephemeral environments running concurrently on any given day.

The Final Push for Ephemeral Environments

Meeting most requirements for ephemeral environments is never good enough. Here are the remaining requirements that we worked with Backstage maintainers to address:

1. Access to Ephemeral Environments for outside contributors

Backstage needed the ability to securely support ephemeral environments for outside contributors. Backstage has around 1000 contributors, with the vast majority of those coming from outside the core Spotify team that runs the project. Without the ability to support the many outside contributions, the value that ephemeral environments provide would only be partially realized.

This was a security and access control challenge within both GitHub and Uffizzi.

For appropriate access control within GitHub, we created a two-stage GitHub Actions workflow that is purpose-designed for open-source projects that follow the common fork and pull model.  The two stages prevent the possibility that secrets or other sensitive information can be accessed by an outside contributor while also allowing for Backstage workflow(s) to securely authenticate with Uffizzi via OpenID Connect.

Backstage’s repository has the two stages broken out into workflows `uffizzi-build.yaml` and `uffizzi-preview.yaml` that create ephemeral environments and manage the lifecycle of these temporary environments.

The first stage is initiated when a pull request is opened, closed, or receives a push to the git branch; this stage handles the `build` and `push`. Image build artifacts are pushed into Uffizzi's public ephemeral container registry, and a dynamically generated docker-compose file (*Update: an upgrade from compose-based previews to virtual cluster preview environments is ongoing) and other artifacts from this workflow run are bundled and passed to the second stage of the workflow.

Figure 2: Snapshot of an example Preview (build) for ephemeral environment workflow

The second stage is initiated when the first stage finishes and, as part of its sequence, calls the Uffizzi reusable workflow to manage the ephemeral environment (for Uffizzi ephemeral Kubernetes environments, the Virtual Cluster workflow is used).

Figure 3: Ephemeral Environment Creation of a Staging Environment: Snapshot of an example second stage Preview (deploy) workflow

When the second stage of the workflow runs, it passes an OIDC token (not accessible to outside contributors) to Uffizzi for secure authentication.  As part of this step Uffizzi also recognizes the `github_actor` that initiated the `pull_request` and provides them with `read-only` collaborator access to the Uffizzi UI - this gives them access to the container logs and their environment status.

With the two-stage workflow and the Uffizzi security design that we implemented, Backstage now has ephemeral environments or a temporary staging environment for every pull request, including those from outside contributors.
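To make the separation of privileges concrete, here is a minimal Python model of the two-stage handoff. It is a sketch of the pattern only, not the contents of `uffizzi-build.yaml` or `uffizzi-preview.yaml`: the function names, the `BuildBundle` fields, and the `ephemeral-registry.example` hostname are all hypothetical, and the token check stands in for real OIDC verification.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class BuildBundle:
    """Artifacts handed from stage one to stage two (field names are illustrative)."""
    pr_number: int
    actor: str       # the user who opened the pull request
    image_ref: str   # where stage one pushed the built image
    action: str      # "opened", "synchronize", or "closed"


def stage_one_build(pr_number: int, actor: str, action: str,
                    secrets: Optional[Dict[str, str]] = None) -> BuildBundle:
    """Untrusted stage: builds the contributor's code and must never see secrets."""
    if secrets:
        raise PermissionError("stage one must run without secrets")
    # Placeholder image reference; a real build would push an image here.
    image_ref = f"ephemeral-registry.example/pr-{pr_number}:latest"
    return BuildBundle(pr_number, actor, image_ref, action)


def stage_two_deploy(bundle: BuildBundle, oidc_token: str,
                     environments: Dict[int, dict]) -> Optional[dict]:
    """Trusted stage: holds the token and only consumes the bundle from stage one."""
    if not oidc_token:
        raise PermissionError("stage two requires an OIDC token")
    if bundle.action == "closed":
        # Closing the pull request tears the environment down.
        environments.pop(bundle.pr_number, None)
        return None
    # The pull request author gets read-only access to logs and status.
    env = {"image": bundle.image_ref,
           "collaborators": {bundle.actor: "read-only"}}
    environments[bundle.pr_number] = env
    return env
```

The point of the design is that the code which runs contributor-supplied changes (stage one) never holds credentials, while the code which holds credentials (stage two) never executes contributor-supplied changes; it only reads the bundle.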

2. Build artifact life cycle management - what happens with all those images

Multiple image build artifacts for every pull request, on a project that sees 400 active GitHub pull requests in a month, can lead to an ever-growing registry storage requirement for build artifacts that are only needed temporarily.

To solve this problem, and to generally lower friction for contributors and maintainers, we set up an ephemeral container registry where image build artifacts are temporarily stored and then deleted when they are no longer needed.
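As a rough illustration of why expiry keeps storage bounded, the sketch below models a registry in which every pushed image carries a time-to-live. The `EphemeralRegistry` class and its TTL mechanism are assumptions made for the example; the article does not describe how Uffizzi's registry decides when an image is no longer needed.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class EphemeralRegistry:
    """Toy registry mapping image tag -> (size in bytes, expiry time in seconds)."""
    images: Dict[str, Tuple[int, float]] = field(default_factory=dict)

    def push(self, tag: str, size_bytes: int, now: float, ttl_seconds: float) -> None:
        # Each image records when it stops being needed.
        self.images[tag] = (size_bytes, now + ttl_seconds)

    def prune(self, now: float) -> int:
        """Delete expired images and return the number of bytes reclaimed."""
        expired = [tag for tag, (_, expires_at) in self.images.items()
                   if expires_at <= now]
        return sum(self.images.pop(tag)[0] for tag in expired)

    def stored_bytes(self) -> int:
        return sum(size for size, _ in self.images.values())
```

With periodic pruning, stored bytes track the images pushed within the last TTL window instead of growing with every pull request ever opened.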

Complete solution - All Ephemeral Environment requirements met

With these final requirements met, Uffizzi was initially integrated on Dec 16th, 2022 and was fully operational on January 13th, 2023. Since then, pull requests into Backstage have had an accompanying ephemeral environment. Uffizzi provides an average of 400+ ephemeral environments each month for Backstage to make pull request reviews faster and easier.

Figure 4: Snapshot of a Uffizzi Ephemeral Environment link being posted as a comment in a pull request. This is a new environment rather than a new deployment to an existing environment.

Figure 5: Snapshot of Active Backstage Ephemeral Environments in the Uffizzi UI.  Backstage averages 130+ active ephemeral environments at any point in time.

The Backstage team has iterated on the base ephemeral environment seeding to make the ephemeral environments more useful.  Every ephemeral environment is now populated with a production-like configuration and dynamic GitHub OAuth login is supported in every ephemeral environment.

What are the results - how's it been going for the Backstage team with ephemeral environments?

Over 450 engineers have benefitted from Uffizzi, and Backstage's development velocity has increased by 20%. Prior to Uffizzi the project averaged 35 significant merges per week; since adoption it has averaged 42.
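The 20% figure follows directly from the two weekly averages quoted above. A quick check, using a small helper whose name is chosen here purely for illustration:

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed as a percentage."""
    return (after - before) / before * 100.0


# Weekly merge averages quoted in the article: 35 before Uffizzi, 42 after.
increase = percent_change(35, 42)
print(f"{increase:.0f}% more merges per week")
```

Seven additional merges per week on a base of 35 is an increase of one fifth.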

"Uffizzi saves us a ton of time accepting the hundreds of pull requests we receive every month," says Ben Lambert, Sr. Spotify Engineer and Backstage Maintainer.

There's a cascading range of benefits from ephemeral environments. Uffizzi's automated dynamic environments provide robust testing opportunities pre-merge. And since the implementation, Backstage maintainers and the project as a whole have benefitted from hundreds of ephemeral environments that act as a temporary staging environment and lead to faster iterations on proposed changes, faster pull request reviews, fewer bugs introduced, and huge time savings on the acceptance testing of 400 active pull requests every month.

Uffizzi supports several other popular open-source projects in the same way, including NocoDB (33k GH stars), Forem by Dev.to (20k GH stars), and Lazygit (30k GH stars). If you’re interested in implementing a Kubernetes-based internal developer platform for your organization, you can set up a proof of concept for free at Uffizzi Cloud.