Spotify's Backstage team adopted the Uffizzi Developer Platform for its ability to scale and reduce the merge time for hundreds of pull requests opened by their exploding community of contributors.
Hundreds of virtual cluster ephemeral environments that serve as temporary staging environments have empowered 200+ Backstage maintainers and contributors to improve the project's overall development velocity by 20%.
Backstage is a powerful framework for building developer portals that was created at Spotify in 2016. Open-sourced in early 2020, it has now been adopted by over 500 companies, ranging from start-ups like Hopin to tech giants like Netflix and LinkedIn and Fortune 500 companies like American Airlines and Siemens. Backstage has seen explosive growth as it helps teams simplify their ever-increasing service complexity: the project now boasts 20k+ GitHub stars, nearly 1,000 contributors, and 3,500+ forks. Backstage is used by more than 1M end-user developers across the globe.
Backstage's explosive growth has put tremendous pressure on maintainers, who must review and accept hundreds of new git contributions every month. Backstage is one of the most active open-source projects in the world, with an average of 400 active pull requests and an average of 340 pull requests merged in any given month.
To give an idea of its scale, over the course of one month the project had 399 active pull requests and merged 336 of them (and this included a holiday period).
To meet growing demand, the Backstage team needed to pack more features into every release, so they turned to ephemeral environments to save time and improve the quality of the hundreds of contributions that are reviewed every month.
Note: The term ephemeral environment is often used interchangeably with preview environment, on-demand environment, or dynamic environment. Ephemeral environments are often used as a staging environment, QA environment, release environment, proof-of-concept environment, or demo environment, and they are not typically used as a production environment. The opposite of an ephemeral or dynamic environment is a static environment, which persists indefinitely.
Ephemeral environments are temporary environments that mimic a production environment in functionality - broadly, they are used as staging environments. In the case of Backstage ephemeral environments, the service(s) that comprise the application (defined in Docker Compose, Helm, or Kustomize) run on a Kubernetes virtual cluster in a multi-tenant fashion within a Kubernetes namespace. An ephemeral environment is created on a trigger (in this case a pull request) to test a specific feature branch, and is then deleted either by a time-out (delete after X hours) or by another trigger (closing the pull request).
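To make the trigger-based life-cycle concrete, here is a minimal in-memory sketch in Python. It is only an illustration, not Uffizzi's implementation: `Environment`, `EnvironmentManager`, and the 72-hour default time-out are hypothetical names and values. The action strings (`opened`, `reopened`, `synchronize`, `closed`) are the ones GitHub uses for pull request events.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class Environment:
    """One ephemeral environment, tied to a single pull request."""
    pr_number: int
    commit_sha: str
    created_at: datetime


@dataclass
class EnvironmentManager:
    """Toy model of a trigger-based environment life-cycle (hypothetical)."""
    ttl: timedelta = timedelta(hours=72)  # assumed time-out, not a Uffizzi default
    environments: Dict[int, Environment] = field(default_factory=dict)

    def handle_pull_request(self, action: str, pr_number: int,
                            commit_sha: str, now: datetime) -> None:
        if action in ("opened", "reopened", "synchronize"):
            # Create the environment, or replace it so it tracks the latest push.
            self.environments[pr_number] = Environment(pr_number, commit_sha, now)
        elif action == "closed":
            # Closing the pull request is the delete trigger.
            self.environments.pop(pr_number, None)

    def expire(self, now: datetime) -> List[int]:
        """Delete environments older than the time-out; return their PR numbers."""
        expired = [number for number, env in self.environments.items()
                   if now - env.created_at >= self.ttl]
        for number in expired:
            del self.environments[number]
        return expired
```

In this sketch a push to the branch simply replaces the environment record, and both deletion paths (closing the pull request and the time-out) remove every trace of it.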
Kubernetes ephemeral environments have many use cases and are often used as temporary QA or staging environments. Ephemeral environments are a functional clone of the production environment, except that they represent code from a feature branch rather than your main or default branch. Ephemeral environments enable teams to test in isolation, in contrast to a traditional shared environment model where several branches often create conflicts that bring testing to a halt and cause release delays. Ephemeral environments can be used for production, but this is rare given the need for data consistency in a production environment.
“We'd been wanting ephemeral preview environments for a while - our first attempt at a solution did not work out and then when Uffizzi came into the picture it just made sense. Maintainers check all the PRs - mostly a visual check - and this saves us a lot of time. The temporary staging environment is just there. Really nice. We have several hundred contributions a month so it's quite a lot.” - Ben Lambert, Sr. Developer on Spotify’s Backstage Team
Ephemeral environments provide ample opportunity to save time for everyone involved in the project. Contributors have access to a live, virtual cluster environment where they can confirm functionality, and, in that same staging environment, maintainers can quickly “preview” the proposed changes. All of this happens before merging and without maintainers having to deconflict a shared staging environment that is in extremely high demand.
Ephemeral environments enable teams to meet capacity - literally the demand signal - for acceptance testing in real time. It's as many testing environments as you need for as long as you need those testing environments. When an environment is needed it's rapidly created - and when testing is completed, the ephemeral cluster environment (and all its components) is deleted.
Ephemeral environments are critical to making pull request reviews, feedback, and iterations all happen faster. If you take those time savings and apply them to an average of 400 active GitHub pull requests in any given month, the time-saving impact is transformational. All of this leads to a quicker time to merge and more functionality packed into each production-ready release!
With so many developers contributing to Backstage, it would be cost prohibitive to have a static host cluster for every contributor or contribution. If you had to manually create and delete these, it would leave a mushroom farm of Kubernetes clusters that had been long abandoned after their period of usefulness. But ephemeral environments, as the name suggests, are "short-lived", unlike a persistent production environment. Ephemeral environments have a purpose-driven, trigger-based life-cycle. They are created to review and test a specific feature branch, and then all environment components (virtual cluster, its control plane, containers, namespace, networking, ingress, dynamic URL, and so on) are deleted. This is cost effective, particularly when you add in the development velocity advantages of testing in isolation and faster iterations that ephemeral virtual clusters provide.
For every pull request, Backstage needed a successful ephemeral environment workflow where contributors and maintainers could rapidly test the running code in the git feature branch. They needed a self-managing, automated ephemeral environment solution - no manual steps for reviewers or maintainers and something that had little maintenance overhead. And they needed something that worked in conjunction with their existing GitHub Actions workflows that would also support outside contributors who needed access to the ephemeral environments. Lastly, they needed something fast enough to be effective for rapid review.
An initial third-party ephemeral environment solution was implemented but failed to live up to its potential. The solution capped the number of concurrent staging environments and so could not scale effectively, provided a poor user experience (particularly for contributors), and had exploding costs. The build system was outside of the Backstage team’s control, and the size of the images created storage and cost issues.
There were also technical limitations that restricted the team's ability to create ephemeral environments when they needed them. At any given time, Backstage needs 100+ ephemeral environments, one for every open pull request. Lastly, outside contributors had no access to the ephemeral environments or the associated logs to review their own contributions.
“Ephemeral environments are not an easy problem. It’s a lot more than spinning up some containers. You have to deal with the git branch builds, the build artifacts, a container registry, permissions, secrets, OAuth, and a dynamic life-cycle management that comes with all of it,” notes Lambert.
The Uffizzi team linked up with Backstage maintainers at KubeCon North America and started working with them to implement a solution that met all the project's goals for an ephemeral environment solution.
Out of the box Uffizzi met most of the requirements:
Meeting most requirements for ephemeral environments is never good enough - here are the remaining requirements that we worked with Backstage maintainers to address:
Backstage needed the ability to securely support ephemeral environments for outside contributors. Backstage has around 1,000 contributors, with the vast majority of those coming from outside the core Spotify team that runs the project. Without the ability to support the many outside contributions, the value that ephemeral environments provide would only be partially realized.
This was a security and access control challenge within both GitHub and Uffizzi.
For appropriate access control within GitHub, we created a two-stage GitHub Actions workflow that is purpose-designed for open-source projects that follow the common fork and pull model. The two stages prevent the possibility that secrets or other sensitive information can be accessed by an outside contributor while also allowing for Backstage workflow(s) to securely authenticate with Uffizzi via OpenID Connect.
Backstage’s repository has the two stages broken out into workflows `uffizzi-build.yaml` and `uffizzi-preview.yaml` that create ephemeral environments and manage the lifecycle of these temporary environments.
The first stage is initiated when a pull request is opened, closed, or receives a push to the git branch; this stage handles the `build` and `push`. Image build artifacts are pushed into Uffizzi's public ephemeral container registry, and a dynamically generated docker-compose file and other artifacts from this workflow run are bundled and passed to the second stage of the workflow. (*Update - an upgrade from compose-based previews to virtual cluster preview environments is ongoing.)
The second stage is initiated when the first stage finishes and, as part of its sequence, calls the Uffizzi reusable workflow to manage the ephemeral environment (for Uffizzi ephemeral Kubernetes environments, the Virtual Cluster workflow is used).
When the second stage of the workflow runs, it passes an OIDC token (not accessible to outside contributors) to Uffizzi for secure authentication. As part of this step Uffizzi also recognizes the `github_actor` that initiated the `pull_request` and provides them with `read-only` collaborator access to the Uffizzi UI - this gives them access to the container logs and their environment status.
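The separation between the two stages can be modeled in a few lines of Python. This is only a sketch of the trust boundary, not the actual GitHub Actions workflows or any Uffizzi API: `BuildBundle`, `stage_one_build`, `stage_two_preview`, and the `registry.example.test` host are hypothetical names introduced for illustration. The point it shows is that the first stage handles contributor code without credentials, while only the second stage sees the OIDC token and grants the pull request author read-only access.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BuildBundle:
    """Artifacts handed from stage one to stage two; contains no secrets."""
    pr_number: int
    actor: str
    image_ref: str
    compose_file: str


def stage_one_build(pr_number: int, actor: str, commit_sha: str) -> BuildBundle:
    """Build stage: runs on the pull request event, with no credentials."""
    # Hypothetical registry host and naming scheme, for illustration only.
    image_ref = f"registry.example.test/backstage-pr-{pr_number}:{commit_sha[:7]}"
    compose_file = f"services:\n  backstage:\n    image: {image_ref}\n"
    return BuildBundle(pr_number, actor, image_ref, compose_file)


def stage_two_preview(bundle: BuildBundle, oidc_token: Optional[str]) -> dict:
    """Preview stage: runs after stage one and is the only holder of the token."""
    if not oidc_token:
        raise PermissionError("stage two requires an OIDC token")
    # The pull request author only ever receives read-only access.
    return {
        "pr_number": bundle.pr_number,
        "image": bundle.image_ref,
        "collaborators": {bundle.actor: "read-only"},
    }
```

A usage example: `stage_two_preview(stage_one_build(42, "octocat", "0123456789abcdef"), oidc_token="dummy-token")` returns an environment record in which `octocat` has `read-only` access, while calling the second stage without a token raises `PermissionError`.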
With the two-stage workflow and the Uffizzi security design that we implemented, Backstage now has ephemeral environments or a temporary staging environment for every pull request, including those from outside contributors.
Multiple image build artifacts for every pull request on a project that sees 400 active GitHub pull requests in a month can lead to an ever-growing registry storage requirement for build artifacts that are only needed temporarily.
To solve this problem and to generally lower friction for contributors and maintainers, we set up an ephemeral container registry where image build artifacts are temporarily stored and then deleted when they are no longer needed.
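One way to picture "temporarily stored and then deleted" is a registry in which every image carries a time-to-live. The sketch below is an assumption about how such pruning could work, not a description of Uffizzi's registry: the `Registry` mapping and `prune_expired` function are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

# Hypothetical model: image reference -> (time pushed, time-to-live).
Registry = Dict[str, Tuple[datetime, timedelta]]


def prune_expired(registry: Registry, now: datetime) -> List[str]:
    """Delete images whose time-to-live has elapsed; return the deleted refs."""
    expired = sorted(ref for ref, (pushed_at, ttl) in registry.items()
                     if now >= pushed_at + ttl)
    for ref in expired:
        del registry[ref]
    return expired
```

Run periodically, a job like this keeps storage proportional to the number of recently active pull requests rather than to the project's full history.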
With these final requirements met, Uffizzi was initially integrated on December 16, 2022 and was fully operational on January 13, 2023. Since then, pull requests into Backstage have had an accompanying ephemeral environment. Uffizzi provides an average of 400+ ephemeral environments each month for Backstage to make pull request reviews faster and easier.
The Backstage team has iterated on the base ephemeral environment seeding to make the ephemeral environments more useful. Every ephemeral environment is now populated with a production-like configuration and dynamic GitHub OAuth login is supported in every ephemeral environment.
Over 450 engineers have benefitted from Uffizzi, and Backstage's development velocity has increased by 20%. Prior to Uffizzi the project averaged 35 significant merges per week; since adopting it, the project averages 42.
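The 20% figure follows directly from the two weekly averages. As a quick check, using a small helper named `percent_change` (our own name, introduced here for illustration):

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed as a percentage."""
    return 100 * (after - before) / before


# Weekly significant merges before and after adopting Uffizzi (from the text).
print(percent_change(35, 42))  # prints 20.0
```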
"Uffizzi saves us a ton of time accepting the hundreds of pull requests we receive every month," says Ben Lambert, Sr. Spotify Engineer and Backstage Maintainer.
There's a cascading range of benefits from ephemeral environments. Uffizzi's automated dynamic environments provide robust testing opportunities pre-merge. And since the implementation, Backstage maintainers and the project as a whole have benefitted from hundreds of ephemeral environments that act as a temporary staging environment and lead to faster iterations on proposed changes, faster pull request reviews, fewer bugs introduced, and huge time savings on the acceptance testing of 400 active pull requests every month.
Uffizzi supports several other popular open-source projects in the same way, including NocoDB (33k GH stars), Forem by Dev.to (20k GH stars), and Lazygit (30k GH stars). If you’re interested in implementing a Kubernetes-based internal developer platform for your organization, you can set up a proof of concept for free at Uffizzi Cloud.