March 24, 2023


Ephemeral, Dedicated Environments for your Database

A case study of the ephemeral database environments Uffizzi built to solve the problems that arise from the “shared development database” approach. Built on Kubernetes, these environments provide consistency and resilience, reduce cost, scale easily, remove friction from testing, and can be provisioned and destroyed in a couple of seconds.

Shruti Chaturvedi

Open Source Integrations Engineer


Why Uffizzi built an ephemeral environment solution for databases

After some months of analyzing development patterns across teams, we noted that most projects follow a shared database model for development and testing. This paradigm is troublesome because it does not fit modern development practices like continuous integration and delivery. As features are developed continuously, they must also be tested continuously and in isolation so that their impact can be tracked reliably. Projects sometimes have a couple of static dev/test environments (or perhaps even dedicated environments) for testing their application, but the problem is that all of these environments share the same database. Every developer uses this database for testing and experimentation, which defeats the purpose of testing in isolation. Read here about how this is an anti-pattern.

At first glance, this might look like a much simpler approach, especially because running and maintaining databases requires significant cost and effort, and constantly dumping and restoring databases takes a non-trivial amount of time. What could possibly go wrong with having one or two instances of Postgres Server, filled with data from production, that all the developers can use to develop and test the next version of the application with “prod-like” data?

The fact is, there are tons of hidden costs and developer performance penalties that come with adopting this approach over a dedicated, ephemeral database environment approach. Using a shared dev/test database model can work reasonably well for small teams, but once projects start to scale, this approach simply doesn’t work. Here are some of the problems of following a shared development database approach:


Shared database environments are usually out of date: syncing databases or applying patches takes several hours, which blocks developers and stalls development. The same applies to running migrations on the shared DB.
Achieving resiliency in development/test databases is expensive. Quickly resetting or spinning up a new database after a failure or error is onerous.
Lack of isolation causes inconsistencies. Because developers all change the same DB, one developer’s changes affect the others, which creates collisions in the DB state.
Shared DB environments block developers from iterating continuously on DB-dependent features, since any change made to the shared DB affects the testing of other features.
Developers might not have the required access permissions. Since the shared environment is critical to operations, DBAs often restrict developers to consumer roles on the DB. However, developers might need escalated privileges for development purposes. Balancing speed and security is cumbersome.
Spawning new instances for early feedback from non-technical stakeholders is hard. Instantiating each new instance and configuring it to meet operational or engineering requirements takes time.
Integration testing is unreliable because the state of the DB the tests run against is inconsistent. Unpredictable data makes test results invalid and inaccurate.
Dev environments are expensive, especially if multiple dev/staging environments run continuously.

As mentioned earlier, these issues can be managed in small teams of 2–4 developers. As projects grow, however, unwieldy shared dev and test database environments create bottlenecks: engineers are always at risk of making changes that break the build, which slows development, test, and release cycles while increasing costs. Running development databases locally can alleviate some of these issues, but it raises another set of problems, the most notable being the lack of resources needed to run a DBMS locally.

The lack of dedicated, ephemeral test environments for databases is an anti-pattern that makes agile practices like continuous integration, development, and testing unmanageable. These engineering challenges lead to the following operational challenges:

Time to market increases.
Feedback takes longer to arrive.
Iterating and shipping take longer.
Costs increase.

To combat these common issues, which many development teams face, Uffizzi built an ephemeral environment solution for databases.

How is the ephemeral dev database model the better alternative to the shared dev database model?

Let’s visit the idea of ephemeral development databases through an example: I have a team of 12 developers, 6 of whom work on the back end (BE) of the application. As my team builds features for this application, I want them to test the features continuously in their own dedicated environments so that what is being tested is not affected by their teammates’ code. I also want them to test how each new feature interacts with our BE database. For example, my business logic has a certain way of handling duplicate entries. Will this new feature break that?

To run these integration tests between my application and the database it uses, I will need an isolated, dedicated ephemeral environment running my app with an isolated, fresh, ephemeral DB to guarantee that the DB my team is running tests against hasn’t been left in an unexpected state by previous changes. In a shared database model, chances are that the DB state a test relies on has been altered by changes made by another developer, which could lead to major flaws in the application.
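
The isolation argument above can be made concrete with a minimal sketch. It uses Python’s built-in sqlite3 module as a stand-in for Postgres, and the users table, the add_user helper, and the skip-duplicates rule are hypothetical, chosen only for illustration. Each test builds its own in-memory database, so no other developer’s writes can change the outcome.

```python
import sqlite3


def fresh_db() -> sqlite3.Connection:
    """Create an isolated, in-memory database seeded with known data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (email TEXT PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users VALUES ('ada@example.com', 'Ada')")
    conn.commit()
    return conn


def add_user(conn: sqlite3.Connection, email: str, name: str) -> bool:
    """Hypothetical business rule: silently skip duplicate emails.

    Returns True if a row was inserted, False if the email already existed.
    """
    cur = conn.execute(
        "INSERT OR IGNORE INTO users (email, name) VALUES (?, ?)", (email, name)
    )
    conn.commit()
    return cur.rowcount == 1


def count_users(conn: sqlite3.Connection) -> int:
    """Count the rows currently in the users table."""
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


# Each "test" gets its own database, so the starting state is always known.
db_a = fresh_db()
db_b = fresh_db()

inserted_new = add_user(db_a, "grace@example.com", "Grace")  # new email
inserted_dup = add_user(db_a, "ada@example.com", "Ada L.")   # duplicate email

# Writes to db_a are invisible to db_b, which still holds only the seed row.
users_in_a = count_users(db_a)
users_in_b = count_users(db_b)
```

In a shared database, the count seen by the second test would depend on whatever the first test (or another developer) had written beforehand.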


This one example already shows why the ephemeral development database model is the better alternative to the shared/communal model.

Testing our hypothesis through experimentation

We had a hypothesis that the shared development database model is an anti-pattern and that ephemeral databases are the way to go. We then built a prototype to test this hypothesis.

Uffizzi built a Pull Request workflow prototype that runs Postgres instances in ephemeral environments. Each environment starts with a fresh Postgres instance for development and can be destroyed when it is no longer needed.

Typically, developers run unit tests on their applications and integration tests against their databases. To test this development pattern, Uffizzi built the solution to run pgAdmin in an ephemeral environment: a PR to pgAdmin (substitute your own application here) creates an ephemeral environment in the cloud, orchestrated through Kubernetes, with pgAdmin installed. The ephemeral environment starts with an ephemeral Postgres instance that is dedicated to this pgAdmin instance and isolated from other environments.

This PR workflow lets teams follow agile practices by integrating into existing CI/CD pipelines. In this solution, the workflow is integrated into GitHub Actions: any PR opened against the example repo instantiates a new ephemeral environment with a fresh, ephemeral Postgres instance.

You can follow these steps to configure ephemeral development database environments for your project. You can also check out the PoC on GitHub.

Step 1: Start with a git repo

For this prototype solution, we forked pgAdmin, an open-source GUI client for administering and monitoring Postgres. Here, pgAdmin acts as a client, consuming and making changes to Postgres. After we make a change to this client and open a new PR in the repo, a new ephemeral environment is spun up with our client (pgAdmin), updated with the new logic, and a fresh instance of Postgres to develop and run tests against.

Step 2: Containerizing your App/Client and DB dependencies

We then containerized pgAdmin so that it is self-contained and can be orchestrated easily. The pgAdmin container is fully configured with all the dependencies it needs, which makes updates and patches straightforward. pgAdmin already comes with a Dockerfile, which we used to build our container.

Step 3: Make the application ready for orchestration on Kubernetes

After containerizing the application, Uffizzi then uses the docker-compose.uffizzi.yml file to define other containers (services) that the application is dependent on. This compose file is used to orchestrate the containers on Kubernetes. In our compose file, we defined two containers: Postgres and pgAdmin.

In the compose file, we define which image each container uses. We use an official Postgres image and build pgAdmin from source. As mentioned previously, this lets developers test changes to their application, because each ephemeral environment is built with the new code changes to the app and a fresh Postgres instance to develop and run tests against. More on this in the next step.

Other notable components of the docker-compose.uffizzi.yml file are:

x-uffizzi: defines an entrypoint into the ephemeral environment. We exposed the pgAdmin service on port 8081 of the ephemeral environment to receive traffic.
environment: we used this attribute to define the environment variables needed by pgAdmin and Postgres.
volumes: to seed our Postgres instances with random, “prod-like” data, we mount our database initialization script in the volumes attribute. This attribute can be used to load start-up scripts onto the database container, which will be executed for each ephemeral environment, so each ephemeral database comes seeded with data that meets business requirements.
> The uffizzi-db-script folder contains the seed.sql script, which will seed Postgres with test data. Depending upon your use case, you can also add multiple scripts in the folder you mount to initialize your database.
> Database initialization logic will be different for each project. For example, some projects might use snapshots or DB images for their dev/test DBs. We found using DB init scripts to be the most efficient because it’s faster and making your datasets safe—redacting PII and other sensitive data—is easier and cheaper with init scripts than with database snapshots/images.
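
As a rough illustration of the init-script approach, the sketch below generates the contents of a seed.sql file with synthetic, PII-free rows. The customers table, its columns, and the build_seed_sql helper are hypothetical and are not taken from the Uffizzi PoC. The generated SQL is run against SQLite here only to show that it is well formed; in practice the file would sit in the folder mounted onto the Postgres container.

```python
import random
import sqlite3


def build_seed_sql(row_count: int, rng_seed: int = 42) -> str:
    """Return SQL that creates and fills a hypothetical `customers` table.

    Rows are synthetic, so the script carries no production PII. A fixed
    rng_seed makes every environment start from identical data.
    """
    rng = random.Random(rng_seed)
    plans = ["free", "team", "enterprise"]
    lines = [
        "CREATE TABLE customers (",
        "    id INTEGER PRIMARY KEY,",
        "    email TEXT NOT NULL UNIQUE,",
        "    plan TEXT NOT NULL",
        ");",
    ]
    for i in range(1, row_count + 1):
        email = f"user{i}@example.com"  # placeholder address, not real data
        plan = rng.choice(plans)
        lines.append(
            "INSERT INTO customers (id, email, plan) "
            f"VALUES ({i}, '{email}', '{plan}');"
        )
    return "\n".join(lines) + "\n"


seed_sql = build_seed_sql(row_count=5)

# Sanity check: run the script against a throwaway in-memory database.
check_db = sqlite3.connect(":memory:")
check_db.executescript(seed_sql)
seeded_rows = check_db.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

Because the generator is deterministic, the same script can be version controlled and every ephemeral database starts from the same known state.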


Step 4: Defining GitHub workflows to provision the ephemeral environments


The power of ephemeral environments lies in CI/CD. Uffizzi leveraged GitHub Actions to orchestrate building the ephemeral environments, collecting feedback, updating the environments as the feature is iterated on, and finally deleting the environment when it is no longer required.

We built a 2-stage GitHub workflow to extend support for GHA-triggered ephemeral environments to open-source projects. The 2 stages of this workflow are:

The uffizzi-build.yml action builds pgAdmin from source and pushes the image to an ephemeral registry.
The uffizzi-preview.yml action creates the ephemeral environment with the container definitions in the docker-compose.uffizzi.yml file. This workflow will also update and delete the environment as the PR is updated or closed respectively.
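
The lifecycle these two workflows implement can be summarized in a small sketch. The action names are the ones GitHub sends with pull_request events; the EnvOp enum and the plan_operation function are hypothetical and only model the decision. They are not part of Uffizzi’s actions.

```python
from enum import Enum
from typing import Optional


class EnvOp(Enum):
    """Operations on an ephemeral environment (hypothetical names)."""
    CREATE = "create"
    UPDATE = "update"
    DELETE = "delete"


def plan_operation(pr_action: str, env_exists: bool) -> Optional[EnvOp]:
    """Map a GitHub pull_request action to an environment operation."""
    if pr_action in ("opened", "reopened", "synchronize"):
        # "synchronize" means new commits were pushed to the PR branch.
        return EnvOp.UPDATE if env_exists else EnvOp.CREATE
    if pr_action == "closed":  # merged or abandoned
        return EnvOp.DELETE if env_exists else None
    return None  # other actions (labels, edits) leave the environment alone


# A typical PR: open, push two commits, then close.
events = ["opened", "synchronize", "synchronize", "closed"]
env_exists = False
history = []
for action in events:
    op = plan_operation(action, env_exists)
    history.append(op)
    if op is EnvOp.CREATE:
        env_exists = True
    elif op is EnvOp.DELETE:
        env_exists = False
```

After the loop, no environment is left running, which is the property the cost argument relies on.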

Running database servers can easily drive up resource consumption and cost, and creating multiple databases for dev/test purposes compounds the issue. This is where ephemeral databases provide great value, because they allow you to follow the pay-as-you-go model: the uffizzi-preview.yml action deletes the environment and frees its resources when the PR is closed. This is particularly significant for resource-intensive workloads such as databases and simulated hardware. Read this blog to see how we built a prototype for running embedded applications with simulated hardware in an ephemeral environment.
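
A back-of-the-envelope sketch of the pay-as-you-go argument follows. All figures (number of environments, PR lifetimes, hours per month) are hypothetical inputs, not measurements from the prototype; the point is only how the two totals are computed.

```python
from typing import Sequence


def always_on_hours(env_count: int, hours_in_period: float) -> float:
    """Environment-hours consumed by static environments that never stop."""
    return env_count * hours_in_period


def ephemeral_hours(pr_lifetimes_hours: Sequence[float]) -> float:
    """Environment-hours when each environment lives only as long as its PR."""
    return sum(pr_lifetimes_hours)


# Hypothetical month: 6 static dev databases vs. 20 PRs open for 8 hours each.
HOURS_IN_MONTH = 30 * 24
static_total = always_on_hours(6, HOURS_IN_MONTH)
ephemeral_total = ephemeral_hours([8.0] * 20)

# Fraction of environment-hours avoided under these assumed inputs.
savings_ratio = 1 - ephemeral_total / static_total
```

With different inputs, such as long-lived PRs or very few static environments, the gap narrows, so it is worth plugging in your own numbers.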


Step 5: Triggering ephemeral environment creation

After adding the above 2 GitHub workflows to our fork of pgAdmin, we opened a feature PR in this fork. (If you’re following this example for your own project, make sure the actions are added to the default branch of your repo; otherwise the 2-stage workflow will not work as desired.) The PR triggered uffizzi-build.yml, which in turn triggered the uffizzi-preview.yml workflow and created a new pgAdmin instance with an ephemeral Postgres instance, seeded with data from the database initialization script we mounted onto the container.

As commits are made to the PR, the pgAdmin instance is refreshed, while changes made to the Postgres instance persist between container restarts through the volumes attribute in our compose file. When the environment is refreshed with new commits, Postgres is in the same state in which it was last left. This allows developers to iterate continuously on their features without constantly having to check the state of their database.

Every new PR creates a new instance of the application you’re building from source along with other containers you’ve defined in docker-compose.uffizzi.yml. In this case, a new PR rebuilt pgAdmin and each environment was started with a fresh instance of Postgres. If you want to persist data between container restarts (triggered by making new commits to the PR) for any container, you can use the volumes directive as we did for Postgres.
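
The difference between “fresh for every PR” and “persisted across commits” can be modeled in a few lines. This is a toy in-memory model, not how Uffizzi or Kubernetes implement volumes; the Environment class, its fields, and the commit identifiers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Environment:
    """Toy model of one PR's environment (hypothetical, for illustration)."""
    pr_number: int
    app_commit: str
    # Stands in for the Postgres data directory kept on a volume.
    volume: dict = field(default_factory=lambda: {"seeded": True})

    def redeploy(self, new_commit: str) -> None:
        """A new commit rebuilds the app container; the volume is left alone."""
        self.app_commit = new_commit


envs: Dict[int, Environment] = {}

# PR 1 opens: fresh app build plus a freshly seeded database.
envs[1] = Environment(pr_number=1, app_commit="abc123")
envs[1].volume["orders"] = 3  # a developer writes test data

# New commit on PR 1: the app is refreshed, the database state survives.
envs[1].redeploy("def456")
pr1_orders_after_redeploy = envs[1].volume.get("orders")

# PR 2 opens: it gets its own fresh database, unaffected by PR 1.
envs[2] = Environment(pr_number=2, app_commit="0a1b2c")
pr2_has_orders = "orders" in envs[2].volume

# PR 1 closes: the environment and its volume are discarded together.
del envs[1]
```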


The environment is destroyed as soon as the PR is closed. This is a huge time and cost saver, since it frees the resources that would otherwise be used to manage unwieldy dev or test database environments.

Inferring results: establishing the power of ephemeral DB environments

We ran Postgres locally, in a shared environment, and in ephemeral test environments, using pgAdmin as a client to administer and monitor our Postgres instances. Below, we list the prominent results of running Postgres in ephemeral environments and the value this added to development processes by providing fully configured, consistent, isolated, cheap, and fast ephemeral database environments, compared with a local or a shared development model.

Integrated into CI/CD workflow: a new ephemeral environment is spun up for each PR and is updated with every change to the PR, thus integrating agile development practices into the SDLC.
Containerized ephemeral databases: the definition of the environment is maintained centrally, and environments can be easily built, scaled, and maintained from it. Rather than repeatedly syncing or patching the development databases, you version-control the central definition (for example, a script that seeds the DB and replicates “prod-like” behavior) and use it to instantiate environments. Fresh environments are ready within seconds. Easy to manage, fast, and consistent.
Resilient by design. If something breaks in an environment, destroy it and spin up a new one. Errors are localized and do not influence the central definition and/or other ephemeral database environments.
Discrete, isolated environments: ephemeral database environments ensure DB transactions and changes are tested in isolation. Each developer has access to a personal DB, and their changes don’t cause inconsistencies elsewhere.
Doesn’t block developers. Each developer has a personal, sandboxed database for continuously iterating on and testing DB-dependent features. This allows for continuous integration, testing, delivery, and feedback.
Elevating permissions does not create security and/or operational risks, because each environment is isolated and disposable.
Instantiating new ephemeral database environments is easy because they come fully configured. All it takes is opening a PR to spin up a fresh, prod-like database environment, and the environment will be up within seconds. This makes it possible to get feedback from technical and non-technical stakeholders early on, which shortens release and feedback cycles.
Integration tests are not affected by inconsistencies or pollution in the environment.
Optimized for resource consumption and costs. The environments are spun up and destroyed as and when they are needed, so they don’t run longer than necessary and incur extra costs, which reduces the risk of over- or under-provisioning.
Supports multiple databases: MongoDB, Redis, Postgres, MySQL, etc. If a DB can be containerized, we can run it as an ephemeral database.

By running this prototype, we conclusively established that developing with ephemeral database environments solves all sorts of problems engendered by the communal DB model. These ephemeral database environments bring your database into modern development practices, which facilitates innovation, lets you reach the market faster, and reduces costs.

To read more about Uffizzi and our ephemeral environment solutions, check us out at Uffizzi cloud. If you’re also looking for cheap, isolated, and fast development/test environments, connect with our team and accelerate your development, testing, and release cycles.
