The Staging Environment Best Practices Better.com’s Engineering Team Swears By

The overarching best practice for maintaining staging environments? Treat them as you would a real production environment. That holds for many engineering teams Built In has interviewed across the country, and Better.com’s software engineering in test team in Los Angeles is no different.

Written by Colin Hanner
Published on Feb. 18, 2021


“[Near-parity with production] allows us to interact with and test in a production-like environment without disturbing our sales and operations workflows,” Jess Oh, a software engineer in test at the online mortgage company, said.  

Below, Oh and her teammate Kyle Hollenbeck tell Built In LA about the best practices and processes Better.com’s test engineering team uses in staging environments, and how they get ahead of common staging environment pitfalls.

 

What’s a critical best practice your team follows when developing staging environments, and why?

Software Engineer in Test Jess Oh: At Better, one of the critical best practices we follow is to maintain near-parity with production. We use scripts (built and maintained by our data team) to copy data from our production Postgres instance to our staging environment using our internal service. This allows us to interact with and test in a production-like environment without disturbing our sales and operations workflows.

Another benefit to Better’s staging environment is our standard for running smoke tests against staging as part of the deployment process. This step is required before deploying to production, so if a deployment would cause environment-breaking issues in prod, they are caught here first.
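For readers who want a concrete picture of a smoke-test gate like the one Oh describes, here is a minimal sketch in Python. It is not Better.com’s pipeline: the paths, the `fetch` callable and the rule that every check must return a 2xx status are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical: a callable that requests a path on staging and returns the
# HTTP status code. In a real pipeline this would wrap an HTTP client.
Fetch = Callable[[str], int]


@dataclass
class SmokeResult:
    path: str
    status: int
    ok: bool


def run_smoke_tests(fetch: Fetch, paths: List[str]) -> List[SmokeResult]:
    """Hit each path once and record whether it returned a 2xx status."""
    results = []
    for path in paths:
        try:
            status = fetch(path)
        except Exception:
            status = 0  # a failed request counts as a failed check
        results.append(SmokeResult(path, status, 200 <= status < 300))
    return results


def may_deploy_to_production(results: List[SmokeResult]) -> bool:
    """Gate: require at least one check, and every check must pass."""
    return bool(results) and all(r.ok for r in results)


if __name__ == "__main__":
    # Stand-in for a staging environment (assumed paths, fake responses).
    fake_staging = {"/health": 200, "/login": 200, "/quote": 500}
    results = run_smoke_tests(lambda p: fake_staging[p], list(fake_staging))
    for r in results:
        print(r.path, r.status, "ok" if r.ok else "FAILED")
    print("deploy to production:", may_deploy_to_production(results))
```

Passing the HTTP call in as a function keeps the gate logic testable without a live staging environment.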
 



What processes does your team have in place for monitoring and/or maintaining your staging environment? Why are these processes so important?

Oh: Our environments are maintained with container orchestration, which sets up and exposes endpoints for monitoring from the deployed services. The services themselves also emit metrics, using a standard in-house developed metric library. Datadog checks serve as our primary alerting mechanism. Each monitor typically tracks a specific metric and can notify various channels, including Slack or our pager system. Staging is monitored similarly to production using endpoint and metrics alerting.

We use Terraform to manage these monitors for all environments. For monitors, we’re primarily concerned with writing Datadog queries to correctly monitor services. We have standard conventions for naming and alerting.

The bottom line here is that we monitor staging and production similarly.
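One way to keep staging and production monitors in step is to generate both from a single definition. The sketch below illustrates that idea in Python; the team itself uses Terraform, and the naming convention, tag names, metric name and notification handle shown here are all hypothetical. The query string follows the general shape of a Datadog metric monitor query, but treat the exact format as an assumption to check against Datadog’s documentation.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed environment names; the article only says staging and production
# are monitored similarly.
ENVIRONMENTS = ["staging", "production"]


@dataclass(frozen=True)
class MonitorSpec:
    name: str
    query: str
    notify: Tuple[str, ...]


def build_monitor(service: str, metric: str, threshold: float,
                  env: str, notify: List[str]) -> MonitorSpec:
    """Build one monitor definition using a single naming convention."""
    # Hypothetical convention: "[env] service: metric above threshold".
    name = f"[{env}] {service}: {metric} above {threshold}"
    # Doubled braces produce literal { } around the tag filter.
    query = (f"avg(last_5m):avg:{metric}"
             f"{{env:{env},service:{service}}} > {threshold}")
    return MonitorSpec(name, query, tuple(notify))


def build_for_all_envs(service: str, metric: str, threshold: float,
                       notify: List[str]) -> List[MonitorSpec]:
    """Same metric, threshold and channels for every environment."""
    return [build_monitor(service, metric, threshold, env, notify)
            for env in ENVIRONMENTS]


if __name__ == "__main__":
    # "@slack-alerts" is a made-up handle used only for this example.
    for spec in build_for_all_envs("api", "http.errors", 5, ["@slack-alerts"]):
        print(spec.name)
        print("  ", spec.query)
```

Because the environment is the only input that varies, the staging and production monitors cannot drift apart in threshold or naming.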

 

What’s a common mistake engineering teams make when it comes to staging environments? And what should they do to avoid it?

Software Engineer in Test Kyle Hollenbeck: Treating staging environments like a sandbox is an easy trap to fall into. A staging environment is a shared resource across the org that everyone should have access to. If certain teams are using it as an environment to test out unreleased features, others may be restricted in verifying their own features that are actually released.

At Better, we make use of short-lived ephemeral environments for our CI/CD pipeline that double as “branch deploys,” allowing teams to test out experiments before releasing. For safety reasons, we do make sure ephemeral environments use their own infrastructure and resources, which keeps deployment changes insulated from the rest of our infrastructure.
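To illustrate how a branch deploy can get its own resources, here is a minimal Python sketch that derives a separate namespace, database and hostname from a branch name. The resource types, naming scheme and `example.test` domain are assumptions for illustration only; the interview does not describe how Better.com names or provisions its ephemeral environments.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class EphemeralEnv:
    name: str
    namespace: str
    database: str
    hostname: str


def slugify_branch(branch: str, max_len: int = 30) -> str:
    """Turn a branch name into a lowercase, hyphen-separated slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    # Fall back to a placeholder if nothing usable is left.
    return slug[:max_len].rstrip("-") or "branch"


def plan_ephemeral_env(branch: str,
                       base_domain: str = "example.test") -> EphemeralEnv:
    """Derive per-branch resource names so no branch shares infrastructure."""
    slug = slugify_branch(branch)
    return EphemeralEnv(
        name=f"ephemeral-{slug}",
        namespace=f"eph-{slug}",
        database=f"eph_{slug.replace('-', '_')}",
        hostname=f"{slug}.{base_domain}",
    )


if __name__ == "__main__":
    env = plan_ephemeral_env("feature/ABC-123_new rates")
    print(env.name)
    print(env.namespace)
    print(env.database)
    print(env.hostname)
```

Since every name is derived from the branch, two branches with different slugs never point at the same namespace or database, which is the insulation Hollenbeck describes. Note that truncating long names to `max_len` could make two long branch names collide, so a real setup would likely add a short hash.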

Responses have been edited for length and clarity. Header image via Shutterstock. Headshots via Better.com.
