Thrive Market

Staff Site Reliability Engineer

Reposted 21 Days Ago
In-Office or Remote
2 Locations
180K-225K Annually
Senior level
Lead and define the DevOps strategy, oversee migration and architecture of Kubernetes-based platforms, and mentor engineering teams.
The summary above was generated by AI
ABOUT THRIVE MARKET 
 
Thrive Market was founded in 2014 with a mission to make healthy and sustainable living easy and affordable for everyone. As an online, membership-based market, we deliver the highest-quality healthy and sustainable products at member-only prices, while matching every paid membership with a free one for someone in need. Every day, we leverage innovative technology and member-first thinking to help our 1,700,000+ members find better products, support better brands, and build a better world in the process. We are also a Certified B Corporation, a Public Benefit Corporation, and a Climate Neutral Certified company.
 
Join us as we bring healthy and sustainable living to millions of Americans in the years to come.
THE ROLE

We’re looking for a Staff Site Reliability Engineer to help define and build the reliability foundation for Thrive Market’s platform. You’ll be working with a first-class group of engineers to establish our SRE practice from the ground up: defining SLOs, SLIs, and error budgets; building observability into everything we do; and creating the frameworks that ensure our systems scale reliably during our company’s rapid growth.

This is a high-impact role at an exciting inflection point. We’ve recently containerized our entire platform on Kubernetes, and we’re evaluating a potential migration to a next-generation e-commerce platform. You’ll be balancing hands-on reliability work with the strategic thinking needed to build systems that self-heal and get better over time.

If you’ve read books like Google’s Site Reliability Engineering, The Phoenix Project, Accelerate, or The DevOps Handbook, this is the right place for you!

RESPONSIBILITIES
Reliability & Observability
  • Define, implement, and own Service Level Objectives (SLOs) and Service Level Indicators (SLIs) across critical platform services
  • Build and maintain comprehensive monitoring, alerting, and observability systems using tools like Datadog, Prometheus, Grafana, or similar platforms
  • Establish error budgets and use them to balance feature velocity with reliability investments
  • Lead incident response efforts, conduct blameless postmortems, and drive systemic improvements that prevent recurrence
  • Design and implement chaos engineering practices to proactively identify failure modes before they impact members
Infrastructure & Platform
  • Architect and optimize our Kubernetes-based container orchestration platform for reliability, performance, and cost efficiency
  • Support large infrastructure migrations, ensuring a smooth transition with minimal disruption to business operations
  • Contribute to the evaluation and execution of potential platform migrations, with a focus on reliability planning and risk mitigation
  • Design and implement automated deployment pipelines that enable rapid, error-free releases with feature flags and built-in rollback/roll-forward capabilities
  • Develop and own disaster recovery plans, capacity planning models, and system hardening initiatives
  • Collaborate closely with product engineering teams to help them scale their infrastructure in AWS and adopt SRE best practices
Culture & Process
  • Help establish SRE as a practice at Thrive Market, defining the team’s charter, processes, and engagement model with product engineering teams
  • Champion a culture of operational excellence, continuous improvement, and data-driven reliability decisions
  • Create and maintain technical documentation covering architecture decisions, runbooks, incident response procedures, and operational playbooks
  • Participate in weekly on-call rotations and help build sustainable on-call practices that avoid burnout
  • Identify systemic problems and inefficiencies across the engineering organization and make strategic recommendations for improvement
QUALIFICATIONS
Required
  • B.S. in Computer Science or equivalent professional experience
  • 7+ years of hands-on experience in SRE, DevOps, or Infrastructure Engineering, with a proven track record of improving reliability at rapidly growing companies
  • Deep expertise in Kubernetes (K8s) — including cluster management, Helm charts, service meshes, and production-grade container orchestration
  • Strong systems engineering background with advanced proficiency in Linux administration
  • Advanced scripting and automation skills in Bash, Python, Golang, Ruby, or similar languages
  • Extensive experience with core AWS services including EC2, ECS/EKS, S3, VPC, IAM, CloudWatch, Route 53, RDS, and Lambda
  • Strong experience with Infrastructure as Code tools (Terraform, CloudFormation, Pulumi, or similar)
  • Hands-on experience defining and implementing SLOs, SLIs, and error budgets in production environments
  • Deep understanding of CI/CD pipelines and deployment strategies (blue-green, canary, rolling deployments)
  • Expertise in monitoring and observability platforms (Datadog, Prometheus, Grafana, New Relic, or similar)
  • Strong knowledge of web application infrastructure, networking, load balancing, and security best practices
  • Excellent communication skills with the ability to lead incident response and facilitate blameless postmortems
Preferred
  • Experience with e-commerce platforms (Magento, Shopify, or comparable) and the unique reliability challenges they present at scale
  • Experience with Concourse CI, GitHub Actions (GHA), or similar deployment frameworks
  • Experience with chaos engineering tools and practices (Gremlin, Litmus, Chaos Monkey, or similar)
  • Familiarity with GitOps workflows (ArgoCD, Flux) and service mesh technologies (Istio, Linkerd)
  • Experience building and managing cost-optimization strategies for cloud infrastructure
  • Background in establishing SRE practices in organizations transitioning from traditional DevOps models
  • Experience with configuration management tools (Ansible, Chef, Puppet, or similar)
BELONG TO A BETTER COMPANY
  • Comprehensive health benefits (medical, dental, vision, life and disability)
  • Competitive salary (DOE) + equity
  • 401k plan
  • 9 Observed Holidays
  • Flexible Paid Time Off
  • Subsidized ClassPass Membership with access to fitness classes and wellness and beauty experiences
  • Ability to work in our beautiful office in Playa Vista
  • Free Thrive Market membership with exclusive employee discount
  • Coverage for Life Coaching & Therapy Sessions on our holistic mental health and well-being platform
We're a community of more than 1 million members who are united by a singular belief: It should be easy to find better products, support better brands, make better choices, and build a better world in the process.
 
At Thrive Market, we believe in building a diverse, inclusive, and authentic culture. If you are excited about this role along with our mission and values, we encourage you to apply.
 
Thrive Market is an EEO/Veterans/Disabled/LGBTQ employer
 
At Thrive Market, our goal is to be a diverse and inclusive workplace that is representative, at all job levels, of the members we serve and the communities we operate in. We’re proud to be an inclusive company and an Equal Opportunity Employer and we prohibit discrimination and harassment of any kind. We believe that diversity and inclusion among our teammates is critical to our success as a company, and we seek to recruit, develop and retain the most talented people from a diverse candidate pool. If you’re thinking about joining our team, we expect that you would agree!
 
Employment with Thrive Market requires that employees be based in the United States. This is a condition of employment and must be maintained throughout the duration of employment.
 
If you need assistance or accommodation due to a disability, please email us at [email protected] and we’ll be happy to assist you.
 
Ensure your Thrive Market job offer is legitimate and don't fall victim to fraud. Thrive Market never seeks payment from job applicants. Thrive Market recruiters will only reach out to applicants from an @thrivemarket.com email address. For added security, where possible, apply through our company website at www.thrivemarket.com.

© Thrive Market 2026 All rights reserved.

JOB INFORMATION
  • Compensation Description - The base salary range for this position is $180,000 - $225,000 per year.
  • Compensation may vary outside of this range depending on several factors, including a candidate’s qualifications, skills, competencies and experience, and geographic location.
  • Total Compensation includes Base Salary, Stock Options, Health & Wellness Benefits, Flexible PTO, and more!
  • This position requires traveling to our HQ office in Los Angeles, California, twice a year for all-company summits: once in the summer and once in the winter.

Top Skills

Ansible
AWS
Bash
Chef
CloudFormation
Datadog
Go
Grafana
Kubernetes
Prometheus
Puppet
Python
Ruby
Terraform
HQ

Thrive Market Los Angeles, California, USA Office

In a commitment to our Remote-First Workforce, we have downsized from our previous offices to a WeWork!
