BrightHire

Senior Site Reliability Engineer

Posted 15 Days Ago
Remote
Hiring Remotely in USA
Senior level
The Senior Site Reliability Engineer will ensure the reliability and performance of critical systems by improving observability, database performance, Kubernetes management, and CI/CD pipelines, while enhancing developer experience and infrastructure.
The summary above was generated by AI

BrightHire is a category-creating, high-growth, Series B software company with a mission to give everyone the hiring experience they deserve.

We deliver on this mission by transforming the way many of the world’s leading companies build exceptional teams. We created the Interview Intelligence category, and our clients range from some of the world’s most innovative companies, including Canva, OpenAI, Ramp, and HubSpot, to the Fortune 500.

Location

Remote - USA

About the Role

You will own the end-to-end reliability and performance of many of our most critical systems. Working in lockstep with Product and Engineering, you will design, build, and refine the platform that our application and AI features run on, from Kubernetes and databases through CI/CD and observability. You will focus on keeping our systems fast, reliable, and easy for developers to work with. You will work on real infrastructure that supports features people use every day, including:

  • Continuing to improve and iterate on our observability stack, which includes Kibana, Grafana, OpenTelemetry (OTel), and Elastic.
  • Improving database performance by analyzing slow and high-volume queries, tuning indexes, optimizing query patterns and timing, and recommending schema and code changes to keep QPS and latency low.
  • Improving and upgrading Kubernetes, including deploying new services, improving resource utilization, tightening security, and standardizing deployment patterns across teams.
  • Improving CI/CD pipelines for both backend and frontend services so engineers can ship quickly and safely, with clear feedback loops, fast build times, and reliable rollbacks.
  • Enhancing the local developer experience so that running and debugging the app locally feels fast, consistent, and representative of production.
  • Helping improve CI/CD and observability for our ML pipeline and models, bringing MLOps best practices into our existing infrastructure.

What You’ll Bring

  • You have real-world experience running production systems and doing SRE, Platform, or DevOps work for web applications or APIs.
  • You are comfortable working across Kubernetes, CI/CD, databases, and backend services, and you enjoy owning problems end to end.
  • You have strong experience with Kubernetes in production environments, including cluster upgrades, workload deployments, scaling, and debugging.
  • You have experience with observability stacks (such as Elasticsearch and Kibana, Prometheus, Grafana, or similar) and can lead efforts like upgrading Kibana to new major versions and improving logs, metrics, and dashboards.
  • You have worked deeply with relational databases and SQL, know how to profile slow queries, design and tune indexes, and work with engineers to adjust query patterns, timing, and frequency to improve performance.
  • You are comfortable in at least one backend language (e.g., Python) and can read and modify application code to support infra and performance improvements.
  • You have experience improving CI/CD pipelines, including build and test speed, deployment workflows, and release strategies (such as blue/green or canary).
  • You have worked with infrastructure-as-code tools or similar patterns to manage environments in a repeatable way.
  • You think deeply about developer experience and reliability and use both metrics and empathy to guide your decisions.
  • You care about security, resiliency, and cost as integral aspects of the systems you build and manage.
  • You move fast and independently, but you know when to pull in teammates for pairing, reviews, or cross-team alignment.

About Our Team

  • You’ll have the opportunity to work on high-impact projects in small, autonomous squads, with the flexibility to lead initiatives or focus as an individual contributor depending on your goals and interests.
  • Our developer experience is thoughtfully designed, with fast CI (< 10 minutes), 1-click deploys, strong observability, and a clean codebase that enables you to move quickly and confidently.
  • Our culture supports sustainable, focused work with fully remote roles, regular working hours, no-meeting Wednesdays, and flexible time off to recharge when needed.
  • Our team is composed of smart, collaborative, and genuinely kind people, creating an environment where you can learn, grow, and do your best work.

Equal Employment Opportunity (EEO) Statement

Our company does not discriminate in employment on the basis of race, color, religion, sex (including pregnancy and gender identity), national origin, political affiliation, sexual orientation, marital status, disability, genetic information, age, membership in an employee organization, retaliation, parental status, military service, or other non-merit factor.

*Note to Recruiters and Placement Agencies: We do not accept unsolicited agency resumes. Please do not forward unsolicited agency resumes to our website. We will not pay fees to any third-party agency or firm and will not be responsible for any agency fees associated with unsolicited resumes. Unsolicited resumes received will be considered our property.

Top Skills

CI/CD
Elasticsearch
Grafana
Kibana
Kubernetes
Prometheus
Python
SQL
