Great Question

Site Reliability Engineer

Posted 16 Days Ago
Remote
Hiring Remotely in United States
Senior level
As a Site Reliability Engineer, you will own platform health, improve observability, manage cloud costs, and enhance developer experience in a dynamic startup environment.
The summary above was generated by AI
🚀 About Us

We’re a product-focused startup with a tight-knit team of 14 engineers building tools that help teams make better decisions through great research. We're pragmatic, fast-moving, and obsessed with product quality.

As we grow, our infrastructure needs to grow with us. That means better observability, stronger systems, faster deploys—and smarter decisions about cloud spend. We’re hiring someone who can take ownership of this and lay the foundation for long-term platform health.

🎯 What You’ll Do

You’ll be the first dedicated DevOps/Infra hire with end-to-end ownership of platform health, reliability, and scalability. You’ll partner directly with our engineering team to improve our systems, reduce toil, and make infra a product in its own right.

Your scope will include:

  • Observability, Reliability, Availability

    • Define and maintain service SLOs, dashboards, and alerts

    • Improve incident detection and response

    • Lead incident postmortems, share learnings, and manage follow-up actions

  • Infrastructure

    • Maintain and improve Terraform-managed infrastructure

    • Lead our migration of staging infrastructure to AWS

    • Optimize our use of tools like Datadog, Sentry, and others

  • Capacity Planning & Performance Optimization

    • Identify current and potential future bottlenecks

    • Collaborate with engineers to fine-tune application and infrastructure performance

    • Implement automated and semi-automated scaling strategies to handle growth and evolving workloads

  • Developer Experience & CI/CD

    • Increase pipeline reliability and performance

    • Design & implement load testing strategies as we scale

  • Security & Compliance

    • Work with the CTO to own and implement SOC 2 compliance protocols and requirements

    • Help foster a security-first culture by promoting best practices and secure-by-default tooling

    • Implement guardrails and additional security tools as needed

  • Cloud Cost Management

    • Monitor and optimize cloud spend

    • Build visibility and tooling to help teams make cost-aware decisions

💡 You Might Be a Great Fit If You...
  • Have 4–8+ years of experience in DevOps, SRE, or Infrastructure roles

  • Have hands-on AWS experience (EC2, RDS, VPCs, etc.)

  • Are confident with Terraform, GitHub Actions, Docker, and PostgreSQL

  • Have a track record of improving observability and reducing incident response times

  • Have worked in high-autonomy, high-ownership environments

  • Are cost-conscious and can identify waste in infra and cloud spend

  • Love building leverage tools for engineers—infra as a product

📈 Growth Path

This is a foundational hire. Today, the role is fully IC, but there’s clear runway to grow into:

  • Platform leadership (tech lead or manager)

  • Head of Infra/SRE if we expand the team

  • Principal engineer focused on scale, reliability, and platform strategy

You’ll have support and visibility from leadership, and the freedom to chart your path as the company grows.

⚙️ Our Stack
  • Cloud: AWS

  • Infra-as-code: Terraform

  • CI/CD: GitHub Actions

  • Containers: Docker, lightweight Kubernetes

  • Monitoring: Datadog, Sentry

  • Database: PostgreSQL, Redis

  • App: Rails, React, Sidekiq

✨ Why This Role?
  • Impact: You’ll shape the systems and culture of how we build and run software.

  • Trust: High autonomy and low process—make smart decisions, move fast.

  • People: No egos, just a team that values thoughtfulness, speed, and care.

  • Growth: Opportunity to grow with the company in whichever direction excites you.


Top Skills

AWS
Docker
GitHub Actions
Postgres
Ruby on Rails
React
Redis
Sidekiq
Terraform
