Top Remote Site Reliability Engineer Jobs in Los Angeles, CA

Reposted 20 Hours AgoSaved
Remote
USA
150K-220K Annually
Senior level
150K-220K Annually
Senior level
Artificial Intelligence • Machine Learning • Natural Language Processing • Software • Conversational AI
The engineer will build and operate AI/ML infrastructure, managing services on AWS and bare metal, using tools like Kubernetes and Terraform.
Top Skills: AWSBashGoKubernetesPythonSlurmTerraform
Reposted 20 Hours AgoSaved
Easy Apply
Remote or Hybrid
United States
Easy Apply
127K-249K Annually
Senior level
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
As a Senior Site Reliability Engineer, you'll design and build complex systems, support Atlas platform operations, automate processes, and ensure high availability of services.
Top Skills: AWSAzureDnsGCPGoHTTPLinuxPythonRubyTls
Reposted 2 Days AgoSaved
Easy Apply
Remote or Hybrid
US
Easy Apply
200K-230K Annually
Senior level
200K-230K Annually
Senior level
Artificial Intelligence • Machine Learning
Lead development of AI-assisted reliability tooling, own incident response end-to-end, improve observability and SLO/SLI frameworks, scale single-tenant SaaS operations, mentor engineers, and reduce recurring operational toil through engineering and automation.
Top Skills: Cloud PlatformsGoKubernetesLinuxLlm/Ai ToolingLogs And TracingObservability ToolingPythonSlo/Sli Frameworks
Reposted 2 Days AgoSaved
Remote
United States
223K-302K Annually
Expert/Leader
223K-302K Annually
Expert/Leader
Artificial Intelligence • Cloud • Consumer Web • Productivity • Software • App development • Data Privacy
The role involves defining reliability strategies, leading initiatives across teams, enhancing monitoring and incident response, and mentoring engineers at Dropbox.
Top Skills: Ai TechnologiesDebuggingDistributed SystemsIncident ResponseObservabilityReliability Risk ManagementSlasSlos
3 Days AgoSaved
Remote or Hybrid
United States
200K-250K Annually
Senior level
200K-250K Annually
Senior level
Digital Media • Gaming • Information Technology • Software • Sports • Esports • Big Data Analytics
Lead long-term strategy and architecture for cloud and on‑prem platform infrastructure, driving Kubernetes and multi‑cloud reliability, IaC/GitOps automation, observability, SLO/SLI/error‑budget practices, incident leadership, AI‑augmented tooling adoption, and mentorship of senior engineers to improve platform resilience and developer experience.
Top Skills: Amazon Elastic Kubernetes Service (Eks)AutoscalingAWSCapacity PlanningCi/CdGitopsGoGoogle Cloud PlatformGoogle Kubernetes Engine (Gke)Identity And Access ManagementInfrastructure As CodeKubernetesLinuxNetworkingObservabilityOperatorsPulumiPythonRke2StorageTerraform
Reposted 9 Days AgoSaved
Easy Apply
Remote or Hybrid
United States
Easy Apply
126K-248K Annually
Senior level
126K-248K Annually
Senior level
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will develop and support distributed storage services, ensuring reliability and operational safety, with a focus on automation and efficiency.
Top Skills: AWSAzureDnsGoGoogle Cloud PlatformKubernetesLinuxPythonTcp/IpTls
Reposted 9 Days AgoSaved
Easy Apply
Remote or Hybrid
United States
Easy Apply
127K-249K Annually
Expert/Leader
127K-249K Annually
Expert/Leader
Big Data • Cloud • Software • Database
Seeking a Site Reliability Engineer with expertise in networking and distributed systems for building secure multi-cloud infrastructure. Responsibilities include maintaining network architecture and ensuring reliable service-to-service communication, involving a 24/7 on-call rotation.
Top Skills: AWSAzureBgpDnsGCPIpv6KubernetesLoad BalancingMtlsService MeshTcp/IpTlsVpcsVpns
Reposted 19 Days AgoSaved
Easy Apply
Remote or Hybrid
USA
Easy Apply
Internship
Internship
Cloud • Information Technology • Security • Software • Cybersecurity
This internship role focuses on SRE skills, requiring collaboration and problem-solving in dynamic environments for Zscaler's Zero Trust Exchange team.
Top Skills: AnsibleAws EcsKubernetesLinuxPythonTerraform
Reposted 19 Days AgoSaved
Easy Apply
Remote
US
Easy Apply
Mid level
Mid level
Cloud • Security • Software • Cybersecurity • Automation
As a Cloud Cost Utilization SRE at GitLab, you'll manage cloud spending, improve tracking and optimization of cloud usage, and collaborate with finance and engineering teams to enhance cost efficiency across AWS and GCP.
Top Skills: AnsibleAWSElkGCPGrafanaLokiMimirPrometheusTempoTerraform
Reposted 2 Days AgoSaved
In-Office or Remote
3 Locations
140K-205K Annually
Senior level
140K-205K Annually
Senior level
Information Technology • Legal Tech
The Senior Technology Site Reliability Engineer is responsible for maintaining and optimizing infrastructure and applications, ensuring reliability and performance while automating processes and collaborating with teams.
Top Skills: AWSChefDatadogGoGrafanaJavaPrometheusPuppetPythonSaltTerraform
Reposted 22 Days AgoSaved
Easy Apply
Remote or Hybrid
United States
Easy Apply
127K-249K Annually
Senior level
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will lead security design and implementation for cloud infrastructures, mentor teams, and automate security solutions.
Top Skills: AnsibleAWSAzureCloud Security ToolsCloudFormationGCPGoTerraform
24 Days AgoSaved
Easy Apply
Remote
USA
Easy Apply
218K-257K Annually
Senior level
218K-257K Annually
Senior level
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
Own reliability, monitoring, and incident response for AI infrastructure; build automation and CI/CD tooling; manage Kubernetes/Docker production workloads; partner with infrastructure, security, and compliance; improve observability and documentation; develop internal full‑stack tooling in Go or Python.
Top Skills: AnsibleAWSBashChefCi/CdDockerEc2GitGoKubernetesLinuxLog AggregationNetwork SecurityPuppetPythonRubySaltTerraform
New

Track Smarter, Apply Better.

Ditch the spreadsheets. Organize your job search with our freeApplication Tracker.

Use For Free
Application Tracker Preview
Reposted 2 Days AgoSaved
Remote or Hybrid
United States
190K-235K Annually
Senior level
190K-235K Annually
Senior level
HR Tech • Information Technology • Professional Services • Sales • Software
Own and operate production-grade Kubernetes infrastructure on AWS, build GitOps CI/CD with GitHub Actions and ArgoCD, develop AI agents and internal DevOps tooling, maintain Datadog-based observability, and manage on-call incident response while collaborating with engineering teams to improve reliability and delivery speed.
Top Skills: Ai/LlmArgocdAWSCi/CdDatadogGithub ActionsGitopsGoKubernetesPython
Reposted 4 Days AgoSaved
Easy Apply
Remote or Hybrid
United States
Easy Apply
127K-249K Annually
Senior level
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
Develop and maintain Kubernetes runtime environments, support developers, resolve critical issues, and participate in on-call rotations for production systems.
Top Skills: AWSAzureCert-ManagerCorednsCrdsCriCsiGatekeeperGCPGoHelmKubernetesKustomizeOperatorsPythonTerraform
Reposted 20 Hours AgoSaved
Remote
United States
115K-135K Annually
Mid level
115K-135K Annually
Mid level
Aerospace • Manufacturing
As a Site Reliability Engineer, you'll build and manage observability platforms for satellite communications, define SLOs/SLIs, and collaborate on incident response and deployment automation.
Top Skills: ArgocdAWSElkGCPGoGrafanaIstioJaegerKubernetesLinkerdLokiOpentelemetryPrometheusPythonTempoTerraform
Reposted 20 Hours AgoSaved
Remote
USA
Senior level
Senior level
Artificial Intelligence • Fintech • Software • Financial Services
The SRE will own reliability for a cloud-native platform, optimizing performance, availability, and observability, while mentoring engineering teams.
Top Skills: AWSClickhouseGoKafkaKubernetesPulumiPythonTerraform
Reposted YesterdaySaved
Remote
United States
Mid level
Mid level
Blockchain • Software
Build, operate, and scale production Kubernetes infrastructure using GitOps and declarative IaC. Design CI/CD workflows, observability, and secure-by-default systems. Troubleshoot networking/storage, participate in on-call rotations, automate operational workflows, and drive postmortems and reliability improvements.
Top Skills: ArbitrumArgocdArgocd ApplicationsetsAWSAzureBashCloudwatchCodebuildGCPGithub ActionsGitopsGoGrafanaK9SKubernetesLinuxLokiMimirPrometheusPrysmPythonTerraformYamlZerodev
Reposted YesterdaySaved
Remote
United States
Senior level
Senior level
Automotive
Design and implement scalable cloud infrastructure, monitor performance, automate processes, ensure security and compliance, and lead a DevOps team.
Top Skills: AWSBashCi/CdDockerElk StackGCPGrafanaKubernetesPrometheusPythonTerraform
2 Days AgoSaved
Remote
USA
Senior level
Senior level
Software • Web3
Lead reliability practices across teams: embed early in projects, define SLIs/SLOs, build multi-cloud paved roads with Terraform, run on-call, drive org-wide incident maturity and tooling.
Top Skills: AWSAzureGCPRuby On RailsTerraformTypescriptWebcontainers
2 Days AgoSaved
Remote
2 Locations
124K-171K Annually
Senior level
124K-171K Annually
Senior level
Healthtech • Pharmaceutical • Manufacturing
Support and maintain production Core Speech systems: deploy, monitor, alert, perform capacity planning, respond to on-call incidents, and drive system performance and architecture improvements.
Top Skills: AnsibleAws CloudfrontAws DocumentdbAws Ec2Aws EfsAws EksAws RdsAws S3ContainerdDockerElasticsearchFilebeatGitGitGitlabGoGocdGrafanaJavaJythonKibanaKubernetesLogstashMongoDBPostgresPythonRedisShellSolrTerraform
Reposted 2 Days AgoSaved
In-Office or Remote
United States
200K-200K Annually
Mid level
200K-200K Annually
Mid level
Cloud • Software
The Site Reliability Engineer will ensure reliable cloud operations by applying Python for infrastructure automation, managing OpenStack and Kubernetes, and practicing devsecops in a fast-paced environment.
Top Skills: KubernetesLinuxOpenstackPython
Reposted 2 Days AgoSaved
In-Office or Remote
United States
95K-171K Annually
Junior
95K-171K Annually
Junior
Cloud • Security • Software • Cybersecurity
As a Site Reliability Engineer II, you'll automate tasks, monitor AI workloads, enhance dashboards, support CI/CD processes, and collaborate with engineering teams on complex issues while participating in on-call rotations.
Top Skills: GoGrafanaKubernetesLinuxPrometheusPythonSaltstackTerraform
Reposted 2 Days AgoSaved
Remote
USA
Mid level
Mid level
Software • Analytics
The role involves automating and managing AWS infrastructure, ensuring reliability and scalability of stateful systems, and optimizing deployment processes. You'll also handle incident responses and improve operational tooling.
Top Skills: AWSKubernetesTerraformTerragrunt
Reposted 2 Days AgoSaved
Remote
United States
220K-250K Annually
Expert/Leader
220K-250K Annually
Expert/Leader
Cloud • Software • Database
Lead design, build, and operate the YugabyteDB DBaaS infrastructure. Drive architecture, automate lifecycle and maintenance, manage incidents and on-call rotations, implement security/encryption processes, and optimize reliability using SRE principles and observability.
Top Skills: AksAnsibleAWSAzureBashDockerEksGCPGitGithub ActionsGkeJavaKubernetesLinuxPostgresPrometheusPythonShellTerraform
Reposted 2 Days AgoSaved
Remote
United States
133K-211K Annually
Mid level
133K-211K Annually
Mid level
Cloud • Security • Software • Generative AI
Design, build, and automate large-scale multi-cloud infrastructure and internal SRE tools. Improve host lifecycle, observability, alerting, and reliability; operate containerized workloads; participate in on-call rotations, incident response, runbooks, postmortems, code reviews, and mentoring.
Top Skills: AnsibleArgo CdArgo WorkflowsCueDockerElastic StackGoGraphiteInfluxKubernetesLinuxPrometheusPuppetTerraformUbuntuUbuntu Live Patch
All Filters
JobType
New Jobs
Job Category
Experience
Industry
Company Name
Company Size

Sign up now Access later

Create Free Account