Top Reliability Engineer Jobs in Los Angeles, CA
Financial Services
As a Senior Site Reliability Engineer, you will ensure system reliability and performance while collaborating on system design, coding, and incident response.
Top Skills:
AWS, Azure, Docker, GCP, Java, Kubernetes, Opentelemetry, Prometheus, Spring Boot
Logistics • Software • Transportation
The Senior Site Reliability Engineer will design, deploy, and maintain critical systems, ensuring operational excellence, reliability, and collaboration with engineering teams. Responsibilities include metrics collection, troubleshooting, and overseeing the platform's Service Level Agreements.
Top Skills:
Ansible, AWS, Bash, Docker, Kubernetes, Postgres, Python, Terraform
Other • Social Impact
The Staff SRE will design and maintain ML infrastructure, improve scalability and reliability, optimize system performance, and mentor team members.
Top Skills:
Ansible, Argo Cd, Docker, Elk Stack, Gpu, Grafana, Helm, Kubernetes, Machine Learning, Prometheus, Python, PyTorch, Scikit-Learn, TensorFlow, Terraform
Healthtech • Professional Services • Software
The Sr. Software Engineer - Site Reliability designs software features, collaborates with teams, solves technical issues, ensures code quality, and manages incidents to maintain system performance.
Top Skills:
AI, Ansible, AWS, Azure, Elk Stack, GCP, Grafana, Kubernetes, Machine Learning, New Relic, Terraform
Blockchain • Information Technology • Internet of Things
As a Site Reliability Engineer, you'll enhance the reliability of backend systems through DevOps best practices, automating processes, and ensuring high availability in distributed systems.
Top Skills:
AWS, Azure, Bash, Blockchain, GCP, Go, Kubernetes, Python, Rust, Terraform
Blockchain • Web3
As a Site Reliability Engineer at Syndica, you will maintain blockchain infrastructure, ensure reliability and performance, and utilize monitoring tools. You’ll work with teams to enhance system security and automate processes.
Top Skills:
Ansible, AWS, Azure, Chef, Datadog, Docker, Elk, GCP, Go, Grafana, Jmeter, K6, Kubernetes, Locust, New Relic, Prometheus, Python, Rust, Shell, Terraform, Typescript
Software
The Senior Site Reliability Engineer will enhance system reliability, automate deployments, and mentor teams while managing AWS infrastructure and incident responses.
Top Skills:
AWS, Buildkite, Cloudflare, Datadog, Ecs, Fargate, Git, Kafka, Mikro-Orm, MongoDB, Nestjs, Node.js, Postgres, React, React Native, Terraform, Typescript, Vue
Big Data • Healthtech • Information Technology • Analytics
The Full Stack Platform Engineer will lead engineering efforts in cloud-native architecture and maintain healthcare analytics platforms using various technologies, while focusing on DevOps and site reliability practices.
Top Skills:
.Net, Angular, Azure, Bicep, Docker, GCP, Java, Jenkins, Kubernetes, SQL Server, Terraform
Featured Jobs
Software
The Senior SRE will manage the Global SRE team, enhance platform reliability, oversee incident response, and ensure operational efficiency while collaborating across departments to improve service quality.
Top Skills:
AWS, Log Analysis Tools, Monitoring Tools, Site Reliability Engineering
Security • Cybersecurity
The Software Engineer II - Site Reliability will manage cloud environments, enhance infrastructure, troubleshoot production issues, and collaborate on deployment optimizations.
Top Skills:
AWS, Go, Kafka, Kubernetes, Python, Redis, Relational Databases, Terraform
Software
As a Site Reliability Engineer, you will build and maintain scalable infrastructure, automate processes, and collaborate with cross-functional teams while mentoring others and owning projects end-to-end.
Top Skills:
Circle Ci, CloudFormation, Elk Stack, Github Actions, Gitlab Ci, Grafana, Jenkins, Kubernetes, Prometheus, Pulumi, Terraform
Software • Biotech
The Senior Site Reliability Engineer will design and maintain scalable cloud infrastructure, enforce reliability metrics, optimize spending, and lead incident management efforts while enhancing developer workflows and CI/CD processes.
Top Skills:
Ci/Cd, Cloud Infrastructure, Observability
Fintech
Ensure the reliability of critical services, collaborating with engineering to set SLOs and implement best practices while driving incident management and mentoring team members.
Top Skills:
Appdynamics, Datadog, Go, Grafana, Prometheus, Python, Splunk, Typescript
Information Technology • Security • Cybersecurity
The Principal Site Reliability Engineer ensures the smooth operation of crucial infrastructure, manages system incidents, collaborates cross-functionally, and automates tasks while maintaining service reliability and performance.
Top Skills:
AWS, Gitops, Grafana, Helm, Kubernetes, Linux, Prometheus, Terraform
Software • Cybersecurity
As the Site Reliability Engineer at Veriff, you'll architect and maintain services, improve operational excellence, develop CI/CD pipelines, and implement SRE best practices for reliable identity verification solutions.
Top Skills:
AWS, Docker, Go, Grafana, Kubernetes, Linux, NoSQL, Python, RabbitMQ, Redis, SQL, Terraform
Information Technology • Software
The Site Reliability Engineer will develop test plans, analyze data, manage risks, ensure compliance, and support lifecycle management for DoD systems.
Top Skills:
Configuration As Code, Hybrid Cloud Infrastructure, Infrastructure As Code, Azure, Service Level Agreement Monitoring
AdTech • Digital Media • Information Technology • Other
As a Site Reliability Engineer II, you will enhance monitoring and incident response systems, manage Infrastructure as Code, and collaborate with development teams to optimize application reliability and performance at scale.
Top Skills:
AWS, Bash, Ci/Cd, GCP, Github Actions, Go, Grafana, Infrastructure As Code, Java, Jenkins, Kubernetes, Linux, Python, Splunk, Terraform
Greentech • Hardware • Real Estate • Software • Energy
Responsible for maintaining system stability and performance. Collaborate with engineering teams to automate workflows, improve reliability, and manage incident response. Conduct capacity planning and advocate for best engineering practices.
Top Skills:
Ansible, AWS, Bash, Cloud Health, Cloudwatch, Github Actions, Go, Grafana, Helm, Prometheus, Python, Terraform
Greentech • Financial Services
The Site Reliability Engineer will ensure application reliability and performance by designing automation, leading incident responses, and improving system stability in collaboration with development and operations teams.
Top Skills:
AWS, Datadog, Git, Kubernetes, MongoDB, SQL, Terraform
Information Technology • Internet of Things • Security • Software • Cybersecurity
Responsible for designing and managing reliable infrastructure, supporting development teams, and improving operational maturity through developer-focused tools and automation.
Top Skills:
Argocd, Crossplane, Github Actions, Go, Google Cloud Platform, Grafana, Helm, Kubernetes, Linux, Opentelemetry, Prometheus, Python, Scala, Terraform
Big Data • Cloud • Marketing Tech • Social Impact • Software
Support deployment of global products, provide engineering support, drive product issue resolution, and enhance reliability monitoring while collaborating with distributed teams.
Top Skills:
AWS, CircleCI, GCP, Go, Jenkins, Kubernetes, Python, Terraform
Big Data • Cloud • Marketing Tech • Social Impact • Software
The Senior SRE will manage global product deployments, provide engineering support, enhance CI/CD and monitoring, and maintain operational documentation.
Top Skills:
AWS, CircleCI, GCP, Go, Jenkins, Kubernetes, Python, Terraform
Artificial Intelligence • Software • Generative AI
Lead the design and management of cloud infrastructure ensuring reliability and performance. Mentor junior engineers and automate cloud operations.
Top Skills:
AWS, Azure, Docker, Elk Stack, GCP, Go, Grafana, Java, Kubernetes, Prometheus, Python, Terraform
Artificial Intelligence • Software • Generative AI
This role involves designing, implementing, and maintaining cloud infrastructure, ensuring system reliability, mentoring junior engineers, and optimizing system performance and security.
Top Skills:
AWS, Azure, Docker, Elk Stack, GCP, Go, Grafana, Java, Kubernetes, Prometheus, Python, Terraform
Artificial Intelligence • Healthtech • Machine Learning • Software • Biotech
Responsible for designing, building, and operating hybrid cloud and on-prem infrastructure, implementing SRE best practices, and automation.
Top Skills:
Ansible, AWS, CloudFormation, Datadog, Eks, Go, Grafana, Kubeadm, Kvm, Prometheus, Python, Terraform