Top Reliability Engineer Jobs in Los Angeles, CA
Other • Social Impact
The Senior Site Reliability Engineer will design and maintain infrastructure, ensure system reliability, participate in on-call rotations, and mentor peers in a collaborative remote environment.
Top Skills:
Ansible, Docker, Gerrit, GitLab, Kubernetes, MediaWiki, Puppet, Python, Spicerack, Terraform
Blockchain • Fintech • Internet of Things • Cryptocurrency • Web3
As a Senior Site Reliability Engineer, you'll design and operate cloud infrastructure, manage Kubernetes environments, implement Infrastructure as Code, and automate processes to ensure reliability and performance.
Top Skills:
Amazon RDS, Aurora, AWS, Bash, CDK, Crossplane, Go, Kubernetes, Python, Terraform
Information Technology
As a Database Reliability Engineer, you will maintain and improve PostgreSQL infrastructure, resolve production incidents, collaborate with developers, and implement infrastructure as code.
Top Skills:
CI/CD, GitHub Actions, Grafana, MySQL, Postgres, Prometheus, Salt, SQL, Terraform
Security • Cybersecurity
Lead initiatives for reliability and operational excellence, mentor engineers, and define goals to improve system reliability and productivity.
Top Skills:
AWS, Azure, Datadog, GCP, Go, Grafana, Kubernetes, Prometheus, Python, Terraform
Blockchain • Software
As a Site Reliability Engineer at Offchain Labs, you will manage infrastructure in cloud environments, design CI/CD workflows, and enhance system reliability with a focus on blockchain technology.
Top Skills:
Argo CD, AWS, Azure, CodeBuild, GCP, GitHub Actions, Go, Grafana, Kubernetes, Loki, Prometheus, Python, Terraform
Cloud • Greentech • Other • Energy
As a Senior Site Reliability Engineer, you'll optimize virtualization and kernel-level performance for AI workloads, develop automation tools, and support compute infrastructure, ensuring scalability and reliability.
Top Skills:
C, CI/CD, Go, Infrastructure as Code, KVM, Linux, QEMU, Rust
Other • Social Impact
Design, develop, and maintain machine learning infrastructure while enhancing reliability and scalability, mentoring team members and collaborating across teams.
Top Skills:
Ansible, Argo CD, Docker, ELK Stack, GPU Acceleration, Grafana, Helm, Kubernetes, Prometheus, Python, PyTorch, scikit-learn, TensorFlow, Terraform
Generative AI
The Senior Site Reliability Engineer will enhance system reliability, manage cloud infrastructure, and enforce SRE best practices while mentoring junior engineers.
Top Skills:
AWS, ELK Stack, Grafana, Kubernetes, Terraform
Software
The Site Reliability Engineer will enhance system reliability, improve tooling, oversee incident processes, and collaborate on software maintenance across distributed systems.
Top Skills:
ClickHouse, gRPC, Kafka, MongoDB, NoSQL, Postgres, Redpanda
Automotive • Software
The Senior Site Reliability Engineer will optimize platform reliability, manage Kubernetes production clusters, deploy monitoring solutions, collaborate on resource optimization, and participate in on-call rotations.
Top Skills:
Android, Argo CD, AWS, CircleCI, Docker, GCP, Git, Go, Grafana, Kafka, Kubernetes, Loki, New Relic, Objective-C, OpenTelemetry, Postgres, Prometheus, Python, React/Redux, Redis, Redshift, Ruby on Rails, Sentry, Swift, Terraform, Thanos
Information Technology • Software
As a Senior Site Reliability Engineer, you'll ensure the reliability, performance, and scalability of Ditto's cloud infrastructure, lead incident management, and improve system resilience.
Top Skills:
AWS, Azure, C, Datadog, GCP, Go, Grafana, Helm, Java, Kubernetes, Prometheus, Rust, Terraform
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
Design, build, and maintain large-scale production systems for machine learning applications, ensuring reliability, scalability, and efficiency.
Top Skills:
CI/CD, ELK, GitHub Actions, Go, Jenkins, Kafka, Kubernetes, OpenStack, Perl, Prometheus, Python, Ruby, Spark
Artificial Intelligence • Information Technology • Consulting
As a Senior Site Reliability Engineer, you will enhance the reliability and performance of our inference platform, leveraging Kubernetes and Terraform while ensuring smooth scalability of systems under load.
Top Skills:
Bash, Grafana, Kubernetes, MLOps, Prometheus, Python, Ray, Terraform, Triton, vLLM
Cloud • Greentech • Other • Energy
As a Staff Site Reliability Engineer focused on storage, you'll ensure the reliability and performance of cloud storage systems while optimizing distributed, fault-tolerant architectures for AI workloads.
Top Skills:
Ansible, C, Ceph, Docker, GlusterFS, Go, iSCSI, Java, Kubernetes, NFS, NVMe-oF, OpenEBS, Puppet, Python, SMB, Terraform
Blockchain • Information Technology • Internet of Things
The Site Reliability Engineer will ensure system reliability, security, and performance by implementing infrastructure as code, CI/CD, and monitoring solutions.
Top Skills:
AWS, Azure, Bash, GCP, Go, Kubernetes, Python, Rust, Terraform
News + Entertainment
As a Site Reliability Engineer at Netflix, you'll enhance gaming platform reliability, manage incidents, build detection tools, and improve operational excellence.
Top Skills:
AWS, GCP, Go, Java, JavaScript, Linux, Node.js, Presto, Python, Spark SQL, Trino, Unix
Software • Cryptocurrency
Manage and scale Kubernetes clusters, automate infrastructure, optimize performance, maintain blockchain nodes, and improve system reliability while collaborating with product teams.
Top Skills:
AWS (EC2, EKS, IAM, RDS, S3), Datadog, Docker, Kubernetes, OpenTelemetry, Pulumi, Terraform
Cloud • Security • Software
As a Site Reliability Engineer II, you will architect, deploy, and maintain resilient infrastructure on AWS, develop deployment pipelines, and manage performance issues across distributed systems.
Top Skills:
AWS, CloudWatch, Docker, Git, Grafana, Jenkins, Kubernetes, New Relic, Puppet, Terraform
Big Data • Cloud • Marketing Tech • Social Impact • Software
The Site Reliability Engineer will support global product deployments, provide 24/7 engineering support, enhance CI/CD tooling, and ensure security compliance.
Top Skills:
AWS, CircleCI, GCP, Go, Jenkins, Kubernetes, Python, Terraform
Artificial Intelligence • Marketing Tech • Mobile • Software
Design and implement solutions for platform reliability and scalability, lead cross-team projects, and mentor team members while ensuring operational excellence.
Top Skills:
Airflow, AWS, Cloudflare, Datadog, DynamoDB, esbuild, Gradle, GraphQL, Helm, Hugging Face, Istio, Java, Kinesis, Kubernetes, Metaflow, Pandas, PlanetScale, Playwright, Postgres, Python, PyTorch, Radix UI, React, Redis, Spring Boot, Storybook, TensorFlow, Terraform, TypeScript, Vite
Aerospace • Other
The Lead Software Engineer will manage software development processes, enhance application performance, mentor engineers, and ensure reliable software solutions for SpaceX's build operations.
Top Skills:
.NET, Angular, C#, Go, Java, Postgres, Python, React, SQL Server
Other • Social Impact
Design, develop, maintain, and scale machine learning infrastructure. Collaborate with teams to improve the reliability and performance of ML systems while supporting engineers and researchers.
Top Skills:
Ansible, Argo CD, Docker, ELK Stack, GPU Acceleration, Grafana, Helm, Kubernetes, Machine Learning, Prometheus, Python, PyTorch, scikit-learn, TensorFlow, Terraform
Healthtech
This role ensures the reliability and performance of cloud-native platforms, focusing on system design, incident response, and collaboration with various teams.
Top Skills:
AWS, Bash, CI/CD, CloudWatch, Datadog, Go, Helm, Kubernetes, Prometheus, Python, Terraform
Cloud • Security • Software • Generative AI
The Site Reliability Engineer I will automate engineering efforts, improve platform reliability, and ensure customer satisfaction while managing cloud infrastructure and responding to incidents.
Top Skills:
Docker, Elastic Stack, Go, Graphite, Influx, Kubernetes, Linux, Prometheus, Terraform
Real Estate • Travel • PropTech
The Senior Staff Software Engineer will drive the development of a reliability strategy, enhance infrastructure performance, and mentor SRE teams.
Top Skills:
Cloud Platforms, High-Availability Systems, Incident Management Processes, Software Engineering Practices