Top Reliability Engineer Jobs in Los Angeles
The Reliability Engineer at Anduril Industries will work within the organization to support product development, define and execute processes for continuous improvement and risk mitigation, and lead root cause and corrective action efforts throughout the new product development life cycle. They will also support field performance monitoring, track trends and lessons learned, and recommend improvements to upstream teams.
The Software Reliability Engineer at Anduril Industries will be responsible for integrating electronic warfare system functionality into the software architecture, developing software interfaces, and ensuring timely delivery of mission-critical code. Responsibilities include writing code, collaborating across teams, analyzing metrics, triaging issues, and partnering with end-users. Travel up to 30% of the time is required for real-world deployments.
Responsible for ensuring the reliability of rockets through analysis, architectural decision making, and implementing technical and programmatic solutions. Conduct root cause investigations and develop processes to assess manufacturing issues and manage technical risk.
You will work across all engineering disciplines to tackle challenging issues faced by Relativity in design, build, test, and launch operations. Responsibilities include performing independent assessments of subsystem and system designs, driving architectural decision-making, leading root cause investigations, and developing processes to assess manufacturing issues.
As a Senior Site Reliability Engineer at Luxury Presence, you will lead strategic initiatives to streamline platform operations, enhance Kubernetes clusters, automate application deployments, improve monitoring, and ensure system performance. Your role will focus on designing and maintaining scalable infrastructure while fostering collaboration and knowledge sharing within the Engineering team.
As a Senior Site Reliability Engineer at Luxury Presence, you will lead strategic initiatives to streamline platform operations, optimize performance, and improve system health. Responsibilities include designing scalable infrastructure, managing Kubernetes clusters, automating deployments, enhancing monitoring and observability, resolving incidents, and ensuring security best practices.
As a Senior Site Reliability Engineer at Luxury Presence, you will lead strategic initiatives to streamline platform operations, design scalable infrastructure, optimize performance, automate deployments, enhance monitoring, and ensure security best practices.
Looking for a Sr. Engineer to drive ongoing improvements to automation and service offerings, maintaining Artifact Management implementations in a hybrid environment at a global scale. Responsibilities include designing and maintaining an artifact management system, developing best practices, architecting cloud-agnostic solutions, building automation, supporting CI/CD build tools, monitoring system health, and improving service reliability. Requires skills in deploying CI/CD tools, artifact repository services, IaC provisioning tools, and source code management services. Must have on-premises and cloud expertise, experience with Kubernetes, and a security mindset.
Featured Jobs
As a Federal Site Reliability Engineer (SRE) for ServiceNow, you will provide 24x7 support for the Government Cloud infrastructure during the 3rd Shift. Responsibilities include driving technical resolutions across the technology stack, promoting operability to reduce incidents, and improving services for customers. Strong expertise in DevOps, Linux systems, software development, Observability, Monitoring, and Cloud technologies like Azure and AWS is required.
Seeking a Sr Site Reliability Engineer with 10+ years of IT experience to address gaps, improve engineering services, and automate manual processes for increased reliability and reduced costs. Responsibilities include monitoring production environments, tracking defects, implementing new technologies, and providing tier-three support. Bachelor's degree or equivalent work experience required.
ServiceNow is seeking a Site Reliability Engineer for their Federal SRE Team in a 2nd Shift position. Responsibilities include maintaining reliability, scalability, and performance of infrastructure, driving technical resolutions, and enhancing operability. Requires expertise in DevOps, Automation, Scripting, Linux systems, software development, Observability, Monitoring, and Cloud technologies (Azure, AWS).
Install, connect, and maintain Anduril's software to deliver mission-critical capabilities. Serve as a bridge between product engineering and deployed operations. Practice extreme ownership of dependencies to ensure total mission success. Work on sophisticated configuration, reliability engineering, deployment at scale, incident triage, software tooling, and more. Support end-to-end customer success and participate in customer demonstrations and early deployments.
Support the production stability of customer application(s) and infrastructure services with a focus on database technologies, performance tuning, and troubleshooting. Must have a passion for complex system issues under real-world loads and be able to work in a fast-paced environment. Responsibilities include becoming an SME in production triage, learning the platform end-to-end, identifying bugs, solving performance issues, and developing automation tools.
The Site Reliability Engineer at ServiceNow is responsible for maintaining and developing the reliability, scalability, and performance of the infrastructure. This role involves driving technical resolutions across the technology stack and improving operability to enhance customer experience.
Take ownership of complex issues related to performance, reliability, and scalability, collaborate with engineering teams, provide technical leadership on reliability, improve monitoring and metrics, and guide a technical roadmap for reliability.
The Site Reliability Engineer will be responsible for managing infrastructure related to the Direct-to-Consumer platforms, focusing on automation, databases, testing, observability, and resiliency. Responsibilities include ensuring performance, availability, and resilience of databases, deploying and testing database changes safely, troubleshooting issues, designing fault-tolerant systems, and providing database support to engineering teams.
The Site Reliability Engineer at ServiceNow is responsible for providing 24x7 production support for the Government Community Cloud infrastructure, focusing on driving technical resolutions and platform operability. The role involves combining expertise in software development, networking, and systems engineering to enhance services for customers.
The Staff Site Reliability Engineer will be responsible for leading the design and implementation of major software components, systems, and features to improve the availability, scalability, latency, and efficiency of Matillion's SaaS services. They will also drive the expansion of observability infrastructure and provide guidance and mentorship to other team members.
Site Reliability Engineer responsible for building scalable infrastructure, designing systems for software deployment, automating server provisioning, and ensuring reliability of Dropbox services. Requires BS in Computer Science, 2+ years of industry experience, and strong analytical and communication skills.
Maintain uptime of LogicMonitor's SaaS-based service, design and deploy new infrastructure, automate infrastructure maintenance and deployments, support development, lead large and technically complex projects, and drive operational and architecture/design changes.
Work closely with cross-functional teams to ensure operational data is managed effectively, participate in on-call rotation for 24/7/365 availability, build tools for system resiliency, triage site availability incidents, establish design patterns for monitoring and deploying new features, and improve infrastructure reliability and performance.
The Site Reliability Engineer at Convoso is responsible for managing and monitoring systems and infrastructure, automating deployments, and collaborating with professionals to ensure high-quality deliverables. This role requires experience with automation software, scripting, configuring Linux systems, and troubleshooting UNIX/Linux environments.