Responsible for improving Zoox's HPC infrastructure, optimizing cloud/storage systems for ML workloads, designing APIs, and investigating distributed paradigms.
Zoox is looking for a software engineer to work on our custom High-Performance Computing infrastructure and its supporting ecosystem of tools and services. This infrastructure is central to machine learning workflows across all Zoox software divisions, from data engineering to computer vision perception to simulation and more. You will take on a breadth of end-to-end responsibilities including distributed system design, algorithmic job scheduling, and adaptive cloud scaling in support of all of Zoox’s computational needs.
In this role, you will:
- Design and implement improvements to Zoox’s in-house, cutting-edge HPC infrastructure
- Design systems that optimize storage technologies, both in the cloud and in our own datacenter(s), for the performance, reliability, and efficiency that our diverse machine learning workloads require
- Investigate new distributed system paradigms and technologies to meet Zoox’s ever-growing computational and storage needs
- Create production-grade web service APIs, SDKs, and other tools to provide a world-class developer experience for all of Zoox’s software teams
Qualifications:
- Experience with high-performance object storage and filesystems
- Experience with distributed systems
- Proficiency with Python, Java, or other managed languages
- Bachelor's degree in computer science (or related field)
- Experience with cloud computing platforms such as AWS, GCP, or Azure
Bonus Qualifications:
- Deep experience with AWS FSx for Lustre, open-source Lustre filesystem, or another ML-optimized filesystem
- Experience with workload management / job scheduling systems such as Slurm
- Knowledge of machine learning / artificial intelligence systems
About Zoox
Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market. Sitting at the intersection of robotics, machine learning, and design, Zoox aims to provide the next generation of mobility-as-a-service in urban environments. We’re looking for top talent that shares our passion and wants to be part of a fast-moving and highly execution-oriented team.
Accommodations
If you need an accommodation to participate in the application or interview process, please reach out to [email protected] or your assigned recruiter.
A Final Note:
You do not need to match every listed expectation to apply for this position. Here at Zoox, we know that diverse perspectives foster the innovation we need to be successful, and we are committed to building a team that encompasses a variety of backgrounds, experiences, and skills.
Top Skills
AWS
Azure
GCP
Java
Python
Slurm