Snowflake

AI System Research and Development Engineer - Frameworks

Posted 17 Days Ago
2 Locations
Mid level
Design and maintain infrastructure for large language models, optimize resources, implement security measures, and mentor junior team members.
The summary above was generated by AI

Build the future of the AI Data Cloud. Join the Snowflake team.

We are looking for talented System Developers and Researchers to join the Snowflake AI Research team and contribute to LLM inference and training system development, optimizations, and agentic systems. Our mission is to build the most efficient and scalable generative AI systems.

Recent releases from our team include SwiftKV, an advanced inference optimization, and Arctic LLM, one of the largest open-source MoE foundation models. This is an exciting opportunity to collaborate with a world-class team, including founding members of DeepSpeed, vLLM, and TensorFlow. Together, we will push the boundaries of deep learning systems and drive cutting-edge innovations in AI.

Responsibilities:

  • Solve large-scale challenges in data preprocessing, model training, and model evaluation.

  • Develop and deploy state-of-the-art tooling and open-source technologies to enhance the efficiency and effectiveness of AI solutions.

  • Apply advanced optimization techniques to reduce resource requirements while maintaining model performance and ensuring usability for researchers, developers and customers.

  • Stay updated with the latest advancements in LLM training and inference optimizations.

  • Open-source and publish innovations, optimizations, and engineering practices in technical blogs, top-tier conferences and journals.

Requirements:

  • 5 or more years of experience in deep learning frameworks, distributed systems, or high-performance computing (HPC).

  • Bachelor’s degree in Computer Science, Electrical Engineering, or a related field. A Master’s degree or PhD is preferred.

  • Expertise in distributed training frameworks (e.g., DeepSpeed, PyTorch DDP, FSDP, Megatron-LM).

  • Strong understanding of modern parallelism techniques such as data, tensor, sequence, and ZeRO-based parallelism.

  • Programming language proficiency in Python and C++ or CUDA.

  • Solid problem-solving skills and ability to debug complex performance issues.

  • Excellent communication skills and ability to work effectively in a cross-functional team environment.

Join us in optimizing deep learning systems and pushing the boundaries of AI efficiency. Apply now to be part of our dynamic and pioneering team!

Snowflake is growing fast, and we’re scaling our team to help enable and accelerate our growth. We are looking for people who share our values, challenge ordinary thinking, and push the pace of innovation while building a future for themselves and Snowflake.

How do you want to make your impact?

For jobs located in the United States, please visit the job posting on the Snowflake Careers Site for salary and benefits information: careers.snowflake.com

Top Skills

AWS
Azure
C++
Docker
Elasticsearch
GCP
Go
Hadoop
Java
Kubernetes
Python
Spark
