The Performance Optimization team at Luma is dedicated to maximizing the efficiency and performance of our AI models. Working closely with both research and engineering teams, this group ensures that our cutting-edge multimodal models can be trained efficiently and deployed at scale while maintaining the highest quality standards.
Responsibilities
Profile and optimize GPU/CPU/Accelerator code for maximum utilization and minimal latency
Write high-performance PyTorch, Triton, and CUDA code, resorting to custom PyTorch operations where necessary
Develop fused kernels and leverage tensor cores and other modern hardware features for optimal utilization across hardware platforms
Optimize model architectures and implementations for distributed multi-node production deployment
Build performance monitoring and analysis tools and automation
Research and implement cutting-edge optimization techniques for transformer models
Qualifications
Expert-level proficiency in Triton/CUDA programming and GPU optimization
Strong PyTorch skills
Experience with PyTorch kernel development and custom operations
Proficiency with profiling tools (NVIDIA Nsight, torch profiler, custom tooling)
Deep understanding of transformer architectures and attention mechanisms
(Preferred) Experience with compilers/exporters such as torch.compile, TensorRT, ONNX, XLA
(Preferred) Experience optimizing inference workloads for latency and throughput
(Preferred) Experience with Triton compiler and kernel fusion techniques
(Preferred) Knowledge of warp-level intrinsics and advanced CUDA optimization
(Preferred) Background in compiler optimization or hardware-software co-design
Compensation
The pay range for this position in California is $180,000 - $250,000/yr; however, base pay offered may vary depending on job-related knowledge, skills, candidate location, and experience. We also offer competitive equity packages in the form of stock options and a comprehensive benefits plan.
Your applications are reviewed by real people.