Wizard AI

Senior ML Ops Engineer

Posted 21 Days Ago
Remote
Hiring Remotely in USA
200K-250K Annually
Senior level
As a Senior MLOps Engineer, you will manage production ML systems, define lifecycle strategies, optimize ML pipelines, and collaborate with cross-functional teams to enhance ML operations.
The summary above was generated by AI
About Wizard AI

At Wizard AI, we’re building the top-performing AI Shopping Agent that delivers the best products from across the web with unmatched accuracy, quality, and trust. Our ML models power the core of our platform, and we’re seeking an experienced Senior MLOps Engineer to take ownership of how our machine learning systems run reliably and efficiently in production.

The Role

As a Senior MLOps Engineer at Wizard, you’ll own the end-to-end ML lifecycle – from model packaging and deployment to monitoring, observability, optimization and scaling – for a custom-built inference platform powering a live conversational shopping agent. This is not a standard cloud ML pipeline role; we run multiple specialized inference engines handling real-time inference for high-stakes shopping decisions, and the work requires both hands-on operational depth and the architectural judgement to evolve the platform as Wizard scales. You’ll work closely with ML Engineers, Data teams, and DevOps, with real influence over how the infrastructure is designed – not just how it runs.

What You’ll Do
  • Build, maintain, and optimize production-grade ML pipelines, enabling seamless transitions from experimentation to production.
  • Define and implement strategies for model versioning, rollout, rollback, and lifecycle management to ensure robust and reproducible ML systems.
  • Define and enforce serving-layer SLAs – latency, availability, GPU utilization, time to first token (TTFT), inter-token latency (ITL) – and build the observability and alerting needed to track them.
  • Apply software engineering best practices including testing, CI/CD integration, and reproducibility to ML workflows, improving iteration speed for ML engineers without compromising reliability.
  • Ensure ML systems are secure, cost-efficient, and scalable, partnering with DevOps on infrastructure standards while owning ML-specific operational concerns.
  • Collaborate cross-functionally with ML, Data, Product, and DevOps teams to translate ML requirements into production-ready systems and influence technical planning and roadmap decisions.
What We’re Looking For
  • Bachelor’s or Master’s degree in Computer Science, Data Science, or a related field, or equivalent experience.
  • 5-8+ years of experience in Software Engineering, ML Engineering, Platform Engineering, or Infrastructure Engineering with direct ownership of production ML serving systems.
  • Hands-on experience deploying and maintaining LLMs and deep learning models in production environments.
  • Strong Python skills and software engineering fundamentals with infrastructure depth. Familiarity with ML frameworks (PyTorch, TensorFlow, or similar) is preferred.
  • Experience with cloud platforms such as AWS, GCP, or Azure, and familiarity with ML lifecycle tooling, including model registries and experimentation platforms.
  • Familiarity with inference optimization at the hardware and systems level – batching strategies, memory management, quantization tradeoffs, CPU/GPU interaction patterns.
  • Demonstrated ability to reason about tradeoffs between latency, cost, throughput, and reliability at both the systems and operational levels.
  • Experience in high-growth startup environments and an ability to thrive in a fast-paced, evolving technical landscape.

What Success Looks Like

  • Reliable, Scalable ML Systems: Production models run with clear SLAs, minimal downtime, and full observability – latency, availability, and GPU utilization tracked and enforced. Deployment pipelines handle growth and evolving AI requirements.
  • End-to-End Ownership: You own the full ML lifecycle – from packaging and deployment through monitoring and optimization – enabling ML engineers to iterate quickly while maintaining reproducibility, reliability and security.
  • Influence and Impact: You shape the technical roadmap for ML operations, collaborating with ML, Data, and DevOps teams to improve system performance, reduce operational costs, and drive the overall AI strategy forward.
Compensation & Benefits

The expected base salary range for this role is $200,000 – $250,000 USD, and will vary based on skills, experience, role level, and geographic location. Final compensation will be determined by considering these factors alongside overall role scope and responsibilities.

In addition to base salary, Wizard offers:

  • Equity in the form of stock options
  • Medical, dental, and vision coverage
  • 401(k) plan
  • Flexible PTO and company holidays
  • Fully remote work within the United States
  • Periodic company offsites and team gatherings

Wizard is committed to fair, transparent, and competitive compensation practices.

Top Skills

AWS
Azure
CI/CD
GCP
Python
PyTorch
TensorFlow
