NVIDIA

Solutions Architect - Cloud Infrastructure

Posted 5 Days Ago
In-Office or Remote
5 Locations
120K-236K
Mid level
As a Cloud Solution Architect, you will architect and deploy resilient AI compute environments, collaborate with engineering for design wins, and act as a client advisor.
The summary above was generated by AI

We are excited to announce an opening for a Cloud Solution Architect at NVIDIA and are seeking a passionate individual with a strong interest in large-scale GPU infrastructure and AI Factory deployments! If you are enthusiastic about contributing to projects that push the boundaries of cloud-based AI and resilience in large-scale environments, we invite you to read on. NVIDIA is renowned as one of the most sought-after employers in the technology world, offering highly competitive benefits. We are home to some of the most innovative and forward-thinking individuals globally. If you are creative, autonomous, and eager to apply your skills and knowledge in a dynamic environment, we want to hear from you!

What you'll be doing:

  • Working as a key member of our cloud solutions team, you will be the go-to technical expert on NVIDIA AI Factory solutions and large-scale GPU infrastructure, helping clients architect and deploy resilient, telemetry-driven AI compute environments at unprecedented scale.

  • Collaborating directly with engineering teams to secure design wins, address challenges, and deploy solutions into production, with a focus on developing robust tooling for observability, failure recovery, and infrastructure-level performance optimization.

  • Acting as a trusted advisor to our clients, understanding their cloud environment, translating requirements into technical solutions, and providing guidance on optimizing NVIDIA AI Factories for scalable, reliable, and high-performance workloads.

What we need to see:

  • 2+ years of experience in large-scale cloud infrastructure engineering, distributed AI/ML systems, or GPU cluster deployment and management.

  • A BS in Computer Science, Electrical Engineering, Mathematics, or Physics, or equivalent experience.

  • Proven understanding of large-scale computing systems architecture, including multi-node GPU clusters, high-performance networking, and distributed storage.

  • Experience with infrastructure-as-code, automation, and configuration management for large-scale deployments.

  • A passion for machine learning and AI, and the drive to continually learn and apply new technologies.

  • Excellent interpersonal skills, including the ability to explain complex technical topics to non-experts.

Ways to stand out from the crowd:

  • Expertise with orchestration and workload management tools like Slurm, Kubernetes, Run:ai, or similar platforms for GPU resource scheduling.

  • Knowledge of AI training and inference performance optimization at scale, including distributed training frameworks and multi-node communication patterns.

  • Hands-on experience designing telemetry systems and failure recovery mechanisms for large-scale cloud infrastructures including observability tools such as Grafana, Prometheus, and OpenTelemetry.

  • Proficiency in deploying and managing cloud-native solutions using platforms such as AWS, Azure, or Google Cloud, with a focus on GPU-accelerated workloads.

  • Deep expertise with high-performance networking technologies, particularly NVIDIA InfiniBand, NCCL, and GPU-Direct RDMA for large-scale AI workloads.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 120,000 USD - 189,750 USD for Level 2, and 148,000 USD - 235,750 USD for Level 3.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until October 11, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Top Skills

AI Factory
Automation
AWS
Azure
GCP
GPU Infrastructure
GPU-Direct RDMA
Grafana
Infrastructure-as-Code
Kubernetes
NCCL
NVIDIA InfiniBand
OpenTelemetry
Prometheus
Slurm
