Together AI

Junior Technical Program Manager — Infrastructure Operations

Posted Yesterday
In-Office
San Francisco, CA
150K-175K Annually
Junior
The Junior Technical Program Manager will manage the end-to-end node lifecycle and datacenter operations, improve workflows, and resolve GPU utilization issues.
The summary above was generated by AI
About the Role

Together AI runs one of the most demanding GPU fleets in the industry. Keeping that fleet healthy - every node online, every GPU performing, every datacenter transition running on schedule - is operationally complex and genuinely high-stakes. We're looking for a Junior TPM to own that operational reality.

This is not a coordination or status-reporting role. You will own the end-to-end node lifecycle - from the moment a node goes down through repair, return, and re-integration - and you'll drive the cross-functional work to close every gap as fast as possible. You'll manage datacenter bring-ups, hunt down GPU utilization loss, and build the processes and dashboards that make our fleet operations more visible and accountable over time.

The environment moves fast and doesn't always come with a clear playbook. Much of what you'll work on is genuinely novel - you'll be figuring things out alongside engineers who are building at the frontier. If that sounds like an obstacle, this isn't the right role. If it sounds like the best possible way to learn, keep reading.

Responsibilities
  • Own the end-to-end node lifecycle - from failure through repair, return, and re-integration - across provider ticketing, internal tooling, and the state machine that governs each stage
  • Drive node remediation to resolution with urgency, eliminating gaps in ownership at every handoff
  • Manage project timelines for new datacenter bring-ups, coordinating across internal teams and external providers to keep milestones on track
  • Identify and diagnose GPU utilization loss across the fleet, working with engineering leads to drive resolution
  • Build dashboards and tracking processes that make efficiency gaps visible and ensure they get closed
  • Continuously improve operational workflows through process improvements and lightweight automation
  • Develop and maintain relationships with external datacenter providers
Requirements
  • Some prior experience in a TPM role - we're open to candidates who came into TPM from engineering, ops, or another technical function, but you should have done the work in practice: owning programs end-to-end, driving cross-functional resolution, managing external dependencies
  • A technical background or demonstrated experience in a highly technical environment - you don't need to know GPUs on day one, but you need to be able to engage meaningfully with technical problems and earn credibility with infrastructure engineers
  • A genuine bias toward action - you see a problem and start moving, even when the path forward isn't fully clear
  • Resilience in a fast-paced, sometimes chaotic environment - you adapt quickly, stay effective under pressure, and don't wait for perfect conditions to make progress
  • Strong organizational instincts - you can manage multiple workstreams, track dependencies, and keep things moving without losing the thread
  • Ability to zoom out - you can be deep in the weeds on an operational problem while keeping the bigger picture in view
About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $150,000 - $175,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge. This is a hybrid role based in the Bay Area.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy



