ChipStack

Head of Site Reliability Engineering at ChipStack

Posted Yesterday
In-Office
San Jose, CA
Senior level
In this role, you will design, automate, and operate low-latency hybrid and on-prem environments to ensure high availability for ChipStack's services while collaborating with product engineers and leading incident response efforts.
The summary above was generated by AI

Locations • San Jose, CA – On‑site • Full‑time • Engineering

About ChipStack

Chips power everything, yet chip‑design tooling hasn’t kept up with the exploding complexity. ChipStack reinvents verification with AI‑native software already in use at 10+ semiconductor innovators. Backed by Khosla Ventures, Cerberus, and Clear Ventures, our small, fast team ships at the intersection of AI, EDA, and systems engineering.

The Opportunity

We need rock‑solid, low‑latency deployments—often inside customer data centers with no internet egress. As our first dedicated reliability owner, you’ll design, automate and operate these hybrid/on‑prem environments so customers experience “five nines” availability without touching the underlying plumbing.

What You’ll Do
  • Own end‑to‑end reliability – architect, deploy, and monitor production clusters (on‑prem & cloud) running our Python/TypeScript micro‑services, LLM workloads and GPU back‑ends.

  • Automate the stack – build IaC pipelines (Terraform), GitOps workflows and zero‑downtime rollout strategies.

  • Observe & respond – instrument apps with Prometheus/Grafana, set SLOs/SLIs, lead incident response, perform root‑cause analysis, and harden runbooks.

  • Secure & comply – implement network segmentation, secrets management, RBAC and vulnerability scanning to satisfy strict semiconductor‑industry requirements.

  • Collaborate – pair with product engineers on performance profiling, scalability bottlenecks and customer issue triage.

  • Continually improve – champion best practices in testing, CI/CD, and chaos drills to push our “ship fast, ship quality” culture.

Must‑Have Skills
  • 5+ years building and operating production systems as an SRE / DevOps / Platform Engineer.

  • Hands‑on expertise with Kubernetes and Docker in hybrid or bare‑metal setups.

  • Strong Python for automation tooling; proficiency reading TypeScript services.

  • Deep Linux administration knowledge (kernel tuning, networking, storage, security hardening).

  • Proven track record delivering 99.9%+ uptime for latency‑sensitive services.

  • Observability stack experience (Prometheus, Grafana, Loki / ELK, Alertmanager).

  • Proficiency with Terraform (or equivalent IaC) and Git‑based workflows.

  • Excellent communication and a bias for action when facing vague, first‑of‑its‑kind problems.

Nice‑to‑Have
  • Experience running GPU workloads, ML inference or EDA toolchains in production.

  • Familiarity with air‑gapped / restricted‑network deployments and data‑center operations.

  • Exposure to security certifications (SOC 2, ISO 27001) or semiconductor customer audits.

  • Prior work at an early‑stage startup.

Our Culture (What You’ll Thrive In)
  • Challenge status‑quo

  • Strong opinions, loosely held

  • Ship fast, ship quality

  • Proud of our craft

Ready to harden the infrastructure that will redefine chip design? Apply now and keep ChipStack running flawlessly for the world’s most advanced silicon teams.

Top Skills

Alertmanager
Docker
ELK
Grafana
Kubernetes
Prometheus
Python
Terraform
TypeScript


