Principal Site Reliability Engineer

Uniphore

Posted Yesterday

In-Office

Palo Alto, CA

233K-336K Annually

Expert/Leader

Lead platform reliability and automation at scale by building production Go services, Kubernetes operators, multi-cloud infrastructure, and self-service tooling. Provide technical leadership through architecture, code, on-call escalation ownership, incident remediation, and mentorship to elevate engineering teams' operational maturity.

The summary above was generated by AI

Uniphore is one of the largest B2B AI-native companies: decades-proven, built for scale, and designed for the enterprise. The company drives business outcomes across multiple industry verticals and enables the largest global deployments.

Uniphore infuses AI into every part of the enterprise that impacts the customer. We deliver the only multimodal architecture centered on customers that combines Generative AI, Knowledge AI, Emotion AI, workflow automation and a co-pilot to guide you. We understand better than anyone how to capture voice, video and text and how to analyze all types of data.

As AI becomes more powerful, every part of the enterprise that impacts the customer will be disrupted. We believe the future will run on the connective tissue between people, machines and data: all in the service of creating the most human processes and experiences for customers and employees.

Job Description:

About the Role:
We're looking for a Principal Site Reliability Engineer to join our Platform Engineering team: someone as comfortable writing production Go as designing and operating cloud infrastructure. The highest-leverage work here isn't a runbook; it's the service that enforces the runbook automatically. You'll write Go that runs in production and multiplies your impact across hundreds of services.

You'll build the standards, frameworks, automations, agentic workflows, and self-service capabilities that make engineering teams autonomous while maintaining enterprise-grade reliability and security. You won't just define standards — you'll implement them in code: a Kubernetes Operator that enforces service readiness criteria, a service that surfaces SLO health across the fleet, an internal platform service that automates task execution.

You'll collaborate with feature teams as an expert advisor and standard-setter, helping them build operational maturity while you maintain oversight of our single-tenant and multi-tenant, multi-cloud infrastructure. You'll be a bridge between software development and systems operations, focused on large-scale, resilient, automated infrastructure rather than daily firefighting.

This is a senior individual-contributor role. You will not have direct reports. Your leadership is technical — exercised through architecture, production code, design reviews, and mentorship. This role participates in our on-call rotation, which covers all production systems. As a Principal, you'll own the hardest escalations and use what you learn on-call to drive the architectural fixes that eliminate whole classes of incidents.

Responsibilities:

Invent:

Define and execute long-term architectural strategy for our multi-cloud platforms.

Lead hands-on implementation of critical infrastructure projects, focusing on reliability, automation, and performance at scale.

Own multi-year technical roadmaps that establish the vision for infrastructure scalability, reliability, security, and engineering velocity.

Own:

Provide technical leadership through design reviews and code contributions; set technical direction, eliminate architectural barriers to execution, and drive toward simplicity.

Maintain end-to-end technical stewardship of your systems, keeping execution aligned with architectural vision and best practices.

Act as a key technical advisor to Engineering Leadership and Product Management, influencing the strategic direction of Uniphore.

Lead design reviews across Infrastructure with a focus on automation, scalability, and reliability, and align architectural roadmaps across teams.

Partner with Security to build secure-by-default systems and remediate weaknesses.

Own the reliability of the systems under your technical stewardship.

Create the technical clarity — vision, standards, and tooling — that lets feature teams build, own and operate their services.

Participate in fleet-wide on-call, owning critical escalations across all production systems and converting recurring failure modes into permanent architectural fixes.

Teach:

Establish and evangelize design principles for reliable, secure, scalable systems.

Grow other engineers through technical mentorship, architectural guidance, and design review.

Requirements:

10+ years in DevOps/SRE/Platform Engineering, with demonstrated Staff- or Principal-scope impact and a track record of transforming operational models.

Production Go: you write Go regularly, understand its concurrency model, and are comfortable owning Go services in production.

Kubernetes depth: operational expertise plus the ability to extend it — you understand the controller-runtime model and could write or maintain a Kubernetes Operator.

Cloud & infrastructure: expert-level AWS/GCP/Azure, Terraform, and multi-cloud architecture, with strong cost-optimization instincts.

Production excellence: deep incident management, RCA process, and on-call system design experience.

Software engineering fundamentals: API design, testing, observability instrumentation, and service lifecycle ownership — you treat internal tooling with the same rigor as customer-facing software.

Standards & documentation: strong technical writing; you create operational procedures teams can self-execute.

Architecture & planning: RFC/PRD review experience; you catch operational problems at design time.

Collaboration & coaching: you build team capability through tooling and knowledge transfer rather than doing the work for them.

Nice to Haves:

Building Kubernetes Operators, controllers, or admission webhooks (controller-runtime, kubebuilder).

Contributions to open-source infrastructure tooling.

AWS Solutions Architect Professional or equivalent GCP/Azure certifications.

Kubernetes certifications (CKA, CKAD, CKS).

Platform engineering, developer experience, or internal developer portals (Backstage, etc.).

GitOps patterns (ArgoCD, Flux) and policy-as-code tooling (OPA, Kyverno).

Why You'll Love This Role:

Your code is your leverage. Solutions you ship multiply across dozens of services and teams — you prevent entire classes of problems rather than patching instances.

You'll shape the platform strategy. You'll drive the transformation from reactive support to strategic platform partnership, with platform engineering embedded in planning to prevent downstream issues.

You'll tackle the hardest problems. Multi-tenant architecture scaling, cross-service observability, and reliability challenges that affect our largest enterprise deployments.

You'll set the bar. Define the standards, incident-management frameworks, and service-ownership model that let teams graduate to full operational independence.

Hiring Range: $232,900 – $335,811 OTE (Primary Location: Palo Alto, CA)

Benefits:

In addition to competitive base pay, this position includes an annual incentive opportunity based on target achievement and pre-IPO stock options. Benefits include medical, dental, vision, 401(k) with a match, and more, along with generous paid time off, paid holidays, a paid day off for your birthday, and other paid leave policies that support employees through all phases of life.

Location preference:

USA - CA - Palo Alto

Uniphore is an equal opportunity employer committed to diversity in the workplace. We evaluate qualified applicants without regard to race, color, religion, sex, sexual orientation, disability, veteran status, and other protected characteristics.

For more information on how Uniphore uses AI to unify—and humanize—every enterprise experience, please visit www.uniphore.com.
