Graphcore

At-Scale Hardware System Validation and BKC Test Lead

Posted 9 Hours Ago
Hybrid
Austin, TX
Expert/Leader
Drive automation-based validation execution for hyperscale AI hardware platforms, define validation strategies, and ensure system readiness for production.

About us 

Graphcore is one of the world’s leading innovators in Artificial Intelligence compute. It is developing hardware, software and systems infrastructure that will unlock the next generation of AI breakthroughs and power the widespread adoption of AI solutions across every industry. 

As part of the SoftBank Group, Graphcore is a member of an elite family of companies responsible for some of the world’s most transformative technologies. Together, they share a bold vision: to enable Artificial Super Intelligence and ensure its benefits are accessible to everyone. 

Graphcore’s teams are drawn from diverse backgrounds and bring a broad range of skills and perspectives. With AI research specialists, silicon designers, software engineers and systems architects working side by side, Graphcore enjoys a culture of continuous learning and constant innovation.

Job Summary 

We are seeking an experienced At-Scale Hardware System Validation and BKC Test Lead to drive automation-based validation execution for hyperscale AI hardware platforms. This role focuses on delivering large-scale system validation across blade-level and rack-level AI infrastructure. 

The successful candidate will define and execute automation-driven validation strategies to ensure robust hardware, firmware, and system integration. The role is responsible for delivering the Best Known Configuration (BKC) for AI server platforms and ensuring readiness for hyperscale data center deployments.

The Team 

The Systems Validation team ensures Graphcore’s AI hardware platforms are validated at scale across blade-level, rack-level, and data center environments. 

The team collaborates closely with silicon enablement, system architecture, firmware, automation infrastructure, rack integration, and operations teams to deliver reliable and scalable AI infrastructure platforms. 

Responsibilities and Duties 

  • Own the end-to-end validation strategy for at-scale test execution across AI hardware platforms. 
  • Ensure comprehensive validation coverage across blade-level systems and rack-level infrastructure including power, cooling, networking, and thermal subsystems. 
  • Act as the primary technical liaison with automation teams to integrate validation infrastructure and execution environments. 
  • Drive validation plans that deliver a qualified Best Known Configuration (BKC) for hardware and firmware solutions.
  • Define BKC validation criteria across system components including CPU, GPU, DDR, PCIe, storage, networking, and system management controllers. 
  • Lead debug and failure analysis across internal engineering teams and ODM partners. 
  • Develop validation dashboards, coverage metrics, and reporting frameworks for engineering and leadership visibility. 
  • Partner with architecture, silicon enablement, firmware, rack integration, and operations teams to ensure system readiness for production. 
  • Support collaboration with ODM partners to ensure effective validation execution and issue resolution. 

Candidate Profile 

Essential 

  • Bachelor’s or Master’s degree in Electrical Engineering, Computer Engineering, or related discipline. 
  • 15+ years of experience in server hardware validation or system engineering. 
  • Experience designing or implementing automation infrastructure for at-scale validation execution. 
  • Proven experience validating blade-level and rack-level server platforms in hyperscale environments. 
  • Experience validating integrated HW/FW/SW server solutions across the product lifecycle. 
  • Strong knowledge of high-speed interfaces such as PCIe, CXL, DDR, NVLink, and Ethernet. 
  • Experience working with system firmware including UEFI, BMC firmware, and rack management solutions. 
  • Demonstrated success leading complex hardware debug and failure analysis across cross-functional teams. 

Desirable 

  • Experience with ARM-based or x86 server architectures. 
  • Background in rack integration validation and hyperscale data center deployments. 
  • Experience building automation-driven validation frameworks and test analytics systems. 
  • Strong leadership and program coordination skills across complex engineering programs. 

Top Skills

Automation Infrastructure
BMC Firmware
CXL
DDR
Ethernet
NVLink
PCIe
UEFI
