CrowdStrike

Director, Model Post-Training and Agentic Research (Remote)

Remote or Hybrid
Hiring Remotely in USA
195K-290K Annually
Senior level
Lead and personally build the full post-training stack for security-domain AI, including SFT, RLHF/RLAIF, reward modeling, and agent-RL harnesses. Build training environments and agent scaffolds, define evaluations and benchmarks, drive research direction, publish findings, and recruit and grow a high-density research and engineering team while actively contributing to experiments and architecture.
The summary above was generated by AI

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed — we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About the Role:

The security domain presents one of the richest and most consequential training signal environments in applied AI. It’s adversarial by nature, grounded in real operational outcomes, and evolving faster than any static benchmark can capture. We're building the post-training and reinforcement learning capability to turn the latest models and harnesses into security-specialized systems that reason, plan, and act across complex cyber workflows. The person leading this work will be in the research, not just directing it.

In this role, you'll own the full post-training stack for security-domain AI (e.g., supervised fine-tuning, reward modeling, RLHF and RLAIF pipelines, and agent-RL environments) and the agentic research that sits on top of it. That means designing, building, and evaluating the harnesses that security agents actually run on, including the scaffolding, tool-use interfaces, planning loops, memory and context management, and multi-step execution frameworks, all of which determine whether a trained model can operate reliably on complex security tasks. Post-training and agent architecture are not separable problems in this work. The reward signal you design has to reflect what the harness can measure, and the harness has to be built to surface what training needs to optimize. You'll set the technical direction on both, and you'll be in the work on both.

You'll lead a team of research scientists and engineers, but the team will look to your own work as the standard. The successful candidate shapes research priorities, keeps the team moving at high velocity across multiple training cycles per year, and elevates the quality of work by staying close enough to it to know what good actually looks like.

What You'll Do:

  • Own and personally drive the full post-training pipeline for security-domain AI — SFT, RLHF/RLAIF, agent-RL, and reward modeling. Set research priorities and architectural direction, and lead experimental work on the hardest problems yourself rather than delegating them away. Design reward modeling methodology grounded in verified security outcomes rather than proxy signals, drawing on both human expert feedback and automated adversarial evaluation. Define data curation standards across sourcing, filtering, quality scoring, and domain weighting that drive measurable capability improvement.

  • Build and maintain agent-RL training environments that simulate realistic cyber workflows (multi-step offensive and defensive tasks, tool use, and long-horizon planning), contributing directly to environment design and reward shaping. Lead the design and build of the agent harnesses that run on top of those trained models: scaffolding architecture, tool-calling interfaces, planning and reasoning loops, and memory and context management. Treat harness design with the same rigor as the training pipeline; these systems determine whether strong post-training translates into reliable, trustworthy behavior in the field.

  • Develop and own evaluation methodology for the full agentic stack: not model capability in isolation, but harness behavior, tool-use reliability, planning coherence, and end-to-end task completion across realistic security workflows. Define the benchmarks, red-line tests, and measurement practices that give the team and the organization genuine confidence that an agent works.

  • Partner closely with other teams to ensure post-training and agentic work integrates cleanly with the broader model development loop. Contribute original research through publications, external presentations, and open-source artifacts where appropriate, building CrowdStrike's credibility as a research-first organization in this space.

  • Recruit, develop, and retain a high-density team of research scientists and ML engineers. Set a technical bar through your own contributions, not just your standards.

What You'll Need:

  • MS or PhD in computer science, machine learning, or a related quantitative discipline.

  • 8+ years of experience in ML research or engineering, with meaningful depth in large language model post-training.

  • Hands-on expertise across the modern post-training stack, including SFT data pipelines, RLHF/RLAIF, PPO or similar RL algorithms applied to language models, and reward model design and training. This means you've done the work, not managed people who have.

  • Demonstrated experience designing or building agentic system harnesses for LLM-based agents, including tool-use frameworks, planning scaffolds, multi-step execution environments, and context or memory management. You've built these systems, not just used them.

  • Strong evaluation instincts: experience designing evaluation protocols that are resistant to overfitting, capable of measuring genuine capability improvement, and interpretable to both technical and non-technical stakeholders.

  • Track record of running high-velocity research programs with disciplined tracking and fast iteration.

  • Proven ability to lead and grow research teams while remaining a credible, active technical contributor.

Ways to Stand Out:

  • Demonstrated experience building or operating RL training environments for language model agents, including environment design, rollout infrastructure, and reward shaping.

  • Experience applying post-training or RL techniques in security, adversarial ML, or other high-stakes operational domains where ground truth is expensive and noisy.

  • Deep hands-on experience with agent harness architecture applied to long-horizon, multi-step task environments where reliability and failure modes matter as much as peak capability.

  • Background designing synthetic data pipelines or simulation environments for agent training in complex, tool-using workflows.

  • Familiarity with the offensive or defensive security practitioner's workflow — penetration testing, detection engineering, incident response, or threat intelligence — sufficient to reason about what good model behavior looks like in practice.

  • Published research in post-training, RLHF, RL for language agents, or related areas at top-tier venues (NeurIPS, ICML, ICLR, ACL, or equivalent).

  • Experience working on and adapting open-weight base models (Llama-class, Qwen-class, or similar) for domain-specialized continued training and fine-tuning.

Benefits of Working at CrowdStrike:

  • Market leader in compensation and equity awards

  • Comprehensive physical and mental wellness programs 

  • Competitive vacation and holidays for recharge  

  • Paid parental and adoption leaves

  • Professional development opportunities for all employees regardless of level or role

  • Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections

  • Vibrant office culture with world class amenities

  • Great Place to Work Certified™ across the globe

CrowdStrike is proud to be an equal opportunity employer. We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program.

CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements.

If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at [email protected] for further assistance.

Find out more about your rights as an applicant.

CrowdStrike participates in the E-Verify program.

CrowdStrike, Inc. is committed to fair and equitable compensation practices. Placement within the pay range is dependent on a variety of factors including, but not limited to, relevant work experience, skills, certifications, job level, supervisory status, and location. The base salary range for this position for all U.S. candidates is $195,000 - $290,000 per year, with eligibility for bonuses, equity grants and a comprehensive benefits package that includes health insurance, 401k and paid time off.

For detailed information about the U.S. benefits package, please click here

Expected Close Date of Job Posting is: 08-11-2026

CrowdStrike Irvine, California, USA Office

Irvine, CA, United States
