Humoniq (YC S25)

AI Engineer (Evaluations & Quality)

Posted 2 Days Ago
Hybrid
Mission Viejo, CA
Mid level
As an AI Engineer (Evaluations & Quality), you will build data pipelines, run regression tests, implement drift detection, and create dashboards to improve AI systems.
The summary above was generated by AI

Who We Are

We are a YC-backed startup with $8M+ raised, led by repeat founders who’ve built and scaled successful companies before. Our mission is ambitious: we’re building deeply integrated AI systems that understand, reason, and act to solve real-world problems in travel and transport.

We’re not another “move fast and burn out” shop. We believe peak productivity comes when people have psychological safety and time to sleep, move, eat well, and be understood. That’s the culture we’re building. We don’t believe in overwork or in equating hours with outcomes. What matters is results tied to business and customer outcomes, nothing else.

What makes us different:
We don’t worship grind culture. We believe peak output comes when people are well-rested, strong, loved, safe, and understood.
Sleep > All-nighters
Exercise & health > Burnout & “hustle”
Psychological safety > Fear & politics
Because humans at their best are happy, motivated, and productive.

Location: Mission Viejo, CA (Los Angeles Outskirts)

As our AI Engineer (Evaluations & Quality), you’ll build the pipelines and tools that let us:

  • Ingest and analyze thousands of AI-driven support conversations

  • Run regression tests on new prompts and models before they hit production

  • Detect drift in user behavior and model outputs before customers feel it

You’ll sit at the intersection of data engineering, ML evaluation, and backend infra. You won’t be tuning models all day — you’ll be building the systems that make tuning safe and fast.

You’ll work closely with:

  • Min (AI lead) on evaluation design and metrics

  • Victor (technical product/backend lead) on schemas, APIs, and internal tools

  • Farzad (COO) on priorities and impact

If you forget everything else, remember this:

“If I make it easy for the team to see, measure, and trust that the AI is taking the highest quality actions at scale, and actively improving the AI when needed, I’m winning.”

What you’ll do

In your first 6–12 months, you’ll:

  • Build a log ingestion pipeline

    • Ingest GCP Cloud Run / application logs into a central store (BigQuery / Postgres)

    • Parse logs into ticket-level and message-level records

    • Join in evaluator comments and metadata so we can analyze behavior end-to-end

  • Ship AI regression tests and evaluations

    • Re-run historical conversations through new prompts / models

    • Compare end-of-conversation classification, issue, and task action-plan outputs over time

    • Generate clear reports that show regressions, hallucinations, and wins

    • Improve our AI agents through prompting and other changes

  • Implement drift detection

    • Track distributions of intents, outcomes, and actions over time

    • Detect when user behavior or model outputs deviate from baseline

    • Surface drift in dashboards and alerts so we can act before customers are hurt

  • Build internal dashboards & tools

    • Let evaluators and product see problem tickets quickly

    • Make it trivial to search for “all conversations where X went wrong”

    • Visualize trends so we stop arguing from anecdotes and start arguing from data

  • Own reliability + documentation

    • Add monitoring and alerting around your pipelines

    • Document your data models, assumptions, and runbooks

    • Make it possible for someone new to pick up your work and move forward
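
To give a flavor of the drift-detection work listed above, here is a minimal sketch. It is an illustration, not a description of the production system. It assumes intents arrive as plain string labels and scores drift with the Population Stability Index (PSI). The function names, the smoothing constant, and the 0.2 alert threshold are all assumptions made for the example.

```python
import math
from collections import Counter


def distribution(labels, categories, eps=1e-6):
    """Return smoothed relative frequencies of `labels` over `categories`.

    The small `eps` keeps every probability non-zero so the log below is defined.
    """
    counts = Counter(labels)
    total = len(labels) + eps * len(categories)
    return {c: (counts.get(c, 0) + eps) / total for c in categories}


def psi(baseline, current, eps=1e-6):
    """Population Stability Index between two lists of categorical labels."""
    categories = sorted(set(baseline) | set(current))
    p = distribution(baseline, categories, eps)
    q = distribution(current, categories, eps)
    return sum((q[c] - p[c]) * math.log(q[c] / p[c]) for c in categories)


def has_drifted(baseline, current, threshold=0.2):
    """Flag drift when PSI exceeds `threshold` (0.2 is an assumed cut-off)."""
    return psi(baseline, current) > threshold


# Hypothetical intent labels: last month's traffic versus this week's.
baseline_intents = ["refund"] * 50 + ["rebook"] * 50
current_intents = ["refund"] * 90 + ["rebook"] * 10
print(has_drifted(baseline_intents, baseline_intents))  # identical samples: no drift
print(has_drifted(baseline_intents, current_intents))   # large shift toward refunds
```

In practice the baseline window, the categories tracked, and the threshold would come out of evaluation design rather than being hard-coded like this.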

You might be a fit if…
  • You’ve owned a data / infra pipeline in production before, not just written a script.

  • You’re comfortable in Python and have used it for ETL, log parsing, or analytics.

  • You’ve worked with cloud infra (GCP preferred; AWS/Azure okay if you can translate).

  • You’ve used data warehouses or databases like BigQuery / Snowflake / Postgres with non-trivial schemas.

  • You think in terms of metrics and failure modes:

    • “What happens if the schema changes?”

    • “How will we know if this silently stops working?”

    • “What’s the rollback if this regression job reveals something bad?”
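
As one way to think about the first two questions, here is a minimal sketch of a log parser that counts what it cannot parse instead of dropping it silently. It assumes logs arrive as JSON lines. The field names (`ticket_id`, `role`, `text`) and both function names are hypothetical, chosen only for this example.

```python
import json
from collections import defaultdict


def parse_messages(lines):
    """Turn raw log lines into message-level records.

    Returns (messages, skipped), where `skipped` counts non-blank lines that
    were malformed or lacked a ticket_id, so a schema change shows up as a
    rising number rather than as silently missing data.
    """
    messages, skipped = [], 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank lines are not counted as failures
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            skipped += 1
            continue
        if not isinstance(entry, dict) or "ticket_id" not in entry:
            skipped += 1
            continue
        messages.append({
            "ticket_id": entry["ticket_id"],
            "role": entry.get("role"),
            "text": entry.get("text"),
        })
    return messages, skipped


def group_by_ticket(messages):
    """Roll message-level records up into ticket-level records."""
    grouped = defaultdict(list)
    for message in messages:
        grouped[message["ticket_id"]].append(message)
    return {
        ticket_id: {"ticket_id": ticket_id, "message_count": len(msgs), "messages": msgs}
        for ticket_id, msgs in grouped.items()
    }
```

Alerting on the ratio of `skipped` to parsed lines is one simple way to notice that an upstream log format has changed.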

You don’t need to be an ML research person. We care more that you can:

  • Take messy logs and turn them into structured, usable data

  • Design evaluation flows that are repeatable and automatable

  • Make it obvious when things are getting better or worse
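
For a sense of what a repeatable evaluation flow can look like, here is a minimal sketch of a regression comparison. It assumes each historical conversation has an evaluator-approved label and that the old and new prompts each produce one label per conversation. The names `compare_runs` and `summarize` and the sample labels are hypothetical.

```python
from collections import Counter


def compare_runs(expected, baseline, candidate):
    """Classify each conversation as 'win', 'regression', or 'unchanged'.

    expected, baseline, candidate: dicts mapping conversation id -> label.
    A regression is a conversation the baseline got right and the candidate
    got wrong; a win is the reverse.
    """
    report = {}
    for conv_id, want in expected.items():
        base_ok = baseline.get(conv_id) == want
        cand_ok = candidate.get(conv_id) == want
        if base_ok and not cand_ok:
            report[conv_id] = "regression"
        elif cand_ok and not base_ok:
            report[conv_id] = "win"
        else:
            report[conv_id] = "unchanged"
    return report


def summarize(report):
    """Count how many conversations fall into each bucket."""
    return Counter(report.values())


# Hypothetical data: evaluator labels plus outputs from the old and new prompt.
expected = {"c1": "refund", "c2": "rebook", "c3": "cancel"}
old_prompt = {"c1": "refund", "c2": "cancel", "c3": "cancel"}
new_prompt = {"c1": "rebook", "c2": "rebook", "c3": "cancel"}
report = compare_runs(expected, old_prompt, new_prompt)
print(summarize(report))
```

Keeping the per-conversation report, and not only the totals, makes it easy to pull up the exact tickets that regressed.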

Must-haves
  • Explicit and demonstrable experience in backend, data engineering, or ML infra (or equivalent real-world work)

  • Strong Python skills for scripting and small services

  • Experience with at least one cloud platform (GCP ideal)

  • Experience building and operating ETL / data pipelines in production

  • Comfort with SQL and analytical databases (BigQuery, Snowflake, Redshift, or similar)

  • Clear written communication and willingness to document decisions

Nice-to-haves
  • Experience with:

    • GCP Cloud Run / Cloud Logging / Pub/Sub / Cloud Scheduler

    • BigQuery specifically

    • Data orchestration and transformation tools (Airflow, Dagster, Prefect, dbt, etc.)

  • Experience with observability stacks (Grafana, Prometheus, OpenTelemetry, etc.)

  • Familiarity with LLMs, prompt evaluation, or ML monitoring

How we work
  • Small team, high ownership — you won’t be a cog.

  • We care about results, not hours.

  • We give direct feedback, quickly.

  • We expect you to push back with reasons, not vibes.

Top Skills

Airflow
BigQuery
GCP
Grafana
Postgres
Prometheus
Python
Snowflake
SQL
