Solutions Engineer, Media

Protege

Reposted 7 Days Ago

Remote

Hiring Remotely in USA

Mid level

The Solutions Engineer for media will curate and validate datasets from Protege's catalog, collaborating with sales and partners to meet customer AI data needs effectively.

The summary above was generated by AI

Company Overview

We are building Protege to solve the biggest unmet need in AI: getting access to the right training data. The process today is time-intensive, incredibly expensive, and often ends in failure. The Protege platform facilitates the secure, efficient, and privacy-centric exchange of AI training data.

Solving AI’s data problem is a generational opportunity. We’re backed by world-class investors and already powering partnerships with some of the most ambitious teams in AI. The company that succeeds will be one of the largest in AI — and in tech.

We’re a lean, fast-moving, high-trust team of builders who are obsessed with velocity and impact. Our culture is built for people who thrive on ambiguity, own outcomes, and want to shape the future of data and AI.

Role Overview

We’re hiring a Solutions Engineer for our media vertical to connect Protege’s media catalog with customer AI data needs. This is not a traditional modeling role. It is an applied data curation and delivery role for fast-moving, ambiguous environments where both speed and quality matter.

You will work with imperfect, evolving partner datasets and build strategies to normalize, validate, and operationalize them for downstream AI use cases. You’ll become an expert in Protege’s growing catalog of audio, video, and motion capture content — from longform assets with title-level metadata to clip-level content generated with TwelveLabs embeddings.

At a high level, you will understand what customers are building, identify the content that best fits their needs, and deliver datasets that meet both technical and conceptual requirements, often on tight timelines tied to active deals.

What You’ll Do

Own data quality and curate media datasets

Partner with Sales and Solutions to translate customer requirements into curation strategies
Work with imperfect partner data, including mismatched metadata, schema differences, and incomplete labeling
Normalize and standardize datasets for reliable downstream use
Query and analyze Protege’s media catalog using SQL, internal APIs, and metadata tools to identify relevant content
Build validation checks and workflows to ensure dataset integrity before delivery
Identify, debug, and resolve data quality issues across file structures, metadata, and content alignment
Use AI tools and transcoded embeddings to surface and refine clip-level content
Turn messy, real-world data into structured datasets that meet customer and model requirements
Run iterative sample reviews with customers, incorporate feedback, refine selections, and ensure final packages meet spec

Be the catalog expert

Build deep expertise in Protege’s media catalog structure, metadata, and growth patterns
Track content coverage, diversity, and modality mix, and identify gaps relative to customer demand
Partner with Product and Partnerships to share catalog insights that inform sourcing priorities

Operate across product, data, and customer

Work cross-functionally to ensure content packaging meets technical, ethical, and licensing requirements
Develop methods, scripts, and internal tools that improve curation efficiency and scale
Help shape Protege’s delivery platform, including how internal users and customers search, sample, and export data

Drive human-in-the-loop media search and curation

Work closely with embedding-based systems to iterate between algorithmic selection and human review
Define best practices for embedding queries, relevance evaluation, and content diversity
Maintain a high bar for operational excellence and quality assurance throughout the process

What Success Looks Like

30 days: Learn and get operational

Build a working understanding of the media catalog, delivery lifecycle, and core tools.
Establish strong cross-functional relationships and shadow live curation workflows.

60 days: Deliver and improve

Lead dataset sampling and curation for active use cases, and document reusable workflows.
Surface early insights on catalog coverage, metadata quality, and process improvements.

90 days: Scale and influence

Create repeatable QA and delivery workflows that increase consistency and speed.
Provide actionable feedback that shapes platform, sourcing, and catalog roadmap decisions.

What You Bring

4-7 years of experience in data science, media analytics, technical curation, or similarly hands-on data roles.
Strong SQL proficiency and comfort querying large, messy datasets to generate insight and action.
Experience working with media metadata, embeddings, or unstructured content.
Ability to translate nuanced customer or model requirements into concrete dataset specifications.
High standard for data quality, operational rigor, and usability of delivered outputs.
Clear communicator who can move between technical depth and customer-friendly clarity.
Thrives in ambiguous, fast-moving environments and treats teammates with kindness.

Bonus if you also have:

Familiarity with video/audio processing, embeddings, or multimodal AI workflows.
Prior experience curating or packaging datasets for machine learning.
Background in content analysis, recommendation systems, or information retrieval.

Protege Values

Pass the Loved Ones’ Test
We act with integrity and do the right thing — especially when it’s hard and no one is watching.
Always Find a Way
We are resourceful, resilient builders who solve hard problems and push through obstacles.
Go Fast and Grow Fast
Velocity matters. We move with urgency, learn quickly, and continuously improve as individuals and as a company.
Practice Kindness and Candor
We communicate directly and respectfully, building trust through honest feedback and genuine care for one another.
Deliver Together
We win as one team. Collaboration, accountability, and shared ownership drive our success.
Own the Outcome. Hone the Craft.
We take pride in our work, sweat the details, and continuously raise the bar for excellence.
