Monarch Money

AI Platform Engineer (Senior/Staff)

Posted 2 Hours Ago

Remote

Hiring Remotely in USA

90K-210K Annually

Senior level

Design and build a unified AI platform, improve machine learning systems, ensure production excellence, and collaborate with infrastructure teams. The role focuses on retrieval systems and observability for mission-critical services.

About Us:

Monarch is a powerful, all-in-one personal finance platform designed to help make the complexity of finances feel simple again. Since launching in 2021, we’ve become the top-recommended personal finance app by users and experts. Our goal? To take the stress out of finances so our members can focus on what truly matters.

We are a team of doers led by experienced entrepreneurs who are passionate about helping our members reach their financial goals. We are hyper-focused on building a product people love and on continuing to evolve it based on user feedback.

As a fully remote company (even before COVID!), we welcome applicants from almost anywhere. Our team collaborates synchronously, mostly from 9 AM to 2 PM PT, and embraces asynchronous work to stay connected across time zones.

Join us on our mission to transform lives by simplifying money, together.

The Role:

At Monarch, AI is the engine that will power the next generation of intelligent and intuitive product experiences for our users. We're looking for our first AI Platform Engineer to design, build, and own the central nervous system for our machine learning and large language model (LLM) initiatives.

This isn't a typical MLOps role. You won't just be managing pipelines; you'll be the architect of our AI infrastructure, creating the paved road that enables our product and AI teams to ship features faster, safer, and smarter. You will be the force multiplier for all things AI at Monarch, making critical decisions on everything from our retrieval architecture to how we manage a multi-model LLM strategy.

You'll collaborate closely with our Infrastructure team, who manage the underlying cloud, networking, and compute fabric. Your focus will be on the entire ML/LLM application layer: reliability, evaluation, safety, observability, and performance.

What You'll Do:

Build the Central AI Platform: Design and build a unified, resilient platform for deploying and serving AI features. This includes creating a routing layer with provider fallbacks, circuit breakers, and cost/latency-aware model selection. You'll also establish a central registry for versioning models and prompts and create robust CI/CD pipelines.
Architect for Scale and Quality: Own our end-to-end Retrieval-Augmented Generation (RAG) strategy. You'll lead the design of embedding pipelines, develop optimal chunking strategies, implement hybrid search, and manage index maintenance. Crucially, you'll build and scale our LLM evaluation tooling, using methods like golden sets, rubric-based scoring, and LLM-as-judge with bias controls.
Ensure Production Excellence: Instrument our AI systems with deep observability, including structured tracing, cost attribution, and latency metrics. Define and uphold SLOs, create incident response runbooks, and build the guardrails necessary for running mission-critical AI services.

A Partnership with Infrastructure:

You Own: The LLM runtime, retrieval architecture (vector stores, indexing), evaluation frameworks, safety guardrails, prompt/model versioning, AI observability, and cost/latency optimization.
Infra Owns: The core cloud infrastructure (IaC), networking, secrets management, Kubernetes/GPU orchestration, and shared platform services.
Together You Own: SLAs/SLOs, rollout strategies, incident response protocols, and capacity planning for all AI services.

What You'll Bring:

5+ years of experience in software or machine learning engineering, with at least 2 years in a role focused on building and operating production ML/LLM systems.
A proven track record of shipping and scaling LLM-backed applications, with deep, hands-on expertise in the surrounding ecosystem.
Expertise in modern LLM retrieval systems, including hands-on work with embedding pipelines, hybrid search, chunking strategies, and index maintenance.
Demonstrated experience building robust LLM eval tooling (e.g., golden sets, rubric scoring, LLM-as-judge).
Practical knowledge of building resilient LLM routing and orchestration layers, incorporating provider fallbacks, circuit breakers, and cost/latency-aware selection.
Strong programming skills in Python and a history of building production-grade automation and services.
A strategic mindset, comfortable making build-vs-buy decisions and designing systems for long-term reliability and cost efficiency.

Nice to Haves:

Reproducible Training & Fine-Tuning: Experience building containerized, reproducible training jobs with robust experiment tracking (e.g., Weights & Biases, MLflow), dataset versioning, and standardized evaluation harnesses (e.g., lm-eval, HELM).
ML Serving & Orchestration: Kubernetes-native serving (KServe, Seldon), model servers (Triton), and workflow orchestrators.
Vector Databases: Hands-on experience with systems like OpenSearch, pgvector, Pinecone, or Weaviate at scale.
Agentic Systems: Designing and building multi-step, tool-using agents (e.g., using frameworks like LangGraph).
Security & Safety: Experience with red-teaming exercises, building adversarial tests, and implementing robust safety filters.

Typical Process:

Recruiter Video Call
Hiring Manager Video Call
Take-Home or Pairing Exercise
Virtual Onsite (2-3 rounds)
Reference Checks
Offer

Benefits:

Work wherever you want! As a fully remote company with no central office, we want you to work wherever you are happiest and most productive, whether that’s your home, a co-working space, or elsewhere.
Competitive cash and equity compensation in a hyper-growth, early-stage company 🚀.
Stipend to set up your ideal working environment.
Competitive benefit plans based on your location (e.g., in the US we offer medical, dental, and vision benefits and the ability to contribute to a 401(k) plan).
Unlimited PTO.
A 3-day weekend every month! We take the “First Friday” of every month off to focus on rest, recuperation, or just having fun!

We are an equal opportunity employer and value diversity. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Top Skills

CI/CD

Golden Sets

Hybrid Search

Kubernetes

LLM-as-Judge

MLflow

OpenSearch

pgvector

Pinecone

Python

RAG

Weaviate

Weights & Biases
