This is a remote position.
Senior Data Engineer - AI Context & Knowledge Systems
We are looking for a Senior Data Engineer to build the "memory" and "knowledge" backbone of our Agentic AI ecosystem. You will design the data pipelines that feed our Model Context Protocol (MCP) servers, ensuring that AI agents managed via Gravitee have real-time access to accurate, secure, and contextually relevant enterprise data.
Key Responsibilities
- Context Engineering: Design and optimize data schemas specifically for LLM consumption, ensuring that data retrieved via MCP servers is structured to minimize token usage and maximize reasoning accuracy.
- Hybrid Pipeline Development: Build robust data pipelines using Python (for AI/ML workflows) and C#/.NET (for enterprise integration) to move data from legacy systems into AI-ready formats.
- Vector Database Management: Implement and maintain Vector Databases (e.g., Pinecone, Weaviate, or Milvus) to support Retrieval-Augmented Generation (RAG) alongside live API tool calls.
- Data Governance for AI: Work with the Gravitee API Gateway to enforce data masking, PII redaction, and fine-grained access control before data reaches an LLM.
- Metadata Orchestration: Manage the OpenAPI and MCP metadata that allows AI agents to "understand" the data they are querying.
Technical Qualifications
- Languages: Expert-level Python (Pandas, PySpark, SQLAlchemy) and strong familiarity with C# for interacting with .NET-based data layers.
- AI Data Stack: Hands-on experience with Vector Databases and embedding models.
- API Management: Understanding of how data is exposed through Gravitee APIM and secured via MCP-specific authorization flows.
- Modern Data Stack: Experience with SQL/NoSQL databases, dbt, and cloud data warehouses (Snowflake, BigQuery, or Databricks).
- Protocol Knowledge: Familiarity with the Model Context Protocol (MCP) and how it standardizes data retrieval for AI agents.
Preferred Skills
- Experience building Knowledge Graphs to provide relational context to AI agents.
- Familiarity with semantic caching to reduce LLM costs and improve response times.
- Knowledge of Gravitee Observability for monitoring data drift in agentic conversations.