Data Engineer L4 – Agentic AI Chat Platform
Location: Calgary, Canada (Remote)
Employment Type: Full-Time
Robots & Pencils is seeking a Level 4 Data Engineer to design, build, and optimize the data infrastructure supporting an AI-powered chat platform that helps students get instant, intuitive answers across a wide range of topics.
Our client currently operates a chatbot that is experiencing critical issues. This project will rebuild and modernize the chat experience, leveraging large language models (LLMs) and agentic AI capabilities to deliver natural, conversational interactions that meet modern user expectations.
As a senior member of the data engineering team, you’ll take ownership of the data ecosystem—designing and implementing pipelines, integrating data systems, and enabling scalable data access for AI and conversational models. You’ll collaborate closely with AI engineers, ML specialists, and software developers to ensure that data is reliable, high-quality, and efficiently orchestrated across the platform.
Key Responsibilities
Data Architecture & Pipeline Development
- Design, build, and maintain scalable data pipelines supporting ingestion, transformation, and retrieval for LLM-based chat systems.
- Develop and optimize data models and architectures to enable real-time and batch data processing.
- Collaborate with AI engineers to integrate structured and unstructured data sources for training, inference, and contextual memory.
- Implement and enforce data quality, validation, and governance standards.
- Build and manage data APIs and microservices that interface with agentic AI components.
- Support data versioning, lineage, and observability practices to ensure system transparency and reliability.
Technical Strategy & Innovation
- Evaluate and implement data tools, frameworks, and cloud services that align with the agentic AI ecosystem.
- Contribute to the design of retrieval-augmented generation (RAG) and vector data pipelines.
- Explore new technologies for embedding management, semantic search, and contextual data enrichment.
- Drive automation in data operations through orchestration frameworks and CI/CD pipelines.
- Partner with architects and ML engineers to ensure alignment between data and model lifecycle workflows.
Collaboration & Execution
- Work closely with AI, backend, and product teams to translate data needs into technical solutions.
- Partner with DevOps to optimize cloud infrastructure for performance and scalability.
- Participate in sprint planning, backlog reviews, and data-centric technical design sessions.
- Mentor junior data engineers on best practices for ETL design, testing, and documentation.
Process & Governance
- Establish data reliability and observability standards for AI-driven workloads.
- Contribute to system modernization efforts by refactoring legacy data processes into cloud-native architectures.
- Participate in design reviews, documentation, and knowledge-sharing sessions.
- Ensure compliance with data privacy, security, and governance frameworks (GDPR, PIPEDA).
Required Skills & Qualifications
- 5+ years of experience in data engineering, with proven work in cloud or AI environments.
- Proficiency in building data pipelines using Airflow, dbt, Spark, or Kafka.
- Strong skills in SQL and one or more programming languages such as Python, Scala, or Java.
- Hands-on experience with cloud data platforms (Amazon Redshift, BigQuery, Azure Synapse, or Snowflake).
- Deep understanding of ETL/ELT design, data modeling, and API-based data integration.
- Familiarity with LLM ecosystems, agentic AI concepts, and vector databases.
- Experience implementing scalable, secure, and highly available data systems.
- Excellent documentation and collaboration skills in agile team environments.
Nice to Have
- Experience with retrieval-augmented generation (RAG) pipelines and embedding-based data flows.
- Knowledge of data observability tools, lineage frameworks, and ML data management.
- Prior experience in education technology or conversational AI platforms.
- Familiarity with agentic orchestration frameworks such as LangChain, LlamaIndex, or Semantic Kernel.
- Understanding of distributed computing frameworks and real-time event processing.
- Contributions to open-source data or AI tooling.
Personal Competencies
- System Design Thinking – Creates robust, modular data systems that scale with evolving AI needs.
- Collaboration – Works effectively across teams to ensure data readiness for AI and analytics.
- Innovation – Embraces modern data tools and AI-driven data pipelines.
- Quality Focus – Ensures accuracy, consistency, and performance in all data deliverables.
- Adaptability – Thrives in a dynamic environment with emerging AI technologies.
- Strategic Mindset – Aligns data solutions with long-term platform and business goals.
- Technical Leadership – Provides mentorship and promotes engineering excellence.