Articul8 AI

Machine Learning Engineer - Data Pipeline

Reposted 15 Hours Ago

Dublin, CA

Mid level

The Machine Learning Engineer will design and develop data processing pipelines, implement ML models, and lead data acquisition engineering projects.

About us:

At Articul8 AI, we relentlessly pursue excellence and create exceptional AI products that exceed customer expectations. We are a team of dedicated individuals who take pride in our work and strive for greatness in every aspect of our business. We believe in using our advantages to make a positive impact on the world and inspiring others to do the same.

Job Description:

We are seeking machine learning engineers to join our team full-time. In this role, you will help us build pipelines for data collection, data extraction, data filtering, synthetic data generation, and data analysis. You will own, end to end, all work related to acquiring the high-quality data used to train our domain-specific models. You will work closely with other researchers and engineers to power our next generation of domain-specific models. We value rapid prototyping, iterating, and shipping new systems quickly.

Required Qualifications:

BS/MS/PhD in Computer Science or a related field.

Proficiency in at least one deep learning framework, such as PyTorch.

Experience with machine learning projects in text or vision, e.g., having trained machine learning models to tackle a specific problem.

Strong expertise in large stateful distributed systems and data processing.

Strong proficiency in building large-scale data processing pipelines and familiarity with distributed workloads (e.g., multiprocessing, Ray, Docker, Kubernetes).

Proficiency in at least one programming language commonly used in machine learning, such as Python, and the ability to write clean, maintainable code.

Excellent problem-solving skills and attention to detail, especially when handling data anomalies and biases to further improve data quality.

Key Competencies:

Active GitHub contributions are a big plus.

Experience in building large-scale datasets.

Familiarity with at least one tool for data crawling (e.g., Scrapy), data collection (e.g., VPNs, Selenium), or data processing (e.g., Hadoop, Datasketch).

Building bespoke data processing libraries from scratch.

Keeping up with state-of-the-art techniques for preparing AI training data.

Organizing and meticulously bookkeeping data across multiple clouds, of multiple modalities, and from many sources.

Multilingual ability, which enriches the language diversity crucial for robust model training.

Responsibilities:

Design and develop data processing pipelines, including data extraction, data filtering, data labeling, etc.

Implement machine learning models to improve the quality and diversity of data (especially in the data extraction stage), e.g., quality classifier, document layout model, code verification model, etc.

Own and lead engineering projects in the area of data acquisition, including web crawling, data ingestion, and processing.

Collaborate with our Applied Research, Technology, and Architecture teams to ensure smooth data flow and system operability.

Develop and deploy highly scalable distributed systems capable of handling terabytes of data.

Architect and implement algorithms for data indexing and search capabilities.

Build and maintain backend services for data storage, including work with key-value databases and synchronization.

Deploy solutions in a Kubernetes Infrastructure-as-Code environment and perform routine system checks.

By joining our team, you become part of a community that embraces diversity, inclusiveness, and lifelong learning. We nurture curiosity and creativity, encouraging exploration beyond conventional wisdom. Through mentorship, knowledge exchange, and constructive feedback, we cultivate an environment that supports both personal and professional development.

Your future experience at Articul8 will include continuous learning and growth opportunities as we embark on an exciting journey to disrupt the status quo. If you're excited about joining a team that's passionate about making a difference, we want to hear from you.

If you're ready to join a team that's changing the game, apply now to become a part of the Articul8 team. Join us on this adventure and help shape the future of Generative AI in the enterprise.

Top Skills

Docker

Hadoop

Kubernetes

Python

PyTorch

Ray

Scrapy

Selenium
