Optum

Principal AI / Machine Learning Data Engineer - Remote or hybrid from MN or DC

Posted 2 Hours Ago
Remote or Hybrid
Hiring Remotely in Eden Prairie, MN
113K-193K Annually
Senior level
Design, build, and operate scalable data pipelines and AI-ready data products from large structured and unstructured sources (OCR/images/documents). Enable production Generative AI (RAG, semantic search), ensure data quality/observability, orchestrate CI/CD and infra-as-code, and mentor engineers while collaborating with product, analytics, and compliance teams.
The summary above was generated by AI
Requisition Number: 2373757
Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together.
The Enterprise Information Security (EIS) team is responsible for cybersecurity across our organization. We support our business and members by reducing risk, rapidly responding to threats, focusing on business resiliency and securing new acquisitions.
The Principal AI Data Engineer will design and build end-to-end AI pipelines for large-scale unstructured data, enabling advanced analytics, Generative AI, and investigative insights.
This role will transform raw, complex datasets (such as scanned documents, images, PRFs and other OCR-driven unstructured data sources) into AI-ready, searchable, and model-integrated data products. You will play a key role in building LLM-powered systems (e.g., RAG, semantic search, summarization, and insight extraction) and scaling them into production environments.
This position sits at the intersection of data engineering and AI, with an emphasis on building modern data pipelines and enabling production-grade AI capabilities.
You'll enjoy the flexibility to work remotely* from anywhere within the U.S. as you take on some tough challenges. For all hires in the Minneapolis or Washington, D.C. area, you will be required to work in the office a minimum of four days per week.
Primary Responsibilities:
  • Design, develop, and maintain scalable data pipelines and data platforms supporting analytics, machine learning, and AI use cases
  • Build and optimize ingestion frameworks for large-scale structured and unstructured data, including streaming and event-driven sources
  • Partner with cross-functional stakeholders to understand evolving data and AI needs and define long-term technical solutions
  • Enable and support machine learning and AI workflows, including feature engineering, data preparation, and model deployment support
  • Drive strategic initiatives around Generative AI, data quality, observability, lineage, and governance
  • Develop and maintain frameworks that support rapid experimentation and deployment of AI/ML solutions
  • Introduce and evolve best practices in data modeling, orchestration, testing, and monitoring
  • Identify and champion opportunities for platform scalability, performance optimization, and cost efficiency
  • Collaborate with product, analytics, and infrastructure teams to deliver high-impact data and AI solutions
  • Build and maintain reusable parsing, enrichment, analytic, and service libraries to accelerate delivery across teams
  • Work comfortably under time-sensitive conditions while ensuring thoroughness
  • Maintain high ethical standards and the ability to remain objective and confidential
  • Build and operate production data platforms and pipelines across batch and streaming workloads
  • Do hands-on engineering in Python and SQL, and in JVM languages (Java/Scala) within Spark ecosystems
  • Apply distributed processing and lakehouse/warehouse patterns (e.g., Spark/PySpark, Databricks, Snowflake)
  • Build pipelines for OCR, document parsing, and text extraction from image-based or scanned data sources
  • Enable Generative AI solutions in production (e.g., RAG-style architectures), including retrieval patterns and evaluation/monitoring practices
  • Take knowledge-centric data approaches (e.g., metadata-driven systems, entity resolution, and/or graph concepts) to improve discoverability and downstream analytics
  • Bring a data quality, observability, and monitoring mindset (profiling, validation, alerting, and reliability improvements)
  • Own orchestration, CI/CD, containerization, and infrastructure-as-code (e.g., Airflow, GitHub Actions, Docker, Terraform, Kubernetes)
  • Work in the cloud (AWS, Azure, and/or GCP), including secure handling of sensitive data (PII/PHI) and collaboration with compliance partners
  • Lead through influence, mentor engineers, and translate ambiguous problems into scalable technical roadmaps

You'll be rewarded and recognized for your performance in an environment that will challenge you and give you clear direction on what it takes to succeed in your role as well as provide development for other roles you may be interested in.
Required Qualifications:
  • Bachelor's degree or equivalent experience
  • 5+ years of experience designing, building, and operating scalable data pipelines and platforms (batch + streaming)
  • 2+ years of experience deploying Generative AI solutions to production (e.g., RAG, LLM-powered pipelines, semantic search)
  • Proven hands-on development skills in Python and SQL, with experience in Spark/PySpark and Databricks (or similar distributed platforms)
  • Experience building ingestion and processing frameworks for unstructured data (OCR, documents, images), including parsing and enrichment
  • Experience with cloud platforms (AWS/Azure/GCP), DevOps/CI/CD, and infrastructure-as-code, including secure handling of sensitive data (PII/PHI)
  • Proven ability to design scalable solutions, implement data quality/observability practices, and collaborate across stakeholders

Preferred Qualifications:
  • Experience with cloud platforms such as AWS, Azure, or Google Cloud, including managed data services
  • Experience with streaming and event-driven architectures (e.g., Kafka, Kinesis, Event Hubs)
  • Experience with data quality and validation frameworks (e.g., Great Expectations, Deequ) and/or data observability tooling
  • Experience enabling MLOps practices (e.g., feature stores, model registries, experiment tracking, deployment automation)
  • Experience with lakehouse architectures, Delta Lake, and advanced Spark optimization/performance tuning
  • Experience with data visualization tools and libraries such as Plotly, seaborn, and Chart.js
  • Experience with machine learning and predictive analytics
  • Familiarity with security and privacy concepts for data platforms (e.g., least privilege, PII/PHI handling) and working with compliance partners
  • Solid hands-on engineering in Python and SQL; familiarity with JVM languages (Java/Scala) in Spark ecosystems

*All employees working remotely will be required to adhere to UnitedHealth Group's Telecommuter Policy.
Pay is based on several factors including but not limited to local labor markets, education, work experience, certifications, etc. In addition to your salary, we offer benefits such as a comprehensive benefits package, incentive and recognition programs, equity stock purchase and 401k contribution (all benefits are subject to eligibility requirements). No matter where or when you begin a career with us, you'll find a far-reaching choice of benefits and incentives. The salary for this role will range from $112,700 to $193,200 annually based on full-time employment. We comply with all minimum wage laws as applicable.
Application Deadline: This will be posted for a minimum of 2 business days or until a sufficient candidate pool has been collected. Job posting may come down early due to volume of applicants.
At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone, of every race, gender, sexuality, age, location and income, deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes - an enterprise priority reflected in our mission.
UnitedHealth Group is an Equal Employment Opportunity employer under applicable law and qualified applicants will receive consideration for employment without regard to race, national origin, religion, age, color, sex, sexual orientation, gender identity, disability, or protected veteran status, or any other characteristic protected by local, state, or federal laws, rules, or regulations.
UnitedHealth Group is a drug-free workplace. Candidates are required to pass a drug test before beginning employment.
