LexisNexis

Senior Data Scientist II

Reposted 12 Days Ago
Remote
2 Locations
105K-175K Annually
Senior level
Lead design and evolution of a multimodal document understanding and structured extraction platform. Build multimodal model strategy and multi‑agent orchestration, fine‑tune models, define metrics, ensure quality controls, and integrate outputs into business systems.
The summary above was generated by AI

Are you looking to develop your Data Scientist career?

Would you enjoy training more junior team members?

About the Team:

LexisNexis Legal & Professional, which serves customers in more than 150 countries with 11,800 employees worldwide, is part of RELX (www.relx.com), a global provider of information-based analytics and decision tools for professional and business customers. Our company has been a long-time leader in deploying AI and advanced technologies to the legal market to improve productivity and transform the overall business and practice of law, deploying ethical and powerful generative AI solutions with a flexible, multi-model approach that prioritizes using the best model from today’s top model creators for each individual legal use case. The company employs over 2,000 technologists, data scientists, and experts to develop, test, and validate solutions in line with RELX Responsible AI Principles (https://stories.relx.com/responsible-ai-principles/index.html).

About the Role:

Join our team to help build state-of-the-art research tools. Our Data Science teams focus on extracting key information, such as entity mentions, sentiment, data enrichments, and predictive insights, to build the best-in-class data and news streams relied on by our global customer base. In this role, you will be responsible for the end‑to‑end design and continuous evolution of a multimodal document understanding and structured data extraction platform, covering complex PDF and scanned-page layout analysis, semantic extraction, structural reconstruction, quality validation, and business integration. You will lead multimodal model strategy (vision + language + layout) and multi‑agent collaboration (task decomposition, verification, conflict reconciliation, feedback loops), and plan future customized training and ongoing optimization of models.

Responsibilities:
  • Design and iterate the multimodal document parsing pipeline: layout / structural modeling, semantic extraction, cross‑modal alignment, structural reconstruction.

  • Build and optimize a multi‑agent collaboration mechanism: task splitting, parallel / sequential scheduling, peer review, iterative quality improvement loops.

  • Define model selection / composition / routing strategies (dynamic dispatch by document type, structural patterns, quality signals).

  • Plan and execute model fine‑tuning, domain adaptation, continual learning, active learning, and data feedback loops.

  • Establish end‑to‑end metrics: extraction accuracy, structural consistency, agent collaboration effectiveness, latency, stability, and cost.

  • Build quality assurance and risk controls: drift & anomaly monitoring, confidence estimation, fallback strategies, alignment / compliance checks.

  • Drive mapping and consistency between agent / model outputs and business knowledge field standards.

Requirements:
  • Education: Master’s degree or above in a quantitative or technical field (Statistics, Computer Science, Mathematics, Data Science, etc.).

  • Experience: 5+ years of hands‑on machine learning / data science experience. Proven delivery experience in multimodal (vision + text) or complex document understanding. Practical experience orchestrating agents (or modular processing logic) in production workflows.

  • Solid foundation in machine learning / deep learning fundamentals, multimodal representations, and cross‑modal alignment concepts.

  • Deep understanding of core principles and common algorithms for multimodal large models: cross‑modal attention & representation alignment, vision/text embedding fusion, hierarchical & layout structure modeling, instruction & contrastive paradigms, long‑context and retrieval‑augmented mechanisms, evaluation and failure mode dissection.

  • Familiar with classic image and signal processing methods: edge & contour detection, filtering & denoising, morphological operations, segmentation & key point feature extraction, frequency / time‑frequency analysis, image enhancement & quality assessment; understands trade‑offs and complementarity with deep features.

  • Knowledge of multi‑agent collaboration patterns: role assignment, task routing, feedback loops, redundancy & cross‑checks.

  • Strong in statistical analysis & experimental design: hypothesis testing, factorial design, power analysis, A/B and multivariate evaluation.

  • Able to decompose complex problems and build metric‑driven optimization paths. Rigorous in data quality & error analysis; rapid bottleneck identification.

  • Ability to translate research pseudo‑code into maintainable, testable Python modules with benchmarking & regression harnesses.

Preferred Experience:

  • Designed customization / fine‑tuning of multimodal foundation models, representation learning, or structural understanding subsystems.

  • Built an agent orchestration platform: task decomposition, iterative self‑checks, consensus or voting mechanisms.

  • Experience solving robustness & generalization challenges in large‑scale long documents / heterogeneous layouts.

  • Demonstrated results in cost optimization (model pruning, parameter‑efficient tuning, inference acceleration) or adaptive load scheduling.

  • Publications / patents or open‑source contributions.

  • Demonstrated Python systems optimization (e.g., custom Cython / CUDA kernels, vectorization replacing Python loops, latency reductions in inference pipelines).

U.S. National Base Pay Range: $104,900 - $174,700. Geographic differentials may apply in some locations to better reflect local market rates. This job is eligible for an annual incentive bonus.

We know your well-being and happiness are key to a long and successful career. We are delighted to offer country-specific benefits.

We are committed to providing a fair and accessible hiring process. If you have a disability or other need that requires accommodation or adjustment, please let us know by completing our Applicant Request Support Form or by calling 1-855-833-5120.

Criminals may pose as recruiters asking for money or personal information. We never request money or banking details from job applicants. Learn more about spotting and avoiding scams here.

Please read our Candidate Privacy Policy.

We are an equal opportunity employer: qualified applicants are considered for and treated during employment without regard to race, color, creed, religion, sex, national origin, citizenship status, disability status, protected veteran status, age, marital status, sexual orientation, gender identity, genetic information, or any other characteristic protected by law.

USA Job Seekers:

EEO Know Your Rights.

Top Skills

Agent Orchestration
CUDA
Cython
Embeddings
Multimodal Foundation Models
Python
Retrieval-Augmented Generation
