Exactera

Principal Data Platform Engineer

Reposted 8 Days Ago
Remote
Hiring Remotely in USA
Senior level
The Principal Data Platform Engineer will architect a centralized data platform on Databricks, focusing on data governance, performance optimization, and enabling data engineers. Responsibilities include designing data models, implementing ETL pipelines, and managing third-party integrations.
The summary above was generated by AI

Exactera has offices in New York City; Tarrytown, NY; San Diego, CA; London; and Argentina.

The Role

As Principal Data Platform Engineer, you'll architect and implement our centralized data platform on Databricks. You'll establish governance patterns using Unity Catalog, optimize for cost and performance at scale, and enable our existing Data Engineers to build confidently on the platform. This is a data infrastructure role—focused on pipelines, storage, governance, and platform operations. 


The Business Challenge

We operate multiple product lines (Transfer Pricing, R&D Services, RoyaltyStat, Provisioning), each with distinct databases containing enterprise financial data—journal entries, general ledgers, and financial statements. Our immediate challenge is migrating multi-terabyte datasets from legacy systems to a unified Databricks lakehouse while establishing governance patterns that enable multi-product operations at scale.


What You'll Build
  • Data Structuring: Design data models and implement unified schemas across multiple disparate product lines.
  • Unity Catalog Architecture: Design and implement a multi-catalog governance strategy supporting data isolation, cross-product data sharing, and comprehensive lineage tracking across our product portfolio.
  • Delta Lake Optimization: Establish patterns for Z-ordering, compaction, and liquid clustering at multi-TB scale. Define table structures, partitioning strategies, and retention policies that balance query performance with storage costs.
  • ETL Pipeline Framework: Build declarative pipeline patterns using Delta Live Tables. Create orchestration workflows for ingesting data from internal sources such as SQL databases and S3.
  • Third-Party Integrations: Integrate with third-party data sources such as ERP systems (e.g., NetSuite) and external data providers (e.g., S&P), with automated ingestion, robust error handling, and monitoring.
  • Platform Operations: Implement cost monitoring and optimization strategies, establish data quality frameworks, and create self-service patterns that let Data Engineers work independently while maintaining governance standards.

Business Problems You'll Solve
  • Key Legacy Product Migrations: Lead the architecture for migrating multi-terabyte datasets from legacy systems to Databricks, establishing patterns that will be reused across multiple product lines.
  • Multi-Product Data Architecture: Design Unity Catalog structures enabling secure data separation between product lines while allowing controlled cross-product analytics where appropriate.
  • Cost-Efficient Scale: Build infrastructure that scales efficiently through intelligent caching, query optimization, and compute management strategies that avoid linear cost growth.
  • Platform Reliability: Establish monitoring, alerting, and data quality validation so the platform operates reliably as the foundation for both analytics and AI workloads.

Required Experience

Databricks Expertise (Required)

  • Unity Catalog: Production experience with multi-catalog governance, metastore design, and lineage tracking.
  • Data Structuring: Experience designing and building unified schemas across multiple disparate product lines.
  • Delta Lake: Expert-level experience with Z-ordering, compaction, liquid clustering, and performance tuning at multi-TB scale.
  • Delta Live Tables: Strong hands-on experience building declarative ETL pipelines, including change data capture and expectations/constraints.
  • Databricks Workflows: Experience with job orchestration, scheduling, and operational monitoring.
  • Business Intelligence: Experience enabling company-wide analytics and reporting with modern business intelligence tools and maintaining source-of-truth data and metrics.
  • PySpark & Databricks SQL: Strong proficiency for code review, performance tuning, and query optimization.

Core Platform Engineering

  • 5-8 years in data engineering or data platform roles, with 3+ years hands-on Databricks experience
  • Track record leading at least one significant platform build or migration project
  • AWS experience (S3, IAM, VPC) with ability to collaborate on infrastructure decisions
  • Infrastructure-as-code experience (Terraform preferred)

Technical Leadership

  • Demonstrated ability to architect data platforms from first principles and defend technical decisions
  • Strong written and verbal communication, including documenting architecture decisions and presenting them to both technical and business stakeholders

Preferred But Not Required

  • Experience with financial data, accounting systems (NetSuite), or enterprise ERP platforms
  • Background building platforms that serve AI/ML workloads (experience preparing data for downstream ML consumption, RAG and retrieval, and LLMs)
  • Understanding of advanced intelligence concepts such as relationship surfacing with knowledge graphs
  • Familiarity with data governance frameworks and compliance requirements for regulated industries



What We Offer:
(The following only applies to US-based positions)
  • A collaborative team culture with opportunities for career development. 
  • Ample opportunities to be recognized, build valuable skills, and grow your career. 
  • Generous vacation policy, including paid parental leave. 
  • Comprehensive health plans with FSA and HSA options. 
  • 401(k) retirement plan. 
  • Life and disability insurance coverage. 
  • Supplemental benefits like a dependent care savings plan, pet insurance, will preparation, and an employee assistance program. 

About Us:
At Exactera, a FinTech SaaS start-up founded in 2016, we stand at the intersection of human and machine intelligence. Our corporate tax solutions are powered by AI and cloud-based technologies, serving customers worldwide. With over $100 million in funding from Savant Venture Fund and Insight Partners, we are poised for growth. We are committed to diversity, inclusion, and equal opportunities for all. 
