DataPelago

Principal Data Processing Software Engineer-OSS

Posted 5 Days Ago
In-Office
Mountain View, CA
6-6 Annually
Senior level
As a Principal Data Processing Engineer, you will enhance a data processing engine, collaborate on open-source platforms, and optimize performance.
The summary above was generated by AI

Principal Data Processing Engineer - OSS
Mountain View, CA 

About DataPelago:

DataPelago is at the forefront of revolutionizing data processing for traditional analytics and cutting-edge GenAI preprocessing. We are building an innovative data processing engine that is transforming how Apache Spark, Apache Flink, Ray, and others operate on diverse, large-scale data. Our team of engineers drives and adopts advances in hardware-accelerated computing, parallel processing of large-scale data, query optimization, distributed systems, compilers, machine learning, and cloud-native computing. We are looking for world-class engineers to join our team and shape the future of accelerated data processing.

The Role:

As a Principal Data Processing Engineer (OSS), you will be a key individual contributor in adopting and advancing the capabilities of open-source software (OSS) platforms such as Apache Gluten, Velox, Apache Spark, and Apache Flink in the context of DataPelago’s data processing engine. You will enhance the functional breadth, performance, scale, and reliability of the DataPelago engine through downstream and upstream contributions. You will have the opportunity to engage with the community working on these platforms. This is a unique opportunity to make a significant impact on a category-defining product and work with a talented team of engineers.

What You'll Do:

  • Influence the architecture of how our data processing engine interfaces with open-source platforms and engines.
  • Lead the design of functional and performance enhancements to open-source platforms such as Apache Gluten and Velox, and their integration with our data processing engine.
  • Individually design, implement, test, optimize, and maintain components of the data processing engine.
  • Analyze the technology roadmap of Apache Gluten, Velox, and equivalent platforms and identify opportunities for our engine to enhance technology and product leadership.
  • Partner with engineering, product management, the open-source community, and customer success teams.
  • Foster best practices in design and code reviews, testing, CI/CD, and issue resolution to maintain the highest product quality, security, efficiency, and productivity.

What You'll Bring:

  • BS/MS in Computer Science (or a related field) with 6+ years of relevant experience.
  • 3+ years of deep technical experience in instrumenting, analyzing, and optimizing the performance of data processing engine components on benchmark and customer workloads.
  • Sound knowledge of the architecture and internal operation of one or more of Apache Spark, Apache Flink, Presto/Trino.
  • Demonstrated experience in the design, development, and successful release of high-performance data processing engines for large production deployments.
  • Exceptional programming skills in C, C++, and Java.
  • Extensive development experience in Linux environments.
  • Excellent communication and collaboration skills, with the ability to articulate complex technical concepts to both technical and non-technical audiences.
  • Strong analytical and problem-solving skills with a passion for performance optimization.

Location Considerations:

We value face-to-face collaboration, but recognize that talent can be found anywhere. Our engineering team works at our headquarters in Mountain View, CA, at our India office in Hyderabad, and at remote locations.

Why Join DataPelago?

  • Technical Leadership: Take a leadership role in shaping the architecture and development of how our core engine works with open-source data processing platforms.
  • Cutting-Edge Innovation: Work on challenging problems at the forefront of accelerated computing and data processing.
  • Significant Impact: Your contributions will directly impact the performance and scalability of our mission-critical platform.
  • Mentorship and Growth: Mentor and guide other talented engineers while expanding your own technical expertise.
  • Competitive compensation, stock options, comprehensive benefits package, and leadership development opportunities

Top Skills

Apache Flink
Spark
C
C++
Java
Linux
