Sayari

Data Engineering Intern

Reposted 2 Days Ago
Remote
Hiring Remotely in United States
Internship
Assist the Data Engineering team in collecting global data, maintaining ETL pipelines, and developing new ones for Sayari Graph.
The summary above was generated by AI

About Sayari: 

Sayari is the counterparty and supply chain risk intelligence provider trusted by government agencies, multinational corporations, and financial institutions. Its intuitive network analysis platform surfaces hidden risk through integrated corporate ownership, supply chain, trade transaction and risk intelligence data from over 250 jurisdictions. Sayari is headquartered in Washington, D.C., and its solutions are used by thousands of frontline analysts in over 35 countries.


Our company culture is defined by a dedication to our mission of using open data to enhance visibility into global commercial and financial networks, a passion for finding novel approaches to complex problems, and an understanding that diverse perspectives create optimal outcomes. We embrace cross-team collaboration, encourage training and learning opportunities, and reward initiative and innovation. If you like working with supportive, high-performing, and curious teams, Sayari is the place for you.


Internship Description:

Sayari is looking for an intern to join its Data Engineering team! Sayari's flagship product, Sayari Graph, provides instant access to structured business information from billions of corporate, legal, and trade records. As a member of Sayari's data team, you will work with our Product and Software Engineering teams to collect data from around the globe, maintain existing ETL pipelines, and develop new pipelines that power Sayari Graph.


Our application tier is built primarily in TypeScript, running in Kubernetes, and backed by Postgres, Cassandra, Elasticsearch, and Memgraph. Our data ingest tier runs on Spark, processing terabytes of data collected from hundreds of data sources. The platform allows users to explore a large knowledge graph sourced from hundreds of millions of structured and unstructured records from over 200 countries and 30 languages. As part of this team, you'll have the chance to contribute to our growing library of open-source work, including our WebGL-powered network visualization library Trellis.


This is a remote, paid internship with an expected workload of 20 to 30 hours a week.

Job Responsibilities:

  • Write and deploy crawling scripts to collect source data from the web
  • Write and run data transformers in Scala Spark to standardize bulk data sets
  • Write and run modules in Python to parse entity references and relationships from source data
  • Diagnose and fix bugs reported by internal and external users
  • Analyze and report on internal datasets to answer questions and inform feature work
  • Work collaboratively within and across engineering teams using basic agile principles
  • Give and receive feedback through code reviews

Required Skills & Experience:

  • Experience with Python and/or a JVM language (e.g., Scala)
  • Experience working collaboratively with git

Desired Skills & Experience:

  • Experience with Apache Spark and Apache Airflow
  • Experience working on a cloud platform like GCP, AWS, or Azure
  • Understanding of or interest in knowledge graphs

What We Offer: 

  • A collaborative and positive culture - your team will be as smart and driven as you
  • Limitless growth and learning opportunities
  • A strong commitment to diversity, equity, and inclusion
  • Team building events & opportunities

Sayari is an equal opportunity employer and strongly encourages diverse candidates to apply. We believe diversity and inclusion mean our team members should reflect the diversity of the United States. No employee or applicant will face discrimination or harassment based on race, color, ethnicity, religion, age, gender, gender identity or expression, sexual orientation, disability status, veteran status, genetics, or political affiliation. We strongly encourage applicants of all backgrounds to apply.

Top Skills

Cassandra
Elasticsearch
Kubernetes
Memgraph
Postgres
Spark
TypeScript
