As a Software Developer, ETL at Spokeo, you will be responsible for implementing ETL processes using a variety of data sources. This can include locating and analyzing source data; creating data flows to extract, profile, and store ingested data; defining and implementing data cleansing; mapping data to a common schema; transforming data to satisfy business rules; and validating content.
Responsibilities:
- Profiling source data to assess its quality and inform business logic.
- Collaborating with the team to understand and translate business requirements into transformations.
- Collaborating with Data Engineers to optimize, automate and integrate new components into the data pipeline.
- Adhering to technical best practices for data governance, data quality, data cleansing, and other ETL-related activities.
- Performing ad-hoc investigations into data anomalies as needed.
Requirements:
- Minimum of two years of development experience with Pentaho (or an equivalent tool such as Talend, DataStage, or Informatica).
- Advanced SQL coding skills for data transformations, profiling, and query tasks.
- Experience in Agile environments using frameworks such as Scrum and Kanban.
- Open-source big data skills preferred, in tools such as Hive, HBase, Parquet, and Spark.
- B.S. preferred in Computer Science, Information Systems, or a related field.