Data Engineer II, Infrastructure

Hybrid
Sorry, this job was removed at 4:33 a.m. (PST) on Friday, March 26, 2021

THE PURPOSE:

The Data Infrastructure Engineer works within the Data Solutions organization on critical data storage, processing, and management initiatives. They will help design and implement systems that enable the rest of the Data Solutions organization to provide data and analysis services to the broader Slickdeals organization.

They will also partner with the architecture and infrastructure teams to take our platform beyond its current data-warehouse-centric capabilities and support the ingestion and analysis of increasingly large streams of user and system event data. The end goal of extending our data platform is to provide more connected, actionable, and timely data across Slickdeals.


THE CANDIDATE:

  • Involvement in the design and implementation of storage, data streaming, and orchestration solutions in a cloud-based big data platform
  • Proven experience with cloud data lake architectures in one or more public clouds, in a large-scale production environment with billions of rows and petabytes of storage
  • Proven experience running large-scale distributed computing infrastructure such as Spark, Hive, Presto, and Kafka
  • Ability to design and maintain data pipelines in cloud or on-premises environments and to implement reliable, scalable, and performant distributed systems using batch or event-driven architectures
  • Strong experience with AWS and strong knowledge of AWS data tools (EMR, Glue, Redshift, S3, Kafka)
  • Experience deploying and configuring tools such as Kafka, Spark, Presto, Airflow, and MQTT, as well as microservices
  • Experience with Docker and Kubernetes, and with containerization and virtualization concepts
  • Configuration management using continuous integration and test automation frameworks such as Jenkins (with CI/CD and Git hooks), plus software revision control with GitLab or Bitbucket
  • Strong skills in SQL, Java, and/or Python, with the ability to optimize code
  • Strong experience with Apache big data frameworks (Hadoop/EMR/Databricks, Spark, Hive), including Spark performance optimization and troubleshooting
  • Production experience setting up and using workflow scheduling/orchestration tools (Airflow, NiFi, Kubeflow, Prefect, Temporal, Rundeck, Cadence, or Jenkins)

PREFERRED EXPERIENCE:

  • Experience with Tableau, Excel, and/or other BI tools for visualization and reporting.
  • Extensive programming experience, especially in Bash, Java, Python, PHP, and/or Go
  • Experience managing distributed systems such as Elasticsearch, Logstash, and Kibana (ELK)
  • Experience working with and designing complex data schemas, and performing data transformations, enrichments, and manipulations with efficiency and reusability in mind, using well-known patterns and data structures
  • Performing SRE activities such as availability and reliability monitoring and reporting
  • Setting up infrastructure as code using tools such as Terraform, Chef/Cinc, or AWS CloudFormation
  • Designing and implementing highly scalable infrastructure for the data processing platform, including deploying microservices on Kubernetes
  • Auto-scaling and performance monitoring for Kubernetes and running applications, using Prometheus and Grafana or similar tools
  • Very comfortable with SSH, bash and sh, pipes, and common *nix tools and operating systems
  • Implementing security authentication, authorization, and auditing of activities

To Be Successful You Will Be:

  • Highly motivated, with a great attitude and a desire to dive into raw data to understand behavioral trends and find insights
  • An excellent multitasker who can execute multiple requests and reports under tight timelines
  • Inquisitive, self-starter, able to work autonomously
  • Able to work in a fast-paced, dynamic, startup-like environment
  • Detail-oriented tactician who strives for perfection
  • Strong verbal and written communication (and listening) skills
  • Excellent reading comprehension and attention to detail.
  • Strong problem-solving skills
  • Strong documentation skills as you code (Jira, Confluence)

As a Data Infrastructure Engineer, your day-to-day tasks will include:

  • Helping us leverage large-scale data stores and data infrastructure by building out data pipelines, streams, and utilities in Spark and other technologies (Flink, Storm) to provision and serve data to ML/AI models and cloud data warehouse systems
  • Developing robust, low-latency, fault-tolerant pipelines to support business-critical systems and feedback loops
  • Hardening and measuring fault tolerance and robustness of our data pipelines
  • Working with on-premises and cloud technologies to build and deploy your applications

Environment

Can work effectively on a small, nimble team, with no trouble context-switching

Education

B.S./M.S. in Computer Science or Computer Engineering or 3+ years of equivalent experience


Location

10990 Wilshire Blvd, Suite 1800, Westwood, CA 90024
