Data Engineer II, Infrastructure

Hybrid
Sorry, this job was removed at 4:33 a.m. (PST) on Friday, March 26, 2021

THE PURPOSE:

The Data Infrastructure Engineer works within the Data Solutions organization on critical data storage, processing, and management initiatives. They will help design and implement systems that enable the rest of the Data Solutions organization to provide data and analysis services to the broader Slickdeals organization.

They will also partner with the architecture and infrastructure teams to take our platform beyond its current data-warehouse-centric capabilities and support the ingestion and analysis of increasingly large streams of user and system event data. The end goal of extending our data platform is to provide more connected, actionable, and timely data across Slickdeals.


THE CANDIDATE:

  • Involvement in the design and implementation of storage, data streaming, and orchestration solutions in a cloud-based big data platform
  • Proven experience with cloud data lake architectures in one or more public clouds, in a large-scale production environment with billions of rows and petabytes of storage
  • Proven experience running large-scale distributed computing infrastructure such as Spark, Hive, Presto, and Kafka
  • Ability to design and maintain data pipelines in cloud or on-premises environments and to implement reliable, scalable, and performant distributed systems using batch or event-driven architectures
  • Strong experience with AWS and strong knowledge of AWS data tools (EMR, Glue, Redshift, S3, Kafka)
  • Experience deploying and configuring tools such as Kafka, Spark, Presto, Airflow, and MQTT, as well as microservices
  • Experience with Docker and Kubernetes, and with containerization and virtualization concepts
  • Configuration management using continuous integration and test automation frameworks such as Jenkins (with CI/CD and Git hooks), plus software revision control with GitLab or Bitbucket
  • Strong skills in SQL, Java, and/or Python, with the ability to optimize code
  • Strong experience with Apache big data frameworks (Hadoop/EMR/Databricks, Spark, Hive), including Spark performance optimization and troubleshooting
  • Production experience setting up and using workflow scheduling/orchestration tools (Airflow, NiFi, Kubeflow, Prefect, Temporal, Rundeck, Cadence, or Jenkins)

PREFERRED EXPERIENCE:

  • Experience with Tableau, Excel, and/or other BI tools for visualization and reporting.
  • Extensive programming experience, especially in Bash, Java, Python, PHP, and/or Go
  • Experience managing distributed systems such as Elasticsearch, Logstash, and Kibana (ELK)
  • Experience working with and designing complex data schemas, and performing data transformations, enrichments, and manipulations with efficiency and reusability in mind, using well-known patterns and data structures
  • Performing SRE activities such as availability and reliability monitoring and reporting
  • Setting up infrastructure as code using tools such as Terraform, Chef/Cinc, or AWS CloudFormation
  • Designing and implementing highly scalable infrastructure for the data processing platform, including deploying microservices on Kubernetes
  • Auto-scaling and performance monitoring for Kubernetes and running applications, using Prometheus and Grafana or similar tools
  • Very comfortable with SSH, bash and sh, pipes, and common *nix tools and operating systems
  • Implementing security authentication, authorization, and auditing of activities

To Be Successful You Will Be:

  • Highly motivated, with a great attitude and a desire to dive into raw data to understand behavioral trends and find insights
  • An excellent multitasker who can execute multiple requests and reports under tight timelines
  • Inquisitive, self-starter, able to work autonomously
  • Able to work in a fast-paced, dynamic, startup-like environment
  • Detail-oriented tactician who strives for perfection
  • Strong verbal and written communication (and listening) skills
  • Excellent reading comprehension and attention to detail.
  • Strong problem-solving skills
  • Strong documentation skills as you code (Jira, Confluence)

As a Data Infrastructure Engineer, your day-to-day tasks will include:

  • Helping us leverage large-scale data stores and data infrastructure by building out data pipelines, streams, and utilities in Spark and other technologies (Flink, Storm) to provision and serve data to ML/AI models and cloud data warehouse systems
  • Developing robust, low-latency, fault-tolerant pipelines to support business-critical systems and feedback loops
  • Hardening and measuring fault tolerance and robustness of our data pipelines
  • Working with on-premises and cloud technologies to build and deploy your applications

Environment

Can work effectively on a small, nimble team, with no trouble context-switching

Education

B.S./M.S. in Computer Science or Computer Engineering or 3+ years of equivalent experience


Location

10990 Wilshire Blvd, Suite 1800, Westwood, CA 90024
