Site Reliability Engineer

Honey

Sorry, this job was removed at 2:32 p.m. (PST) on Thursday, February 20, 2020

Honey is a fast-growing startup based in Los Angeles. Our online shopping platform offers users a smarter way to shop. We open up instant access to exclusive savings, deals, rewards and discovery, all powered by the collective knowledge of Honey's community of online shoppers. We are helping millions save when they shop online, and we're hiring! We are actively seeking a Senior Site Reliability Engineer to join our Los Angeles team.

If you aren't already in the Los Angeles area, don't fret - we'll move you here!

As a member of our Site Reliability team, you’ll recommend and implement changes across our systems and environments, evaluate new technologies, and contribute to our technical direction. We primarily use Google Cloud Platform, Terraform, Python, Node.js, and CircleCI and have a microservice-based architecture using Docker and running on Kubernetes. We value individuals who are curious, collaborative, able to communicate effectively, and passionate about open-source software and new system architecture trends.

We’re looking for a Senior Site Reliability Engineer to design and implement infrastructure solutions that improve the scalability and efficiency of Honey’s services. The ideal candidate has a background in systems and/or software engineering, automation, cloud computing, and build tooling, as well as strong problem-solving abilities.

About You:

Collaborative, curious, and able to communicate effectively
Experience leading teams and/or mentoring team members
Strong experience with architecture, ideally in cloud-native environments
Production experience with major public cloud providers (we use GCP, but experience with AWS or Azure is great)
Experience managing and resolving production incidents
Experience with containers and container orchestration (Docker, Kubernetes)
Expertise in monitoring and metrics (Datadog, Prometheus, New Relic)
Familiarity with IaC and infrastructure automation (Terraform, Chef, Puppet, Ansible)
Comfort with databases and in-memory key/value stores (MySQL, Postgres, Redis, MongoDB)
Solid knowledge of Linux/UNIX and networking fundamentals

In this role you’ll:

Maintain the core infrastructure
Manage, monitor, and improve highly scalable, distributed systems to create highly available services
Collaborate with engineers in the deployment and scaling of new product features
Investigate production incidents, help determine contributing factors, and implement fixes
Identify and automate repetitive, manual tasks
Develop effective tooling, alerts, and responses to both identify and address reliability risks
Debug software at the code and infrastructure level
Plan for the growth of Honey’s infrastructure and help define best practices
Participate in an on-call rotation

Bonus Points For:

Experience with chaos engineering and related disciplines
Experience with Golang
Previous experience with GCP
Experience with service discovery or service meshes

At Honey, we are committed to building a diverse and inclusive company. We seek to create a culture where everyone can belong because we believe that people do their best work when they can show up every day as their authentic selves. We welcome people of different backgrounds, experiences, abilities, and perspectives.

Honey is an equal opportunity employer. We do not make hiring or employment decisions on the basis of race, religion, color, national origin, gender, gender identity, sexual orientation, age, marital status, veteran status, disability status or genetic information, in compliance with applicable federal, state and local law.
