Senior Site Reliability Engineer
Honey is a fast-growing startup based in Los Angeles. Our online shopping platform offers users a smarter way to shop. We open up instant access to exclusive savings, deals, rewards, and discovery, all powered by the collective knowledge of Honey's community of online shoppers. We are helping millions save when they shop online, and we're hiring! We are actively seeking a Senior Site Reliability Engineer to join our Los Angeles team.
If you aren't already in the Los Angeles area, don't fret - we'll move you here!
As a member of our Site Reliability team, you’ll recommend and implement changes across our systems and environments, evaluate new technologies, and contribute to our technical direction. We primarily use Google Cloud Platform, Terraform, Python, Node.js, and CircleCI, and we have a microservice-based architecture built on Docker and running on Kubernetes. We value individuals who are curious, collaborative, able to communicate effectively, and passionate about open-source software and new system architecture trends.
We’re looking for a Senior Site Reliability Engineer to design and implement infrastructure solutions that improve the scalability and efficiency of Honey’s services. The ideal candidate has a background in systems and/or software engineering, automation, cloud computing, and build tooling, as well as strong problem-solving abilities.
What we're looking for:
- Collaborative, curious, and able to communicate effectively
- Experience leading teams and/or mentoring team members
- Strong experience with architecture, ideally in cloud-native environments
- Production experience with major public cloud providers (we use GCP, but experience with AWS or Azure is great)
- Experience managing and resolving production incidents
- Experience with containers and container orchestration (Docker, Kubernetes)
- Expertise in monitoring and metrics (Datadog, Prometheus, New Relic)
- Familiarity with infrastructure as code (IaC) and infrastructure automation (Terraform, Chef, Puppet, Ansible)
- Comfort with databases and in-memory key/value stores (MySQL, Postgres, Redis, MongoDB)
- Solid knowledge of Linux/UNIX and networking fundamentals
In this role you’ll:
- Maintain the core infrastructure
- Manage, monitor, and improve highly scalable, distributed systems to create highly available services
- Collaborate with engineers in the deployment and scaling of new product features
- Investigate production incidents, help determine contributing factors, and implement fixes
- Identify and automate repetitive, manual tasks
- Develop effective tooling, alerts, and responses to both identify and address reliability risks
- Debug software at the code and infrastructure level
- Plan for the growth of Honey’s infrastructure and help define best practices
- Participate in an on-call rotation
Bonus Points For:
- Experience with chaos engineering and related disciplines
- Experience with Golang
- Previous experience with GCP
- Experience with service discovery or service meshes
At Honey, we are committed to building a diverse and inclusive company. We seek to create a culture where everyone can belong because we believe that people do their best work when they can show up every day as their authentic selves. We welcome people of different backgrounds, experiences, abilities, and perspectives.
Honey is an equal opportunity employer. We do not make hiring or employment decisions on the basis of race, religion, color, national origin, gender, gender identity, sexual orientation, age, marital status, veteran status, disability status or genetic information, in compliance with applicable federal, state and local law.