Float.com

Site Reliability Engineer

Posted 7 Days Ago
Remote
Hiring Remotely in United States
133K-133K
Mid level
The Site Reliability Engineer at Float will manage Kubernetes infrastructure, optimize services, develop incident response playbooks, and support data layers to ensure reliability and efficiency for customers.
The summary above was generated by AI
Description
Who We Are

Float is the leading resource management software for professional services teams. Since 2012, we’ve grown every year—independently, self-funded, and profitably. We’re rated #1 for resource management on G2 and trusted by 4,500+ customers worldwide.

As a certified B Corporation, we’re committed to making a positive impact on our team, customers, the environment, and the remote community. Our 50+ person team works 100% remotely across the globe, with designed to support us in living our Best Work Life. You'll collaborate with teammates across Australia, Mexico, the UK, Nigeria, Canada, and the US. Learn more about our data security practices for employment or service contracts. Browse to get a glimpse of life at Float and check out our . See why our customers love Float.

We’re on a scale-up journey, and we’re seeking people who thrive in this stage. We want Float to be the place where you have the autonomy and opportunity to do the best work of your career.

Why We’re Hiring For This Role

Float’s infrastructure has grown rapidly, meaning more customers, more complex systems, and more opportunities to build for scale. As the scale of our systems increases, we’re growing our SRE team to match. You’ll be the third site reliability engineer, and will be working alongside our QA team. This role is about stepping into a high-impact space: helping us automate smarter, improve visibility across engineering, and ensure reliability as we scale. You’ll join a team that’s laying the groundwork for stronger SLAs and an even better experience for our customers.

This role will report to Chris, our Team Lead for SRE & QA. Check out this video where he explains the important role you will play within our SRE team.

You’ll be working asynchronously with a bright, dedicated team from across the globe, with a strong focus on taking complex problems and creating solutions that feel simple and intuitive for our customers.

What You’ll Be Responsible For

Early on, you’ll jump right into:

  • Upgrade paths: Maintain and validate the processes that keep our Kubernetes infrastructure up-to-date, ensuring upgrades happen smoothly, safely, and regularly.
  • Service hygiene: Remove noisy, unused, or misfiring boot alerts and improve the team's ability to trust alerts as meaningful signals.
  • Service integration: Partner with engineers to configure services within our clusters and support service migrations where possible.
  • Kubernetes optimisation: Review and optimise usage across Kubernetes services, including right-sizing scale node specifications.

Once you are a bit more settled, we expect that you will jump into the following projects:

  • Service mesh & ingress security: Lead our exploration and implementation of service mesh options and harden ingress layers to defend against spam and abuse.
  • Incident response playbooks: Define and roll out standardised playbooks to improve clarity and speed during production incidents.
  • CDC layer support: Build deep familiarity with our next-gen data layer (CDC) to support new teams building on top of it.
  • SLO coaching & support: Help teams define, measure, and meet reliability goals, enabling engineering to own quality into production and drive better outcomes for customers.

What You’ll Need To Be Successful

We want you to love your work, and we believe the following skills will help you succeed in this role:

  • Bash + programming language: Confident writing scripts in Bash and proficient in at least one go-to language (ideally PHP, NodeJS, or Python).
  • Kubernetes: Strong production experience managing and optimising Kubernetes clusters.
  • Terraform: Solid understanding of infrastructure as code using Terraform.
  • GCP: Familiarity with Google Cloud Platform, or eagerness to get up to speed quickly.
  • Iteration mindset: You believe in shipping value early and improving over time, not chasing one-shot perfection.
  • Written communication: You write clearly and concisely, whether it's documenting infrastructure, proposing changes, or sharing learnings across teams.
  • Timezone Preference: We’re ideally looking for someone based between UTC -5 and UTC +3 so there’s good overlap with the rest of the team for hands-on support.

Our details the key competencies and expectations needed for this role. Take a look at the Level 2 column to learn more about what you’ll need to be successful in the role, in addition to the technical skills outlined above.

As a fully remote team, we’re looking for someone comfortable with asynchronous communication as the default, which means you have previous remote experience and are comfortable using tools like Slack, Loom, and Linear to communicate as needed. Don’t worry—you will have significant deep work time since we have .

Why Join Us

Pay for this role is US $133,000 (Level 2). Here’s a with more information on how we determine our salaries.

We’re a global with a diverse team of people from all over the world who share a common belief in living our . We believe deeply in the idea of transparency and share our publicly so potential new team members can see first hand our as well as our . If you feel like you can thrive at Float to do your best work, we would love to hear from you.

Hiring Process For This Role

You’ll find a lot of useful information about our and what it’s like to join our global team on the . By the way, we made a blog post on - we highly recommend you check it out prior to applying!

The hiring process for this role looks like this:

Initial First Meet (20 min): You'll meet with Julia, our Talent Manager, to discuss your interest in the role and review your questions about working at Float.

Manager Interview (45 min): You’ll meet with Chris, our SRE Team Lead, to discuss how your background and experience make you a great fit for this role.

Co-Worker Interview (30 min): You’ll meet with Bogdan, our Site Reliability Engineer, to dive deeper into your goals and to learn more about your alignment with our values and ways of working.

Take-home assignment (2 hours, paid): You’ll complete a take-home technical assignment that the hiring team will review. You will be paid an honorarium after completion of your take-home assignment, and will receive feedback on your assignment regardless of the outcome.

Founder Interview (30 min): You’ll meet with Lars, our CTO and Co-Founder, so he can get to know you and see whether you have the potential to be a great addition to the team.

Note: Industry research shows that women and those in traditionally underrepresented groups generally don’t apply to jobs unless they check all the boxes for the role. If you feel strongly that you have what it takes for this role but don’t check 100% of the boxes—that’s okay—we encourage you to apply anyway and highlight what you can bring to the table.

Top Skills

Bash
Google Cloud Platform
Kubernetes
Node.js
PHP
Python
Terraform
