Xero

Site Reliability Engineer - Chaos Engineering

Posted 14 Days Ago
Hybrid
San Mateo, CA
Mid level
The role involves enhancing system reliability through Chaos Engineering practices, automating processes, and integrating tools for better data insights.
The summary above was generated by AI

Our Purpose 

At Xero, we’re here to help you supercharge your business. We do this by automating routine tasks, surfacing actionable insights and connecting businesses with the right data, advisors and apps. When that happens, we’re not only making life better for small business, we’re building a stronger economy that can change the world.


Xero is a beautiful, easy-to-use platform that helps small businesses and their accounting and bookkeeping advisors grow and thrive. At Xero, our purpose is to make life better for people in small business, their advisors, and communities around the world. This purpose sits at the very center of everything we do. We support our people to do the best work of their lives so they can help small businesses succeed through better tools, information and connections. Because when they succeed they make a difference, and when millions of small businesses are making a difference, the world is a more beautiful place.

What You Will Do:

  • The Xero Chaos Engineering Team is part of the Site Reliability Engineering organization and is responsible for constantly tuning the operational readiness and efficiency of Xero services. The team drives enduring reliability at Xero and focuses on improving system resilience by intentionally introducing controlled disruptions or failures into a system, in both pre-production and production environments. The goal is to identify weaknesses and vulnerabilities before they become outages.

  • The Chaos Engineering Team is responsible for designing and implementing chaos experiments to identify weaknesses in system architecture and improve overall reliability. The role involves collaborating with cross-functional teams to develop strategies that enhance system resilience and ensure optimal performance in production environments.
  • You will design and build a failure mode and chaos engineering environment that allows repeatable, scalable testing within Xero.
  • You will design and execute chaos experiments to simulate various failure scenarios.
  • Develop and maintain chaos engineering frameworks and tools.
  • Collaborate with development and operations teams to implement improvements based on experiment results.
  • Monitor system health and performance metrics to assess the impact of chaos experiments.
  • Educate team members on chaos engineering principles and best practices.
  • Analyze system behavior during experiments and document findings.
  • Continuously improve chaos engineering processes and methodologies.

What You'll Bring With You:

  • Proficient in programming languages such as Python, Go, Java, C#, C++, and .NET for automation and tool development.
  • Experienced in using chaos engineering tools like Gremlin, Chaos Monkey or Litmus.
  • Excellent analytical skills to assess system performance and identify weaknesses.
  • Effective communication skills to collaborate with cross-functional teams and convey complex concepts.
  • Leadership abilities to drive chaos engineering initiatives and foster a culture of resilience.
  • Knowledge of cloud platforms (e.g., AWS, Azure, GCP) and container orchestration (e.g., Kubernetes).
  • Familiarity with monitoring and observability tools to track system health and performance metrics.

Why Xero?

  • We offer very generous paid leave to use however you’d like (plus statutory holidays!), dedicated paid leave to care for your physical and mental wellbeing, an Employee Assistance Program to access mental health care for you and your family, health insurance, life insurance, and income protection, wellbeing and sports programmes, employee resource groups, 26 weeks of paid parental leave for primary caregivers, an Employee Share Plan, beautiful offices, flexible working, career development, and many other benefits that reflect our human value. You’ll do the best work of your life at Xero.

Diversity of people brings diversity of thought, and we like that. Our human-first culture of respect, fairness, and inclusion is what helps Xeros thrive at work and beyond. We offer very generous paid leave to use however you’d like (plus statutory holidays!), dedicated paid leave to care for your physical and mental wellbeing, an Employee Assistance Program to access mental health care for you and your family, employee resource groups, wellbeing programming and allowances, medical, dental, vision, and disability insurance, fertility and family forming financial support, 401k contribution matching, 26 weeks of paid parental leave for primary caregivers, an Employee Share Plan, beautiful offices with snacks and break areas, flexible working, career development, and many other benefits that reflect our human value. You’ll do the best work of your life at Xero.


Research has shown that women and underrepresented groups are less likely to apply to jobs unless they meet every single competency or experience. If you are excited about this role, but your past experience doesn't align perfectly, we encourage you to apply anyway. You could be just the right person for this role and Xero. If you have any support or access requirements, we encourage you to advise us at time of application and throughout the interview process.

Top Skills

Chaos Engineering
Site Reliability Engineering
