Cayuse LLC

Senior SRE

Posted 11 Days Ago
Remote
Hiring Remotely in USA
50K-90K
Senior level
The Senior SRE will manage the Global SRE team, enhance platform reliability, oversee incident response, and ensure operational efficiency while collaborating across departments to improve service quality.
The summary above was generated by AI

The exciting world of scientific research is fueled by people with a passion for solving complex problems. At Cayuse, we are committed to our customers’ success by empowering organizations to conduct globally connected research that advances their impact on science, discovery and society. We build on that commitment with proven, integrated and easy-to-use technology that delivers exceptional value, and world class service and support that accelerates outcomes.

But we are more than just an empowering platform powered by advanced technologies. We are a collaboration of exceptional, highly skilled people with multi-disciplinary expertise, and are building our team to support our ambitious growth plans. Cayuse’s foundational strength comes from our customer and employee focused values and commitment to industry-leading solutions. It’s an exciting time to become a key member of our growing team.

The Senior SRE works closely with the Director, Technology to ensure smooth and efficient operation of our cloud environment. We are seeking a dynamic leader with in-depth experience leading a Site Reliability Engineering team. This individual will use production metrics and performance data to provide expert advice on platform designs, standards, requirements implementation, and product releases. The ideal candidate will have a deep understanding of automation practices and SRE principles, including programs that increase visibility into runtime performance and optimize systems to meet all of our SLAs. This individual will also influence future designs and changes to improve our clients' overall satisfaction with our SaaS products. The Senior SRE will assist with integrating the Cayuse strategic plan as it relates to our internal and external cloud environments.


Responsibilities

  • Own Site Reliability Engineering and Administration for product platforms and services
  • Oversee day-to-day responsibilities of Global SRE team
  • Drive accountability for quality aspects in release, system performance, platform availability, operational efficiency, and risk management
  • Develop and lead a strategy to build an SRE practice that scales with an emerging team of engineers
  • Lead, automate, and operate our platforms, keeping them reliable, stable, and scalable as our usage grows and technology transforms
  • Develop and lead incident response process, disaster recovery planning, and production support
  • Define what it means for the system to be available and dictate the availability SLO of the system
  • Review existing processes and recommend changes or institute new processes as needed
  • Work closely with our product and engineering leadership to set standards around patterns, frameworks, technologies, and processes to promote a simple and consistent approach across multiple types of services
  • Provide operational KPI metrics and reports to senior management
  • Lead teams to improve services through rigorous testing and release procedures
  • Form cross-functional partnerships and work closely with developers and product managers to identify productivity, production quality and coverage issues, and provide insight on improving both through scale and tooling
  • Build and support on-prem environments, as needed
  • Collaborate with Compliance & Security teams to implement and maintain controls defined in compliance programs
  • Prepare and present reports of incidents and remediations
  • Assess system data and error logs, along with user reports, to determine areas for improvement or repair
  • Oversee the coordination of security operations during high-risk events
  • Improve service observability and incident response, and maintain operational excellence
  • Establish key objectives and project prioritization for the teams
  • Build appropriate tooling and implement metrics and reports to proactively detect and resolve issues before they impact our customers' experience

Qualifications

  • 5+ years as an SRE.
  • 4+ years working with public cloud technologies (AWS preferred).
  • 4+ years' experience developing monitoring and log analysis tools, and managing and/or influencing infrastructure services, to ensure application service uptime and user experience.
  • Experience implementing and managing security controls and tools.
  • Proven record of team and functional transformation.
  • Quality-first mindset with a strong background in developing products for a global audience at scale.
  • Experience working and/or leading in an Agile environment.
  • Ability to lead across geographically dispersed locations.
  • Deep understanding of Site Reliability Engineering (SRE) philosophy, platforms and tools, SLA management, incident resolution, and automation.
  • Proven strategic and tactical thinker, comfortable translating business strategy into architectures and plans, and equally comfortable rolling up your sleeves to solve problems with your team.
  • Ability to multitask, effectively manage multiple projects and competing priorities, and adeptly re-prioritize based on changing needs.

Benefits

  • Competitive Medical Benefits (PPO + HSA available)
  • Vision, Dental, Short-Term Disability fully covered by Cayuse
  • Unlimited PTO + Holidays + Flexible Work Schedule
  • Remote Work Stipend
  • Equal Paid Parental Leave
  • 401k with Employer Matching
  • Quarterly Wellness Reimbursement
  • Remote Work Environment, supporting the Ultimate Employee Experience 

Cayuse does not accept agency resumes. Please do not forward resumes to our jobs alias or any Cayuse employees. Cayuse is not responsible for any fees related to unsolicited resumes.

Our culture is one of inclusion and belonging where everyone feels respected, treated justly, supported and nourished. We all share responsibility for creating and sustaining a work environment where differences are celebrated and we are empowered to strive for excellence. We’re proud to be an equal opportunity employer and actively seek to recruit, develop, and retain a diverse and talented workforce.

Top Skills

AWS
Log Analysis Tools
Monitoring Tools
Site Reliability Engineering
