The Site Reliability Engineer will ensure service quality and availability, manage AWS infrastructure, implement SRE best practices, and collaborate with teams.
Are you motivated by an incredible sense of purpose in doing work that helps keep people safe? Are you passionate about innovating on cutting-edge technology to develop robust architecture principles, operability guidelines, and progressive scaling methodologies, and to implement other sophisticated techniques for reliably operating infrastructure at scale? Do you have an appetite for securing systems, streamlining efficiency, automating away toil, and proactively eliminating problems before they occur? If so, this position is a perfect opportunity for you to join the Everbridge Federal Platform team.
As part of the Everbridge Federal Platform team, you will play a critical role in ensuring the overall service quality and availability of Everbridge's solutions. This includes designing, deploying, and managing services at scale, evangelizing SRE best practices, and helping to push the boundaries of the latest technology. The platforms that you will support are critical to the delivery of time-sensitive information to help keep people safe and businesses running. We are dedicated, passionate people who are committed to customer service and doing the right thing.
What You'll Do:
- Keep people safe and businesses running.
- Be an integral member of the team implementing our platform in a DoD IL4 cloud environment.
- Maintain infrastructure from conception to completion within AWS, including services such as VPCs, EC2, Transit Gateways, IAM roles and policies, Route53, S3, SGs, and NACLs.
- Build upon the operational availability, security, scalability, efficiency, monitoring, instrumentation, and overall service reliability of Everbridge's solutions.
- Collaborate across Agile teams with Architects, Developers, Quality, Data, Security, and other engineers on designing and implementing highly reliable solutions.
- Research and implement SRE best practices through automation, cross-functional collaboration, and data-driven decisions to reinforce the integrity and reliability of our systems.
- Participate in an on-call rotation to resolve production escalations.
What You'll Bring:
- 2+ years of technical AWS experience, managing and owning systems in a production environment
- 1+ years of Kubernetes experience (EKS, AKS, GKE, Self-managed)
- 2+ years of Terraform or similar IaC experience
- 2+ years of experience with MongoDB or ElasticSearch/ELK administration
- 2+ years of experience with application development or writing automation in Java
- Experience with the following tooling: GitLab CI/CD, Packer, Docker, EKS, Kubernetes, Spinnaker, Helm, Argo, Jenkins
- Experience with telemetry tools such as Datadog, Sumo Logic, Grafana, Prometheus
- Experience with configuration management tools such as Salt, Ansible, AWS user_data
- Experience with a DevOps/SRE production environment
- Experience with Agile practices
- UNIX/Linux experience
- Experience working on DoD programs
- Currently hold a Secret Clearance or be a US citizen with the ability to obtain a Secret Clearance
- Must have or be able to obtain and maintain DoD 8140 “Intermediate” level or higher certification (formerly DoD 8570 IAM Level II)
The reasonably estimated salary for this role at Everbridge ranges from $84,400 to $112,500 and may also include variable compensation. Actual compensation is based on factors such as the candidate's skills, qualifications, and experience. In addition, Everbridge offers a wide range of best-in-class, comprehensive, and inclusive employee benefits for this role, including healthcare, dental, parental planning, and mental health benefits, disability income benefits, life and AD&D insurance, a 401(k) plan and match, paid time off, and fitness reimbursements.
Top Skills
Ansible
Argo
AWS
Datadog
Docker
Elasticsearch
GitLab CI/CD
Grafana
Helm
Java
Jenkins
Kubernetes
Linux
MongoDB
Packer
Prometheus
Salt
Spinnaker
Sumo Logic
Terraform
Unix
Everbridge Pasadena, California, USA Office
155 N Lake Ave, Pasadena, CA, United States, 91101