Company Description
Zayo provides mission-critical bandwidth to the world’s most impactful companies, fueling the innovations that are transforming our society. Zayo’s 141,000-mile network in North America and Europe includes extensive metro connectivity to thousands of buildings and data centers. Zayo’s communications infrastructure solutions include dark fiber, private data networks, wavelengths, Ethernet, and dedicated Internet access. Zayo serves wireless and wireline carriers, media, tech, content, finance, healthcare and other large enterprises.
Do you dream in highly scalable systems, thrive in fast-paced environments, and enjoy tackling complex technical challenges? Are you passionate about diving into the details and building the most accurate and durable network observability systems? If so, join our team as a Principal Site Reliability Engineer, Network Observability!
We're looking for a talented Principal Site Reliability Engineer, Network Observability to play a critical role in ensuring the uptime, performance, and scalability of our network with a focus on our network observability systems.
Responsibilities:
Automation: Work with the NOC and software engineering teams to discover processes around network observability that can be automated, and then create a technical plan to implement both the technical and process changes.
Monitoring and Alerting: Work with the network observability team to design and implement effective monitoring and alerting to proactively identify and address issues.
Incident Management: Own the incident lifecycle, from leading root cause analysis and resolution to implementing preventative measures to avoid future occurrences. Focus on chronic and big-picture issues that may have complex resolutions spanning departments, processes, and technical elements.
Reliability Engineering: Proactively identify and mitigate potential system risks, focusing on automation, monitoring, and tooling to ensure high service availability.
Scalability and Performance: Design and implement solutions to ensure our infrastructure can handle ever-growing demands while maintaining optimal application performance and providing the best possible detail on service degradation and outages to the NOC. Maintain a laser focus on reducing the mean time it takes for the NOC to correctly diagnose issues, and on automating troubleshooting and information collection.
Collaboration: Work closely with developers, product managers, and engineers to translate business needs into robust and reliable technical solutions. Become the beacon for best practices and efficient processes throughout the organization.
Qualifications:
Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
Minimum of twelve (12) years of experience in a Senior Network Engineer, Senior Site Reliability Engineer or related role.
Strong understanding of system administration, Linux, and proficiency in scripting languages (Python and various shells).
Previous experience working both in a NOC and in an upper-level network engineering role.
Exceptionally strong working knowledge of networking concepts and application protocols, especially TCP/IP, BGP, DNS, TLS, and HTTP/S and network services.
Expert at developing automation tools for monitoring, alerting, and deployment to ensure efficient and reliable operations.
Expert at designing and implementing monitoring systems at scale.
Experience with various monitoring platforms such as SevOne, Assure1, Prometheus, and Nagios, and various vendor EMS/NMS systems.
Previous work in large-scale distributed production environments.
Experience with a variety of cloud platforms and tools (AWS, Google, etc.)
Experience with a variety of monitoring and alerting tools (Grafana, Cacti, etc.)
Proven leadership skills, with the ability to mentor and inspire others.
Excellent problem-solving, analytical, and critical thinking skills.
A passion for automation and building efficient systems.
Expert experience working in a highly automated environment.
Preferred Experience:
Experience working with various vendor APIs (or netconf) including Nokia, Juniper, Fujitsu, Infinera, Cisco, and Ciena.
Experience with various network orchestration platforms such as Ciena Blue Planet MDSO, Cisco NSO, Nokia NSP, or others.
Experience automating network troubleshooting.
Estimated Base Salary Range: $114,900 - $164,200 USD annually.
The base pay range shown is a guideline and reasonable estimate for this role. It takes into account the wide variety of factors that are considered in making compensation decisions. Actual compensation offered may vary from the posted range based upon geographic location, work experience, skill level, certifications, and other business and organizational needs. Non-sales roles may be eligible to participate in a discretionary annual incentive plan. Sales roles may be eligible to participate in a sales incentive plan.
Additionally, this position may be eligible for certain benefits, such as health insurance, life insurance, disability, retirement plans, and paid time off.
The posting will be active for a minimum of 3 days. The active posting will continue to extend by 3 days until the position is filled.
Benefits, Rewards & Wellness
Excellent Health, Dental & Vision Insurance
Retirement 401(k) Savings Plan
Generous paid time off policy including paid parental leave
Zayo provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, provincial or local laws.
This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation and training.