Galaxy

VP, Site Reliability Engineer

Posted 25 Days Ago
Remote
Hiring Remotely in USA
Senior level
The VP, Site Reliability Engineer will architect and maintain AWS infrastructure, optimize container workloads, and drive automation and reliability initiatives. Responsibilities include migration from VMs to containers, incident response, and cross-team collaboration.
The summary above was generated by AI

Who We Are:
Galaxy is a global leader in digital assets and data center infrastructure, delivering solutions that accelerate progress in finance and artificial intelligence. We believe that blockchain and digital asset innovation will transform how value moves through the world – and we’re building the products and services to make that future a reality.
 
Our institutional digital assets platform spans trading, investment banking, asset management, staking, self-custody, and tokenization technology. We also invest in and operate cutting-edge data center infrastructure to power AI and high-performance computing, addressing the growing demand for scalable energy and compute in the U.S.
 
We work at the intersection of finance and technology, helping institutions, startups, and developers navigate a digitally native economy. Led by CEO and Founder Michael Novogratz, our team blends deep crypto expertise with institutional experience and a shared commitment to shaping the future of Web3 and AI.
 
Galaxy is headquartered in New York City, with offices across North America, Europe, the Middle East, and Asia.
 
To learn more about our businesses and products, visit www.galaxy.com.

What We Value:

We are a diverse team of free thinkers and fast movers united to help investors and creators energize the global economy. We are looking for individuals who thrive in a culture of builders and overachievers and embrace high performance, transparent feedback, and a mission-first approach. Our culture shapes our way of working and gets us where we want to be.

  • Seek Excellence.
  • Be Selective To Be Effective.
  • Be Highly Aligned, Loosely Coupled.
  • Disagree Transparently.
  • Encourage Independent Decision-Making.
  • Build Dream Teams.

Who You Are: 

You are a Senior SRE specializing in AWS and containerized infrastructure. You thrive working hands-on, tackling migration from legacy VMs to container ecosystems with a focus on EKS, automation, and reliability. 


What You’ll Do: 

Reliability Engineering:  

  • Architect, deploy, and maintain robust, scalable, secure AWS-based infrastructure.
  • Drive adoption and optimization of EKS and Kubernetes for containerized workloads.
  • Support migration initiatives, moving workloads from legacy VMs to containers in AWS.
  • Implement and fine-tune SLOs, SLAs, and error budgets to balance innovation and stability.
  • Collaborate with Security and Engineering teams on best practices for workload reliability.

Automation & Infrastructure as Code: 

  • Build Infrastructure as Code (IaC) with Terraform; maintain compliant, repeatable environments.
  • Enhance CI/CD pipelines for efficient, secure, and reliable cloud delivery.
  • Develop and refine automated solutions for autoscaling, failover, and disaster recovery. 

Observability & Incident Response:

  • Design and implement metrics, logging, and tracing tools (Datadog, OpenTelemetry).
  • Set up robust monitoring and alerting to proactively detect and address failures.
  • Lead incident analysis and post-mortems; drive improvements in operational playbooks. 

AWS & Cloud SME:

  • Serve as a subject matter expert for AWS, EKS, and cloud-native tooling within the SRE team.
  • Optimize AWS resource usage and costs, and apply resiliency best practices.
  • Ensure secure key management and regulatory compliance for decentralized workloads. 

What We’re Looking For:

  • 8+ years in SRE, DevOps, or Infrastructure Engineering (individual contributor capacity preferred).
  • Deep hands-on expertise in AWS, Kubernetes/EKS, and containerization.
  • Extensive IaC experience (Terraform) and cloud-native automation.
  • Proven track record migrating VM-based workloads to containers in AWS at scale.
  • Strong experience with observability stacks (Datadog, Prometheus, Grafana, OpenTelemetry).
  • Excellent analytical, problem-solving, and incident management abilities.
  • Clear communicator who thrives in team environments, collaborating cross-functionally. 

Bonus Points: 

  • Experience supporting blockchain infrastructure is a strong plus. 

Galaxy respects diversity and seeks to provide equal employment opportunities to all employees and job applicants for employment without regard to actual or perceived age, race, color, creed, religion, sex or gender (including pregnancy, childbirth, lactation and related medical conditions), gender identity or gender expression (including transgender status), sexual orientation, marital or partnership or caregiver status, ancestry, national origin, citizenship status, disability, military or veteran status, protected medical condition as defined by applicable state or local law, genetic information or predisposing genetic characteristic, or other characteristic protected by applicable federal, state, or local laws and ordinances.

We will endeavor to make a reasonable accommodation to the known limitations of a qualified applicant with a disability unless the accommodation would impose an undue hardship on the operation of our business. If you believe you require such assistance to complete the application process or to participate in an interview, please contact [email protected]

Top Skills

AWS
Datadog
EKS
Kubernetes
OpenTelemetry
Terraform
