SMASH

Platform/DevOps Engineer P-136

Posted 5 Days Ago
Remote
Hiring Remotely in United States
Senior level
This role involves managing cloud infrastructure reliability, incident response, cost optimization, and observability while ensuring high availability and compliance readiness.
The summary above was generated by AI

SMASH: Who we are
We believe in long-lasting relationships with our talent. We invest time in getting to know them and understanding what they seek in their next professional step.

We aim to find the perfect match. As agents, we pair our talent with our US clients, matching not only on technical skills but also on cultural fit. Our core competency is finding the right talent fast.

This position is remote within the United States. You must have U.S. citizenship or a valid U.S. work permit to apply for this role.

Role summary
You will own infrastructure reliability, observability, and cost optimization for a production platform serving multiple customers under a 99.5% uptime SLA. This role focuses on building resilient, secure, and cost-efficient cloud infrastructure while leading incident response, monitoring, and compliance readiness initiatives.

Responsibilities

  • Ensure 99.5% uptime SLA across all production services and customer environments.

  • Design and maintain multi-region deployments to support geographic redundancy.

  • Implement automated failover mechanisms for databases, load balancers, and critical services.

  • Build and manage disaster recovery strategies, including automated backups and point-in-time recovery.

  • Lead incident detection, response, and postmortems, meeting defined SLAs for P0 issues.

  • Develop real-time observability dashboards for uptime, latency, error rates, and system health.

  • Monitor application and infrastructure performance metrics across customers.

  • Implement alerting, on-call rotations, escalation policies, and PagerDuty integrations.

  • Manage log aggregation and retention using SIEM platforms such as Splunk or Sumo Logic.

  • Support SOC 2 Type II preparation through security controls, monitoring, and documentation.

  • Implement vulnerability scanning and DLP controls, and coordinate penetration testing.

  • Optimize cloud infrastructure costs through right-sizing, auto-scaling, and storage lifecycle policies.

  • Track and report infrastructure and API costs per customer, driving FinOps best practices.

  • Build automated runbooks and self-healing workflows for common incidents.

Requirements – Must-haves

  • Strong experience as a Site Reliability Engineer, DevOps Engineer, or Platform Engineer.

  • Deep expertise in AWS cloud architecture (ECS, EKS, RDS, Lambda, S3, CloudFront).

  • Proven experience with Infrastructure as Code using Terraform or CloudFormation.

  • Hands-on production experience with Kubernetes and container orchestration.

  • Strong knowledge of observability and monitoring tools (Datadog, New Relic, Prometheus, Grafana).

  • Experience managing on-call rotations, incident response, and post-incident reviews.

  • Solid understanding of security practices including SIEM, vulnerability scanning, and SOC 2 compliance.

  • Demonstrated experience in cloud cost optimization and FinOps practices.

  • Ability to operate independently and prioritize reliability in high-availability environments.

Nice-to-haves (optional)

  • Experience supporting SOC 2 Type II audits.

  • Background working in regulated or compliance-heavy environments (PHI/PII).

  • Experience implementing DLP and document scanning solutions.

  • Familiarity with AI/ML workload cost optimization.

  • Experience supporting SaaS platforms with customer-isolated environments.

Top Skills

AWS
CloudFormation
CloudFront
Datadog
ECS
EKS
Grafana
Kubernetes
Lambda
New Relic
Prometheus
RDS
S3
SIEM
Splunk
Sumo Logic
Terraform
