CVS Health

Principal Architect - Cloud and Observability

In-Office
Village of Homewood, IL
144K-288K Annually
Senior level
The Principal Architect will lead observability and hybrid cloud architecture, setting standards, reference designs and telemetry pipelines across multiple environments. Responsibilities include building reference architectures, guiding teams on cloud infrastructure, and fostering observability practices using various tools and frameworks.
The summary above was generated by AI

We’re building a world of health around every individual, shaping a more connected, convenient and compassionate health experience. At CVS Health®, you’ll be surrounded by passionate colleagues who care deeply, innovate with purpose, hold themselves accountable and prioritize safety and quality in everything they do. Join us and be part of something bigger: helping to simplify health care one person, one family and one community at a time.

Position Summary

We're hiring a Principal Architect to take ownership of how we do observability and hybrid cloud at CVS Health. This person will sit within our Enterprise Architecture organization and be responsible for the architecture, standards, and technical direction behind our observability platforms and our multi-cloud infrastructure posture.

We run workloads across on-prem private cloud (OpenShift, KVM, Dell PowerFlex), Azure, AWS, and GCP. We need someone who can build and maintain the reference architectures, telemetry standards, and instrumentation patterns that let our engineering teams monitor all of that consistently. We've committed to an OpenTelemetry-first approach and use the Grafana stack (Mimir, Loki, Tempo) as our primary backends, but we also operate Datadog, Splunk, and Dynatrace in various parts of the org.

On the cloud side, there is real work to do around workload identity, runtime selection, autoscaling guidance, and FinOps. Teams are asking for concrete standards they can follow.

This is a hands-on role. You'll write architecture docs, build proof-of-concepts, configure OTel pipelines, and present to leadership.

*This position can be performed remotely from anywhere in the continental USA.

Responsibilities

Observability

  • Own the enterprise observability reference architecture covering metrics, logs, traces, and events across all environments (cloud and on-prem).
  • Drive the OpenTelemetry-first instrumentation strategy -- standard libraries, semantic conventions, collector topologies (DaemonSet, gateway, sidecar), and pipeline design.
  • Build and operate telemetry pipelines on Grafana Mimir, Loki, and Tempo, including multi-tenant configurations, retention policies, and capacity planning.
  • Define how we measure reliability: SLOs, SLIs, error budgets, and alerting frameworks -- consistently across all lines of business.
  • Own the integration between observability tooling and incident management (ServiceNow ITOM, xMatters).
  • Drive telemetry schema standards to ensure teams emit data that is useful downstream, not just technically compliant.

Hybrid Multi-Cloud

  • Build and maintain reference architectures for our hybrid footprint: OpenShift on-prem with KVM/libvirt and Dell PowerFlex storage, plus Azure, AWS, and GCP.
  • Lead standards work around workload identity and federation using SPIFFE/SPIRE and cloud-native IAM patterns to move away from static secrets.
  • Provide guidance on compute runtime selection -- containers vs. VMs vs. bare metal vs. serverless -- with a clear decision framework for teams.
  • Help teams connect autoscaling and capacity planning behavior to actual telemetry signals.
  • Push FinOps maturity forward by integrating cost data into the observability stack, establishing unit economics, and working toward open billing standards like FOCUS.

AI + Observability

  • Identify where AI/ML adds practical value in our observability stack -- anomaly detection, root cause analysis, log clustering, and smarter alerting.
  • Define observability standards for AI-powered systems (agents, RAG pipelines) -- covering latency, token costs, model drift, and related signals.
  • Ensure new AI-powered platforms are instrumented correctly from day one.

Architecture Community

  • Participate in cross-functional architecture working groups focused on observability and hybrid cloud standards.
  • Publish architecture decision records and reference implementations that teams can actually use.
  • Mentor architects and platform engineers; conduct architecture reviews to raise the bar across the org.
  • Work with security and compliance on HIPAA, SOX, and PCI requirements as they apply to telemetry and cloud infrastructure.
  • Represent CVS Health in vendor evaluations and stay connected to the open-source ecosystem (CNCF, OpenTelemetry, Grafana Labs).

Required Qualifications

  • 10+ years in infrastructure, cloud architecture, platform engineering, or SRE
  • 8+ years of architecture work in observability, cloud infrastructure, or both at a large enterprise
  • Solid experience with at least two of Azure, AWS, or GCP -- including networking, identity, compute, and storage
  • 5+ years with Kubernetes in production (OpenShift, EKS, AKS, or GKE)
  • 5+ years with OpenTelemetry or similar frameworks (collectors, SDKs, semantic conventions, pipeline design)
  • 5+ years with observability platforms: Grafana/Mimir/Loki/Tempo, Prometheus, Datadog, Splunk, Dynatrace, or comparable tools
  • Experience defining SLOs/SLIs and building alerting strategies at an organizational level
  • Proven track record writing architecture standards that other teams adopted and followed
  • Able to communicate clearly with both engineers and senior leadership

Preferred Qualifications

  • On-prem / private cloud experience (OpenShift Virtualization, KVM/libvirt, VMware, Dell PowerFlex or similar storage)
  • Workload identity (SPIFFE/SPIRE) and zero-trust networking
  • Infrastructure-as-code (Terraform, Pulumi, Helm, ArgoCD)
  • Streaming platforms such as Kafka or Confluent, especially in telemetry pipeline contexts
  • AIOps or ML-based anomaly detection experience
  • FinOps background -- cloud cost optimization, chargeback, unit economics
  • Service mesh (Istio, Envoy, Linkerd) or eBPF-based tools (Cilium, Pixie)
  • Involvement in open-source communities (CNCF, OpenTelemetry, etc.)
  • Healthcare, insurance, or financial services experience (HIPAA/SOX familiarity)
  • Cloud certifications are a plus but not required

Education

Bachelor's degree in Computer Science, Engineering, or a related field. Equivalent work experience accepted.

Pay Range

The typical pay range for this role is:

$144,200.00 - $288,400.00


This pay range represents the base hourly rate or base annual full-time salary for all positions in the job grade within which this position falls. The actual base salary offer will depend on a variety of factors including experience, education, geography and other relevant factors. This position is eligible for a CVS Health bonus, commission or short-term incentive program in addition to the base pay range listed above. This position also includes an award target in the company’s equity award program.

Our people fuel our future. Our teams reflect the customers, patients, members and communities we serve, and we are committed to fostering a workplace where every colleague feels valued and feels that they belong.

Great benefits for great people

We take pride in offering a comprehensive and competitive mix of pay and benefits that reflects our commitment to our colleagues and their families.

This full‑time position is eligible for a comprehensive benefits package designed to support the physical, emotional, and financial well‑being of colleagues and their families. The benefits for this position include medical, dental, and vision coverage, paid time off, retirement savings options, wellness programs, and other resources, based on eligibility.


Additional details about available benefits are provided during the application process and on Benefits Moments.

We anticipate the application window for this opening will close on: 06/29/2026

Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state and local laws.
