Develop and optimize HPC clusters for on-premises and cloud environments, manage performance, automate processes, and support space missions.
The Aerospace Corporation is the trusted partner to the nation's space programs, solving the hardest problems and providing unmatched technical expertise. As the operator of a federally funded research and development center (FFRDC), we are broadly engaged across all aspects of space, delivering innovative solutions that span satellite, launch, ground, and cyber systems for defense, civil, and commercial customers. When you join our team, you'll be part of a special collection of problem solvers, thought leaders, and innovators. Join us and take your place in space.
Job Summary
The Aerospace Corporation is seeking a talented and motivated High-Performance Computing (HPC) Engineer (Site Reliability Engineer Staff III/IV) to join our Computational Services team. In this role, you will be responsible for developing, implementing, and optimizing HPC clusters that support both on-premises and cloud environments. You will work alongside rocket scientists and engineers, tackling complex space enterprise challenges and contributing directly to the success of critical national space assets. We value a collaborative, proactive mindset and a shared commitment to engineering excellence.
Work Model
This is a full-time position located in either Chantilly, VA, or El Segundo, CA, with an expectation of 100% onsite work.
What You'll Be Doing
- Design and implement HPC solutions that optimize resource utilization across diverse workloads in both classified and unclassified settings.
- Manage a 10,000-core classified cluster and a 3,000-core unclassified cluster to ensure peak performance.
- Deliver high-quality HPC infrastructure design, automated provisioning, and system configuration.
- Develop and deploy automation solutions using tools such as Ansible or Puppet.
- Optimize AI workloads and GPU computing performance.
- Monitor, analyze, and tune HPC system performance, utilization, and resource allocation to maintain operational efficiency.
- Work closely with scientists and engineers to support new and ongoing projects and the mission technical analysis behind national space assets.
- Develop cost-efficient on-premises and cloud HPC service offerings that align with mission and business objectives.
- Implement and enforce security best practices that comply with government regulations across both classified and unclassified environments.
- Harden Linux systems to meet stringent security requirements.
What You Need to be Successful
Minimum Requirements for the Site Reliability Engineer Staff III:
- Bachelor's degree in Computer Science, Engineering, or equivalent experience.
- Minimum of 5 years' experience in Linux system administration within an enterprise HPC environment.
- In-depth knowledge of Linux, networking, and HPC systems.
- Proven experience in managing the Slurm scheduler and setting up HPC systems for both interactive and batch workloads.
- Proficiency in scripting and competence with automation tools such as Ansible or Puppet.
- Experience hardening Linux systems to meet security requirements.
- Experience with hardware and infrastructure automation in environments using server vendors such as HPE or Cisco.
- Strong communication skills, with an ability to work both independently and as part of a geographically distributed team.
- CompTIA Security+ CE certification or equivalent that meets DoD 8570.01-M requirements for IAT Level II personnel.
- Active TS/SCI clearance. U.S. citizenship is required to obtain a security clearance.
In addition to the above, the minimum requirements for the Site Reliability Engineer Staff IV include:
- 7+ years of experience in an enterprise HPC environment.
- Experience performing in-place upgrades of Slurm.
- Skill in provisioning and supporting AI and NVIDIA GPU technologies, with expertise in GPU integration, resource allocation, and scheduling using Slurm.
- Hands-on background with cloud HPC services.
- Experience supporting a wide range of technical software (compilers, modeling and simulation tools, languages, COTS, GOTS), including the development of environment modules.
- Demonstrated ability to lead cross-functional teams and mentor junior engineers.
How You Can Stand Out
It would be impressive if you have one or more of these:
- Experience with AWS Parallel Computing Service (AWS PCS) or AWS ParallelCluster.
- Knowledge of NVLink or NVSwitch for optimizing GPU workflows.
- Familiarity with Prometheus and Grafana for monitoring and performance visualization.
- Expertise in optimizing and customizing Slurm partitions (queues) to balance utilization and reduce job wait times.
- Experience provisioning or supporting the Slurm REST API.
- Background in containerization within an HPC context, used for data processing and technical analysis.
- Proficiency with automation tools such as Ansible or Puppet for HPC.
- Experience developing solutions that optimize data storage.
- Experience managing parallel file systems such as Lustre.
- An active IC TS/SCI clearance with CI Polygraph.
We offer a competitive compensation package where you'll be rewarded based on your performance and recognized for the value you bring to our business. The grade-based pay range for this job is listed below. Individual salaries within that range are determined through a wide variety of factors including but not limited to education, experience, knowledge and skills.
(Min - Max)
$135,200 - $220,000
Pay Basis: Annual
Leadership Competencies
Our leadership philosophy is simple: every employee, regardless of level and role, can demonstrate leadership. At Aerospace, our commitment is our people. To cultivate our talent and ensure that we have a strong pipeline of future leaders, we want individuals who:
- Operate Strategically
- Lead Change
- Engage with Impact
- Foster Innovation
- Deliver Results
Ways We Reward Our Employees
During your interview process, our team will provide details of our industry-leading benefits.
Benefits vary and are applicable based on Job Type. A few highlights include:
- Comprehensive health care and wellness plans
- Paid holidays, sick time, and vacation
- Standard and alternate work schedules, including telework options
- 401(k) Plan - Employees receive a total company-paid benefit of 8%, 10%, or 12% of eligible compensation based on years of service and matching contributions; employees are immediately eligible and vested in the plan upon hire
- Flexible spending accounts
- Variable pay program for exceptional contributions
- Relocation assistance
- Professional growth and development programs to help advance your career
- Education assistance programs
- An inclusive work environment built on teamwork, flexibility, and respect
We are all unique, from various backgrounds and all walks of life, yet one thing bonds all of us to each other: the belief that we can make a difference. This core belief empowers us to do our best work at The Aerospace Corporation.
Equal Opportunity Commitment
The Aerospace Corporation is an equal opportunity employer. All qualified applicants will receive consideration for employment and will not be discriminated against on the basis of race, age, sex (including pregnancy, childbirth, and related medical conditions), sexual orientation, gender, gender identity or expression, color, religion, genetic information, marital status, ancestry, national origin, protected veteran status, physical disability, medical condition, mental disability, or disability status and any other characteristic protected by state or federal law. If you're an individual with a disability or a disabled veteran who needs assistance using our online job search and application tools or needs reasonable accommodation to complete the job application process, please contact us by phone at 310.336.5432 or by email at [email protected]. You can also review Know Your Rights: Workplace Discrimination is Illegal.
Top Skills
AI
Ansible
AWS
GPU
Grafana
HPC
Linux
Lustre
NVLink
NVSwitch
Prometheus
Puppet
Slurm
The Aerospace Corporation El Segundo, California, USA Office
2310 E. El Segundo Blvd., El Segundo, CA, United States, 90245
The Aerospace Corporation Pasadena, California, USA Office
200 S. Los Robles Ave, Pasadena, CA, United States, 91101