Job Description
This is a high-visibility position within engineering that serves as system administrator for the current High-Performance Computing (HPC) systems while also playing a key role in defining the path forward for HPC at Virgin Galactic. You will work with internal users across several functional areas to maximize the performance of the current systems and remove roadblocks to access and utilization. You will provide strategic insight into the path forward for HPC, including data integrity planning and digital thread integration. The role will also require you to work with external vendors to maintain the existing infrastructure and to lead expansion and upgrade work. The right candidate will balance technically advanced approaches with bootstrapped innovation to keep solutions cost-effective.
Responsibilities
This role is both systems-facing and user-facing. In it, you will use your in-depth knowledge of Linux, your cluster administration experience, and your passion for supporting ground-breaking engineering work daily. Your role is crucial in designing, implementing, and maintaining our advanced computing infrastructure.
* HPC Infrastructure Maintenance: Manage the day-to-day system administration of Linux-based cluster computing and storage environments, and associated network infrastructure, in alignment with applicable company, regulatory agency, and/or contractual security and privacy requirements.
* Ensure users have the environment, tools, compilers, and any additional resources needed to deploy applications across the clusters, including open source, proprietary, and in-house developed codes.
* Slurm: Manage all aspects of Slurm for efficient resource allocation and job scheduling across the clusters. This includes managing job accounting databases and generating utilization reports.
* User Support: Collaborate with colleagues and team members to understand their computing needs, provide technical assistance, and troubleshoot issues related to system performance and job execution. Provide user consultation and training.
* Performance monitoring: Monitor system performance, diagnose bottlenecks, and take necessary actions to improve system performance.
* Documentation: Maintain detailed documentation of system configurations, procedures, and troubleshooting guides to facilitate knowledge sharing and team collaboration. Develop user-facing documentation.
* Planning: Meet regularly with internal and external stakeholders to understand existing challenges, anticipated needs, and opportunities for closer collaboration. Develop a roadmap for system improvements and life cycling, making recommendations to leadership. Create data integrity plans as well as a strategy for data integration into the digital thread.
Required Skills and Experience
* Relevant bachelor’s degree and 10 years of increasingly technical work experience or a combination of education and relevant experience.
* In-depth experience managing multiuser HPC clusters and distributed storage environments.
* Working knowledge of engineering simulation tools such as CFD, FEM, and heat transfer codes that typically run on clusters.
* Independent and proactive working style.
* Demonstrated ability to communicate with a diverse set of stakeholders.
* This position requires in-depth knowledge of and hands-on experience with:
* Linux cluster system administration (Red Hat/CentOS/Rocky)
* Slurm configuration and management
* Active Directory authentication for Linux systems
* SMB file shares between Windows and Linux systems
* BeeGFS configuration and management
* Scripting for system management and task automation
* Networking technologies (InfiniBand, Message Passing Interface)
* Installing and repairing servers and associated cluster hardware
* Problem-solving and troubleshooting
* Experience with stateless node management and provisioning (OpenHPC/Warewulf)
* Experience with the proprietary ACT ClusterVisor tools
* Experience with hybrid on-prem/cloud cluster technologies and containerization in the context of HPC
* Tape backup systems
* Working knowledge of Digital Thread concepts
* Working knowledge of the 3DEXPERIENCE platform
The annual U.S. base salary range for this full-time position is $132,100.00-$201,550.00. The base pay actually offered will vary depending on job-related knowledge, skills, location, and experience and take into account internal equity. Other forms of pay (e.g., bonus or long term incentive) may be provided as part of the compensation package, in addition to a full range of medical, financial, and other benefits, dependent on the position offered.
Who We Are
Virgin Galactic is transforming humanity’s relationship with space. By making it more open and accessible, we are connecting the world to the love, wonder and awe inspired by space travel, helping to create new opportunities for the benefit of life on Earth. Whether it’s supporting cutting-edge research missions for scientists and students, or offering life-changing experiences for the adventurers among us, Virgin Galactic is THE spaceline for Earth. Such an audacious vision requires a team as driven as they are curious – one capable of redefining the boundaries of what’s possible.
Export Requirements
To conform to U.S. Government export regulations, the applicant must be a U.S. Person (either a U.S. citizen, a lawful permanent resident, or a protected individual as defined by 8 U.S.C. 1324b(a)(3)) or be able to obtain the required authorization from either the U.S. Department of State or the U.S. Department of Commerce. The applicant must also not be included in the list of Specially Designated Nationals and Blocked Persons maintained by the Office of Foreign Assets Control.
EEO Statement
Virgin Galactic is an Equal Opportunity Employer; employment with Virgin Galactic is governed on the basis of merit, competence and qualifications and will not be influenced in any manner by race, color, religion, gender, gender identity, national origin/ethnicity, veteran status, disability status, age, sexual orientation, marital status, mental or physical disability or any other legally protected status.
DRUG FREE WORKPLACE
Virgin Galactic is committed to a Drug Free Workplace. All post-offer applicants and active teammates are subject to testing for marijuana, cocaine, opioids, amphetamines, PCP, and alcohol when criteria are met as outlined in our policies. This can include pre-employment, random, reasonable suspicion, and accident-related drug and alcohol testing.
PHOENIX EMPLOYMENT REQUIREMENTS
For individuals seeking employment at our Phoenix-Mesa Gateway Airport facility, employment is contingent upon you obtaining and maintaining a TSA-authorized security badge. This includes initial and annual mandatory background checks that are governed by TSA and conducted by the Phoenix-Mesa Gateway Airport badging office.