Bellevue, Washington, USA
HPC System Engineer

TITLE: HPC System Engineer

LOCATION: Bellevue, WA

TerraPower is a nuclear technology company based in Bellevue, Washington. At its core, the company is working to raise living standards globally through a more affordable, secure and environmentally friendly form of nuclear energy along with innovations in medical isotopes to improve human health. In 2006, TerraPower originated with Bill Gates and a group of like-minded visionaries who evaluated the fundamental challenges to raising living standards around the world. They recognized energy access was crucial to the health and economic well-being of communities and decided that the private sector needed to take action and create energy sources that would advance global energy deployment. TerraPower’s mission is to be a world leader in new nuclear technologies, while developing innovators and future leaders in the nuclear field. As a result, the company’s activities in the fields of nuclear energy and related sciences are yielding significant innovations in the safety and economics of nuclear power, hybrid energy and medical applications – all for significant human health benefits.

TerraPower is seeking highly motivated, forward-thinking professionals who are interested in advanced nuclear reactor research and development, in influencing change within the nuclear power landscape, and in bringing forward the critical production of medical isotopes.

TerraPower is an Equal Opportunity Employer. We do not discriminate in hiring on the basis of sex, gender identity, sexual orientation, race, color, religious creed, national origin, physical or mental disability, protected Veteran status, or any other characteristic protected by federal, state, or local law. In addition, as a federal contractor, TerraPower has instituted an Affirmative Action Plan (AAP) in an effort to proactively recruit, hire, and promote women, minorities, disabled persons and veterans.

HPC System Engineer

The HPC (High-Performance Computing) System Engineer is responsible for partnering with internal customers who use complex advanced physics and engineering applications, with the goal of providing a best-in-class end-user experience through resilient, capable platforms and solutions. The role will supply technical input on infrastructure requirements and provide technical support, vendor management, and training to users of these resources for advanced nuclear. This position requires an experienced candidate who will promote collaboration and cooperation while working with multiple engineering disciplines.

Responsibilities:

•    Maintain the availability of Linux HPC supercomputer systems for customers, including in Azure Government and on-premises infrastructure.

•    Administer and maintain Linux-based system software and firmware revisions, including patches, updates, and OS upgrades.

•    Solve Linux system hardware, software, and third-party software issues, and provide detailed and thoughtful analysis of each problem and its resolution.

•    Automate configuration management of infrastructure and applications, software updates, and maintenance and monitoring of system availability using modern DevOps tools (Ansible, GitHub, etc.).

•    Install, configure, tune, troubleshoot, and administer commercial off-the-shelf (COTS), open-source, and in-house developed applications that use HPC resources.

•    Package, deploy, and manage software using environment modules.

•    Coordinate HPC infrastructure solutions and plan for growth.

•    Proactively inform management of any problems with the equipment and propose resolutions.

•    Partner with IT Principal Engineering to define and execute roadmaps. Assist with gathering data for new feature, system, and/or advanced computing requirements from key stakeholders. Provide timely estimates for implementation delivery. Anticipate risks when planning and define mitigation options.

•    Respond to user queries regarding computing resources.

Key Qualifications and Skills

•    BA, BS, or MS in CS, EE, CE or equivalent experience.

•    5+ years of experience deploying and administering production HPC clusters.

•    Experience managing an HPC resource scheduler (Slurm preferred).

•    Proven track record of scripting in Bash or Python.

•    Experience with MPI software and high-speed interconnects in HPC supercomputers.

•    Experience with containers for HPC (Docker, Singularity, Apptainer).

•    Deep understanding of operating systems, computer networks, and high-performance applications.

•    Ability to work well with developers & test engineers.

•    Proficiency in a programming language such as Python, Fortran, C++, or R, with the ability to learn from others as required.

•    Proficiency with the Linux operating system.

•    Ability to multi-task and work cooperatively with others.

•    The successful candidate will possess a high degree of trust and integrity, communicate openly and effectively, and display respect with a desire to foster teamwork.

Job Functions

Job Functions are physical actions and/or working conditions associated with the position.  These functions may also constitute essential functions for the job which the employee must be able to fulfill, with or without accommodation.  Information provided below is to help describe the job so that the applicant has a reasonable understanding of the job duties/expectations.  An applicant's ability to perform and/or tolerate these actions and conditions will be discussed, and workplace accommodations may be made on a case-by-case basis following an individualized assessment of the applicant and other considerations, including but not limited to any governing safety standards.

•    Motor Abilities: Sitting and/or standing for extended periods, bending/stooping, grasping/gripping, fine motor control (hands)

•    Physical exertion and/or requirements: Minimal, with ability to safely lift up to 25 pounds

•    Repetitive work: Prolonged

•    Special Senses: Visual and audio focused work

•    Work Conditions: Stairs, typing/keyboard, standard and/or sitting working environment of >8 hrs./day

•    Travel required: 0-5%

TerraPower’s technology is controlled for export by various agencies of the U.S. Government.  TerraPower must evaluate applicants who are foreign nationals (other than asylees, refugees, or lawful permanent residents) in accordance with U.S. Government export control requirements.  To facilitate TerraPower’s export control reviews, you will be asked as part of the application process to identify whether you are a U.S. Citizen or national, asylee, refugee, or lawful permanent resident of the United States.  Government export authorization approval times vary.  Based on the business needs for a particular position, TerraPower may not consider a foreign national from a country if it is impracticable to obtain timely Government export approval.

Job details

Salary Range Level 10: $113,605 - $170,408

*Typically, our employee salaries are within 0.90 – 1.0 of the mid-point of the posted salary band. Any salary offered within the posted salary band is based on market data and commensurate with the selected individual’s qualifications and experience. This range is specific to Washington State.

Job Type:  Full-time

Benefits:

•    Competitive Compensation

•    Salary, eligible to participate in discretionary short-term incentive payments

•    Comprehensive Medical and Wellness Benefits for family or individual

o         Vision

o         Dental

o         Life and Disability

o         Gender Affirmation Benefits

o         Parental Leave

•    401k Plan

•    Generous Paid Time Off (PTO)

o         21 days of annually accrued PTO

•    Generous Holiday Schedule

o         10 paid holidays

•    Relocation Assistance

•    Professional and Educational Support Opportunities

•    Flexible Work Schedule

TerraPower Career information:  https://www.terrapower.com/contact-us/careers/
