Remote, US
1 day ago
Senior Manager, Sustaining Engineering
Job Locations: US-Remote
Job ID: 2024-4961
Name Linked: Remote: US
Country: United States
City: Remote
Worker Type: Regular Full-Time Employee

Overview

This is an incredible opportunity to be part of a company that has been at the forefront of AI and high-performance data storage innovation for over two decades. DataDirect Networks (DDN) is a global market leader renowned for powering many of the world's most demanding AI data centers, in industries ranging from life sciences and healthcare to financial services, autonomous cars, government, academia, research, and manufacturing.

  

"DDN's A3I solutions are transforming the landscape of AI infrastructure." – IDC 

  

“The real differentiator is DDN. I never hesitate to recommend DDN. DDN is the de facto name for AI Storage in high performance environments” - Marc Hamilton, VP, Solutions Architecture & Engineering | NVIDIA 

  

DDN is the global leader in AI and multi-cloud data management at scale. Our cutting-edge data intelligence platform is designed to accelerate AI workloads, enabling organizations to extract maximum value from their data. With a proven track record of performance, reliability, and scalability, DDN empowers businesses to tackle the most challenging AI and data-intensive workloads with confidence. 

  

Our success is driven by our unwavering commitment to innovation, customer-centricity, and a team of passionate professionals who bring their expertise and dedication to every project. This is a chance to make a significant impact at a company that is shaping the future of AI and data management. 

  

Our commitment to innovation, customer success, and market leadership makes this an exciting and rewarding role for a driven professional looking to make a lasting impact in the world of AI and data storage. 

Job Description

We are looking for a Senior Manager for our L3 Sustaining Engineering team, which focuses on creating storage solutions for the most data-intensive workloads in the world, both HPC and AI/ML. The ideal candidate will have experience designing, implementing, and shipping software using Linux kernel development tooling and practices.


Key Responsibilities:

Handle customer escalations and direct the team to resolve them, working with different engineering teams.

Drive incremental improvements to incident runbooks by creating processes, performing post-incident evaluations, and storing them in the knowledge base.

Manage and lead a team through high-pressure, mission-critical escalations and incidents.

Establish strong relationships within and across peer teams.

Analyze bug reports efficiently and develop software fixes on multiple platforms.

Triage, diagnose, and troubleshoot problems in a professional and timely manner, often working in production customer environments.

Work with the Engineering managers and a geographically distributed team and customer base to ensure professional delivery and appropriate customer engagement and response.

Assist with performance tuning of features for specific environments and use cases.

Involve product engineers when deep technical expertise is needed within a specific area of the product.

Develop SLO/SLA processes and tools to accelerate problem analysis.

Track and coordinate bug fixes and communicate status back to Professional Services, Support, and customers.

Provide regular and ad hoc reports in an effective and timely manner.

 

Qualifications:

BS/MS in Computer Science, Computer Engineering, or equivalent degree/experience.

8+ years of software development experience with C in Linux environments.

8+ years of experience working with enterprise-class or HPC storage systems and/or distributed systems.

Strong team player and self-starter with good communication skills.

Excellent time management skills, with the ability to prioritize, multitask, and work under deadlines in a fast-paced environment.

Knowledge of parallel file systems, particularly Lustre, is highly preferred.

Familiarity with Linux kernel VFS, IO, and the Ext4 file system is preferred.

Experience with Git is strongly preferred; JIRA, Jenkins, Gerrit, and GitHub are assets.

DDN

Our team is highly motivated and focused on engineering excellence.

We look for individuals who appreciate challenging themselves and thrive on curiosity.

Engineers are encouraged to work across multiple areas of the company.

We operate with a flat organizational structure.

All employees are expected to be hands-on and to contribute directly to the company’s mission.

Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important.

All engineers and researchers are expected to have strong communication skills.

They should be able to concisely and accurately share knowledge with their teammates.

 

Interview Process: After submitting your application, the team reviews your CV and statement of exceptional work. If your application passes this stage, you will be invited to a 30-minute interview (“phone interview”) during which a member of our team will ask some basic questions. If you clear the initial phone interview, you will enter the main process, which consists of four technical interviews:

Coding assessment in a language of your choice.

Systems design: Translate high-level requirements into a scalable, fault-tolerant service.

Systems hands-on: Demonstrate practical skills in a live problem-solving session.

Project deep-dive: Present your past exceptional work to a small audience.

Meet and greet with the wider team.

Our goal is to finish the main process within one week. We don’t rely on recruiters for assessments. Every application is reviewed by a member of our technical team.

 

DataDirect Networks, Inc. is an Equal Opportunity/Affirmative Action employer.  All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity, gender expression, transgender, sex stereotyping, sexual orientation, national origin, disability, protected Veteran Status, or any other characteristic protected by applicable federal, state, or local law.

 

