Remote, US
38 days ago
Director of Data and ML Engineering - Infinia
Job ID: 2024-4993
Job Locations: US-Remote
Country: United States
City: Remote
Worker Type: Regular Full-Time Employee

Overview

This is an incredible opportunity to be part of a company that has been at the forefront of AI and high-performance data storage innovation for over two decades. DataDirect Networks (DDN) is a global market leader renowned for powering many of the world's most demanding AI data centers, in industries ranging from life sciences and healthcare to financial services, autonomous cars, government, academia, research, and manufacturing.

  

"DDN's A3I solutions are transforming the landscape of AI infrastructure." – IDC 

  

“The real differentiator is DDN. I never hesitate to recommend DDN. DDN is the de facto name for AI Storage in high performance environments” - Marc Hamilton, VP, Solutions Architecture & Engineering | NVIDIA 

  

DDN is the global leader in AI and multi-cloud data management at scale. Our cutting-edge data intelligence platform is designed to accelerate AI workloads, enabling organizations to extract maximum value from their data. With a proven track record of performance, reliability, and scalability, DDN empowers businesses to tackle the most challenging AI and data-intensive workloads with confidence. 

  

Our success is driven by our unwavering commitment to innovation, customer-centricity, and a team of passionate professionals who bring their expertise and dedication to every project. This is a chance to make a significant impact at a company that is shaping the future of AI and data management. 

  

Our commitment to innovation, customer success, and market leadership makes this an exciting and rewarding role for a driven professional looking to make a lasting impact in the world of AI and data storage. 

Job Description

We are seeking an experienced and accomplished Director of Data and ML Engineering to lead our ML Engineering organization. In this role, you will oversee the design, deployment, and optimization of large-scale AI/ML training and inference pipelines that use Infinia as their foundational data storage. You will also oversee the development of connectors to major open-source frameworks for data ingestion and streaming, such as Delta Lake, Apache Iceberg, Mosaic Streaming, and Ray Data. You will guide a talented organization of engineers focused on an advanced end-to-end storage platform for data ingestion, transformation, preparation, and streaming for high-performance AI applications. Collaborating closely with software developers, product teams, and partners, you will lead experiments with state-of-the-art models using open-source tools and cloud platforms.

 

Key Responsibilities:

Leadership & Management:

Lead, mentor, and grow a team of senior ML and data engineers, fostering a culture of innovation and excellence.
Set strategic direction for the ML engineering team in alignment with company goals.
Lead strategic partnerships across all areas of AI, from conception through execution and delivery, and communicate complex technical concepts effectively to non-technical stakeholders.
Track, report, and manage the team’s performance against project milestones, ensuring on-time delivery of high-quality solutions.
Partner with architects, engineers, and cross-functional teams to ensure the delivery of innovative, high-quality technical designs.
Implement and refine engineering best practices, driving continuous improvements in quality, performance, and operational efficiency.

Technical Oversight:

Oversee the design and deployment of large-scale AI/ML training pipelines using tools such as Apache Spark and Apache Airflow.
Guide the integration of MLflow with DDN’s Infinia product for comprehensive experiment tracking, model versioning, and deployment.
Lead the integration of data ingestion and streaming pipelines with open-source tools such as Delta Lake, Apache Iceberg, Ray Data, Mosaic Streaming, tf.data, and the PyTorch DataLoader.
Drive the implementation and scaling of Retrieval-Augmented Generation (RAG) pipelines to enhance generative model performance.
Stay abreast of the latest developments in MLOps, AI/ML frameworks, and tooling.
Identify and implement solutions to optimize pipeline performance, runtime, and resource utilization on Infinia.

Required Qualifications:

Bachelor’s or Master’s degree in Computer Science, Data Science, Machine Learning, or a related field.
12+ years of experience in machine learning engineering, with at least 10 years in a leadership role.
Proven track record of building and scaling AI/ML pipelines and managing high-performing engineering teams.
Extensive experience with Apache Spark, Apache Airflow, and MLflow or equivalent tools.
Deep understanding of machine learning frameworks and libraries (TensorFlow, PyTorch, NVIDIA NeMo).
Experience deploying open-source vector databases at scale.
Proficiency with containerization tools (Docker, Kubernetes) and infrastructure as code (Terraform, Ansible).
Solid understanding of cloud infrastructure (AWS, GCP, Azure) and distributed computing.
Excellent problem-solving and troubleshooting abilities with a keen eye for performance optimization.
Strong leadership, communication, and interpersonal skills.
Ability to drive strategic initiatives and manage multiple projects simultaneously.

Preferred Skills:

Experience with large-scale data processing and storage solutions (Hadoop, Hive, HDFS, Trino).
Knowledge of NLP techniques and tools for model deployment.
Implementation-level understanding of ML frameworks, data loaders, data formats, and table formats.
Experience with scaling RAG pipelines and integrating them with generative AI models.
Experience in operationalizing AI/ML models in production environments.

This role offers an exceptional opportunity to lead a high-impact engineering organization at the core of DDN’s cutting-edge storage solutions. If you are passionate about solving complex technical challenges and driving innovation in high-performance systems, we encourage you to apply.

DDN

Our team is highly motivated and focused on engineering excellence.

We look for individuals who appreciate challenging themselves and thrive on curiosity.

Engineers are encouraged to work across multiple areas of the company.

We operate with a flat organizational structure.

All employees are expected to be hands-on and to contribute directly to the company’s mission.

Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important.

All engineers and researchers are expected to have strong communication skills.

They should be able to concisely and accurately share knowledge with their teammates.

 

Interview Process: After submitting your application, the team reviews your CV and statement of exceptional work. If your application passes this stage, you will be invited to a 30-minute interview (“phone interview”) during which a member of our team will ask some basic questions. If you clear the initial phone interview, you will enter the main process, which consists of four technical interviews:

Coding assessment in a language of your choice.
Systems design: Translate high-level requirements into a scalable, fault-tolerant service.
Systems hands-on: Demonstrate practical skills in a live problem-solving session.
Project deep-dive: Present your past exceptional work to a small audience.
Meet and greet with the wider team.

Our goal is to finish the main process within one week. We don’t rely on recruiters for assessments. Every application is reviewed by a member of our technical team.

 

DataDirect Networks, Inc. is an Equal Opportunity/Affirmative Action employer.  All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity, gender expression, transgender, sex stereotyping, sexual orientation, national origin, disability, protected Veteran Status, or any other characteristic protected by applicable federal, state, or local law.

 

