Santa Clara, CA, USA
26 days ago
Principal Engineer, Agentic System Architecture

NVIDIA has been redefining computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and outstanding people! Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world!

NVIDIA is seeking a principal engineer to craft the architecture of systems that apply agents, agentic frameworks, and LLMs. This role focuses on designing, developing, and deploying scalable, high-performance systems that integrate autonomous agents built using existing and future frameworks. The ideal candidate will have a strong background in architecting frameworks and a demonstrated ability to build robust platforms that address both production and research needs. As a Principal Engineer, you will lead the architectural design of end-to-end systems that incorporate multi-agent workflows, with the overarching goal of creating frameworks and reference architectures that allow enterprises to unlock data at PB+ scale. This AI query engine will allow any developer or enterprise to quickly implement applications that improve the efficiency and lives of employees. You'll work closely with teams from different functions to ensure the scalability, security, and performance optimization of this reference architecture while offering mentorship to engineers and encouraging innovation in the agentic systems space.

What you'll be doing:

Architect and design large-scale systems of agentic frameworks (e.g., LangGraph, Llama Deploy, AutoGen, CrewAI) that form the base of an AI query engine.

Develop system-level solutions that integrate autonomous agents with existing infrastructure, ensuring scalability, reliability, and composability.

Evaluate and select appropriate foundational technologies for various use cases, ensuring optimal extensibility.

Ensure system robustness by implementing security best practices, including those for LLM safety and reliability, and remain at the forefront of advancements in LLMs, agentic frameworks, and distributed system design.

Provide technical leadership across projects and internal groups while mentoring engineers in system architecture design.

What we need to see:

BS or MS in Computer Engineering, Computer Science, or a closely related quantitative field (or equivalent experience).

15+ years of experience in software engineering or system architecture with a strong focus on Python or C++ development.

Proven track record of architecting production-level systems that use distributed computing.

Self-starter with excellent collaboration skills and a history of working with cross-discipline teams including research and production.

Strong desire to work on the cutting edge of technology in a rapidly evolving environment while quickly learning and applying new technologies and libraries.

Ways to stand out from the crowd:

PhD (or equivalent) in Computer Engineering, Computer Science, or a closely related field.

Deep understanding of high-performance parallel computing, with experience in multi-threaded or multi-process environments.

Familiarity with AI frameworks (e.g., PyTorch, TensorFlow) and NVIDIA technologies (e.g., CUDA, TensorRT, Triton), experience developing for GPU platforms, and an understanding of GPU architectures.

Proficiency in profiling Python code to optimize performance (e.g., asyncio vs threading vs multiprocessing).

Demonstrated history of contributing to and building open-source software projects that address real enterprise challenges at PB+ scale.

Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer you and your family at www.nvidiabenefits.com.

The base salary range is 272,000 USD - 419,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
