We are seeking experienced C++ Software Engineers to join our Spark Acceleration group.
Data scientists spend a considerable amount of time exploring data and iterating on machine learning (ML) experiments. NVIDIA believes that data science workflows can benefit tremendously from acceleration, which lets data scientists explore more and larger datasets and reach their business goals faster and more reliably.
You will work with the open source community to accelerate Apache Spark for data science. Apache Spark is the most popular data processing engine in data centers for data science. We aim to dramatically accelerate Apache Spark use cases without application code changes. You will work on open source libraries including Spark-RAPIDS (https://github.com/NVIDIA/spark-rapids), RAPIDS (https://github.com/rapidsai), and Velox (https://github.com/facebookincubator/velox).
What you'll be doing:
Design and implement a native Spark execution engine using RAPIDS, Velox, UCX, and other related libraries
Design and implement solutions to optimize data exchange between the Velox and RAPIDS libraries
Enhance the Velox OSS library for improved performance and Spark compatibility
Contribute to the RAPIDS library for large-scale adoption in major enterprises
Conduct performance benchmarking and profiling to achieve speed-of-light performance
Work with a team of exceptional engineers, including PMC members and committers of Apache Spark, Apache Hadoop, Apache Hive, and Apache Arrow
Present technical solutions at industry conferences and meetups
What we need to see:
BS, MS, or PhD in Computer Science, Computer Engineering, or a closely related field
8+ years of work or research experience in software development
3+ years of hands-on development experience with Velox, RAPIDS, or similar data processing frameworks, including memory management techniques and data serialization
Exceptional C++ development experience in design, programming, testing, and debugging
Design and development expertise in columnar data processing with SIMD (Single Instruction, Multiple Data) and vectorization techniques
Familiarity with operating systems and software development environments for ARM
Proven technical skills in designing and implementing high-quality distributed systems
Able to work successfully with multi-functional teams across organizational boundaries and geographies
Highly motivated with strong communication skills
Ways to stand out from the crowd:
Committership at major open source big-data projects
Working experience with GPU-accelerated libraries (CUDA, cuBLAS, NCCL, RAPIDS, UCX)
We are an AA/EEO/Disabled employer. With highly competitive salaries and a comprehensive benefits package, NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most brilliant and talented people on the planet working for us. Are you creative and autonomous? Do you love a challenge? If so, we want to hear from you.