Senior Performance Software Engineer, Deep Learning Libraries
NVIDIA
Multiple Locations
We are now looking for a Senior Performance Software Engineer for Deep Learning Libraries! Do you enjoy tuning parallel algorithms and analyzing their performance? If so, we want to hear from you! As a deep learning library performance software engineer, you will be developing optimized code to accelerate linear algebra and deep learning operations on NVIDIA GPUs.
What you'll be doing:
- Writing highly tuned compute kernels, mostly in CUDA C++, to perform core deep learning operations (e.g. matrix multiplies, convolutions, normalizations);
- Following general software engineering best practices including support for regression testing and CI/CD flows;
- Collaborating with teams across NVIDIA:
  - CUDA compiler team on generating optimal assembly code;
  - Deep learning training and inference performance teams on which layers require optimization;
  - Hardware and architecture teams on the programming model for new deep learning hardware features.
What we need to see:
- PhD degree or equivalent experience in Computer Science, Computer Engineering, Applied Math, or a related field, or a Bachelor's or Master's degree plus 4-6 years of relevant industry experience;
- Demonstrated strong C++ programming and software design skills, including debugging, performance analysis, and test design;
- Experience with performance-oriented parallel programming, even if it’s not on GPUs (e.g. with OpenMP or pthreads);
- Solid understanding of computer architecture and some experience with assembly programming.
Ways to stand out from the crowd:
- Tuning BLAS or deep learning library kernel code;
- CUDA/OpenCL GPU programming;
- Numerical methods and linear algebra;
- LLVM, TVM tensor expressions, or TensorFlow MLIR.