Deep Learning Performance Architect

Published November 27, 2022
Location Shanghai, China
Category Deep Learning  
Job Type Full-time  


NVIDIA is developing processor and system architectures that accelerate machine learning, automotive and high-performance computing (HPC) applications. We are looking for a technical expert to lead our DL performance projections and analysis effort. This position offers the opportunity to make a meaningful impact in a fast-moving, technology-focused company.


What you'll be doing:

  • Establish DL applications and use-cases for analysis and projections.
  • Specify hardware/software configurations and metrics to analyze performance, power, accuracy and resiliency in uniprocessor and multiprocessor configurations.
  • Develop tools, infrastructure and methodologies for measurements, comparisons and reports.
  • Create and maintain workloads and micro-benchmark suites.
  • Generate projections, comparisons and analysis reports for internal/external consumption.
  • Collaborate across the company to guide the direction of next-gen deep learning HW/SW by working with architecture, software and product teams.


What we need to see:

  • MS or PhD in a relevant discipline (CS, EE, CE).
  • 3+ years of experience with micro-benchmarks, profiling and architecture analysis.
  • Strong software skills with Python, MPI, OpenMP, etc.
  • Familiarity with GPU computing and parallel programming models.
  • Excellent oral and written communication skills.
  • Good organizational, time management and task prioritization skills.