From an Nvidia Interview to Little’s Law & the Roofline Model

Last Thursday, I had my fourth interview for an architecture intern position at Nvidia. Three questions from it are worth sharing: the first two are questions the interviewer asked me, and the third is one I asked them.


Questions from the interviewer:

Q1: Why does a GPU run so many threads?

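A1: To hide memory latency. While some threads wait on a memory access, the GPU switches to other threads that are ready to run, so the execution units stay busy.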


This question is actually easy, but the next one he asked really made me “suffer”.


