My question in short: Is the VRAM bandwidth of a GPU really the bottleneck for deep neural networks?
Longer version: Tim Dettmers wrote in his blog that all the relevant operations on the graphics processor itself run faster than the memory bandwidth can supply new data. That seems reasonable. But yesterday I ran my own experiments and found that this is not the case with my GPU. It's an Nvidia GTX 560 Ti with 1 GB VRAM. As you can see, it's quite a slow card and doesn't have much VRAM. With AlexNet, 128x128 images and a batch size of 4 (bigger batches do not fit in my VRAM), the results are:
2200 MHz memory clock: 13:34 minutes
1650 MHz memory clock: 14:17 minutes
Almost no speedup: the run with the 1 1/3 times higher memory clock took about 95% of the time of the slower one. It could be that in my situation the graphics processor really is too slow, or that the batch size is just too small and the training is therefore bound by the PCIe bandwidth (PCIe 2.0). I also tried it on my own net with ~1 million parameters and various batch sizes up to 1024 with 40x40 images. There, too, the memory clock made almost no difference.
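To make the arithmetic behind the AlexNet comparison explicit, here is a small sketch in Python. It converts the two measured times to seconds and compares the runtime ratio with the memory clock ratio. The last step is my own assumption and not something measured: it uses a simple two-part model in which one part of the runtime scales inversely with the memory clock and the rest does not change at all. The helper name `to_seconds` is made up for this example.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '13:34' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Measured AlexNet training times from the post.
t_fast = to_seconds("13:34")  # run at 2200 MHz memory clock
t_slow = to_seconds("14:17")  # run at 1650 MHz memory clock

# How much faster the memory clock is, and how much faster the run got.
clock_ratio = 2200 / 1650      # 1 1/3
time_ratio = t_fast / t_slow   # fraction of the slow run's time

# Assumed model (not measured): t_slow = t_fast * ((1 - f) + f * clock_ratio),
# where f is the share of the fast run's time that scales with the memory clock.
# Solving for f gives a rough estimate of the memory-bound share.
memory_bound_fraction = (t_slow / t_fast - 1) / (clock_ratio - 1)

print(f"clock ratio:           {clock_ratio:.3f}")
print(f"time ratio:            {time_ratio:.3f}")
print(f"memory-bound fraction: {memory_bound_fraction:.3f}")
```

Under that assumed model, only a small share of the runtime on this card would depend on the memory clock, which matches the "almost no speedup" observation. It does not say whether the rest is compute, PCIe transfers or something else.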
After this not very representative experiment, I'm still unsure what really matters for GPU performance.