I have been developing a neural network API for GPUs. It is based on CUDA, and I have tried to exploit the full potential of the hardware.
Most of us do not know the real potential of GPUs in AI. This kind of hardware specializes in massively parallel arithmetic, and it was originally driven by the game-engine and video-game industry. In recent years we have experienced a 'boom' in graphics cards: improvements in speed, price, and ease of programming have opened up a new mother-lode.
GPUs provide massive parallelism that algorithms can exploit to improve performance.
As we know, neural networks rely on matrix-to-matrix and matrix-to-vector operations, and we can take advantage of optimizing them. Unfortunately, feed-forward and backpropagation are inherently sequential across layers: each layer's output depends on the previous layer's, so the layer-to-layer chain cannot be parallelized; only the matrix operations inside each layer can.
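To make this concrete, here is a minimal NumPy sketch (hypothetical, not the author's CUDA API) of a feed-forward pass. Each layer is one batched matrix multiply, which is exactly the kind of operation a GPU parallelizes well, while the `for` loop over layers must run in order. The layer sizes and the ReLU activation are illustrative assumptions.

```python
import numpy as np

def feed_forward(x, weights, biases):
    """Forward pass: each layer's matmul parallelizes over the batch,
    but layers must be evaluated sequentially (each depends on the last)."""
    a = x
    for W, b in zip(weights, biases):
        z = a @ W + b           # batched matrix multiply: GPU-friendly work
        a = np.maximum(z, 0.0)  # ReLU activation (illustrative choice)
    return a

# Toy example: a batch of 4 inputs through layers 3 -> 5 -> 2.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((3, 5)), rng.standard_normal((5, 2))]
biases = [np.zeros(5), np.zeros(2)]
out = feed_forward(rng.standard_normal((4, 3)), weights, biases)
print(out.shape)  # (4, 2): one output row per sample in the batch
```

On a GPU, the matmul inside each iteration is spread across thousands of cores, but the loop itself cannot be unrolled across devices because of the data dependency.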
For my neural network training, I have used an NVIDIA GeForce GTX 1070:
- 1920 cores
- 1.7 GHz
I used the MNIST dataset to train a deep neural network (DNN) for digit classification.
Topology: 784 x 20 x 10
Time: 1.65 seconds / Batch size: 1000
Training time is directly proportional to the batch size.
| Batch size | Time (s) |
|------------|----------|