Ong, Chin Tong (2015) Block-based neural network mapping on graphics processor unit. Masters thesis, Universiti Teknologi Malaysia, Faculty of Electrical Engineering.
Official URL: http://dms.library.utm.my:8080/vital/access/manage...
Abstract
Block-based neural networks (BbNNs) were introduced to improve the training speed of artificial neural networks, and previous researchers have carried out various works to improve the training speed of BbNN systems. Multithreaded BbNN training on a field-programmable gate array (FPGA) is limited in training speed by the low performance of the Nios II software used for communication between the central processing unit (CPU) and the FPGA. This project aims to improve the training speed of multithreaded BbNN blocks by mapping the BbNN model onto Compute Unified Device Architecture (CUDA) cores. Each BbNN block is mapped onto a CUDA core, with each core running a single thread. Functional verification of the BbNN core is based on the BbNN output accuracy: the near-100-percent accuracy obtained is used to verify the CUDA-mapped BbNN. A performance trade-off analysis was carried out by comparing the accuracy value obtained from BbNN evolution on graphics processing unit (GPU) versus CPU implementations. The results show that the CUDA-mapped BbNN can only be as fast as the CPU-mapped implementation. Although the CUDA-mapped implementation trains multiple BbNN blocks in parallel, the large data transfer between the CPU and the GPU offsets the performance gain from that parallelism. A significant gain in training speed can be seen only if the order of complexity of GPU execution is higher than that of the CPU-GPU data transfer. The results provide recommendations for future research on how to further improve the training speed of CUDA-based BbNN implementations.
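The closing observation about orders of complexity can be illustrated with a toy cost model. The sketch below is not taken from the thesis: the function names, the one-block-per-thread "wave" scheduling, and every constant (operation times, core count, bandwidth, latency) are hypothetical values chosen only to show the shape of the argument, namely that GPU time is compute time plus CPU-GPU transfer time, and that the transfer term caps the achievable speedup when both terms grow at the same order.

```python
import math


def cpu_time(n_blocks, ops_per_block, cpu_op_s=1e-9):
    """Sequential CPU model: blocks are processed one after another.

    cpu_op_s is an assumed time per operation, not a measured value.
    """
    return n_blocks * ops_per_block * cpu_op_s


def gpu_time(n_blocks, ops_per_block, bytes_moved,
             n_cores=1024, gpu_op_s=4e-9,
             bandwidth_bytes_s=8e9, latency_s=1e-5):
    """GPU model: one block per thread, plus the CPU-GPU copy.

    All default constants are illustrative assumptions.
    """
    # Blocks beyond the core count must wait for another wave of threads.
    waves = math.ceil(n_blocks / n_cores)
    compute = waves * ops_per_block * gpu_op_s
    # The transfer cost grows linearly with the amount of data copied.
    transfer = latency_s + bytes_moved / bandwidth_bytes_s
    return compute + transfer


def speedup(n_samples, compute_order, n_blocks=1024):
    """CPU time divided by GPU time for a given per-block work order.

    Per-block work is modelled as n_samples ** compute_order operations,
    while the data copied is always linear in n_samples (4 bytes per
    sample per block, an assumed float32 layout).
    """
    ops_per_block = n_samples ** compute_order
    bytes_moved = 4 * n_samples * n_blocks
    return (cpu_time(n_blocks, ops_per_block)
            / gpu_time(n_blocks, ops_per_block, bytes_moved))


if __name__ == "__main__":
    for order in (1, 2):
        for n in (10, 10**3, 10**6):
            print(f"order={order} n={n} speedup={speedup(n, order):.2f}")
```

Under these assumed constants, linear per-block work (`compute_order=1`) keeps the speedup bounded by a small constant however large `n_samples` grows, because the transfer term grows at the same rate as the computation. With quadratic work (`compute_order=2`) the transfer term becomes negligible and the speedup keeps rising toward the limit set by the number of cores.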
Item Type: | Thesis (Masters) |
---|---|
Additional Information: | Thesis (Master of Engineering (Electrical - Computer and Microelectronic Systems)) - Universiti Teknologi Malaysia, 2015; Supervisor: Assoc. Prof. Dr. Muhammad Nadzir Marsono |
Uncontrolled Keywords: | field-programmable gate array (FPGA), compute unified device architecture (CUDA) |
Subjects: | T Technology > TK Electrical engineering. Electronics Nuclear engineering |
Divisions: | Electrical Engineering |
ID Code: | 53959 |
Deposited By: | Fazli Masari |
Deposited On: | 06 Apr 2016 07:54 |
Last Modified: | 08 Oct 2020 04:38 |