12th International Conference on Computer and Knowledge Engineering
FAST: FPGA Acceleration of Neural Networks Training
Authors :
Alireza Borhani (1), Mohammad Hossein Goharinejad (2), Hamid Reza Zarandi (3)
1- Department of Computer Engineering, Amirkabir University of Technology
2- Department of Computer Engineering, Amirkabir University of Technology
3- Department of Computer Engineering, Amirkabir University of Technology
Keywords :
Field Programmable Gate Array, Embedded Devices, Artificial Neural Network, Machine Learning, Approximation
Abstract :
Training state-of-the-art artificial neural networks (ANNs) is computationally and memory intensive, so implementing training on embedded devices with limited resources is challenging. To address this challenge, we propose FAST, a low-precision method for implementing and optimizing ANN training on FPGAs. FAST first addresses the challenge of implementing the non-polynomial sigmoid activation function by presenting a solution based on PNLA methods. It then introduces the Hardware Optimized PReLU (HOPE) activation function, which is specifically devised to reduce the required resources and increase the accuracy of computations on FPGAs. We evaluated FAST against software implementations of ANNs, using training tasks from the MNIST benchmark. The results show that FAST improves training speed by 8.6× and reduces the required memory size by orders of magnitude, with almost no degradation in training accuracy.
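As a rough illustration of why a piecewise approximation helps with the non-polynomial sigmoid, the sketch below replaces the exponential with a second-order polynomial on [-4, 4] and saturates outside that range. This is a generic, well-known approximation chosen for illustration only. The abstract does not expand "PNLA" or give the segments and coefficients used by FAST, so the function names, the breakpoints at ±4, and the floating-point arithmetic here are all assumptions, not the authors' design.

```python
import math


def sigmoid_exact(x: float) -> float:
    """Reference sigmoid, which needs an exponential and a division."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_piecewise(x: float) -> float:
    """Illustrative piecewise second-order approximation (assumed, not from the paper).

    Outside [-4, 4] the output saturates to 0 or 1. Inside, only a subtraction,
    a scaling by 1/4 and 1/2 (shifts in fixed point), and one multiplication
    are required.
    """
    if x <= -4.0:
        return 0.0
    if x >= 4.0:
        return 1.0
    t = 1.0 - abs(x) / 4.0          # t falls from 1 at x = 0 to 0 at |x| = 4
    half_sq = 0.5 * t * t           # the single multiplication
    return half_sq if x < 0.0 else 1.0 - half_sq


def max_abs_error(lo: float = -8.0, hi: float = 8.0, steps: int = 1601) -> float:
    """Largest absolute deviation from the exact sigmoid on a uniform grid."""
    worst = 0.0
    for i in range(steps):
        x = lo + (hi - lo) * i / (steps - 1)
        worst = max(worst, abs(sigmoid_piecewise(x) - sigmoid_exact(x)))
    return worst


if __name__ == "__main__":
    print(f"max |error| on [-8, 8]: {max_abs_error():.4f}")
```

On this grid the deviation from the exact sigmoid stays below 0.03. A hardware version would typically use fixed-point values and shifts instead of floating-point operations.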