14th International Conference on Computer and Knowledge Engineering
Improve the utility of tensor cores by compacting sparse matrix technique
Authors :
Mohammad.S Abazari (1), Mahsa Zahedi (2), Abdorreza Savadi (3)
1- Ferdowsi University of Mashhad
2- Ferdowsi University of Mashhad
3- Ferdowsi University of Mashhad
Keywords :
Tensor Cores, Neural Networks, Convolution Operations, Graphics Processing Unit
Abstract :
Neural networks are computationally demanding, particularly in their matrix multiplication operations. To address this challenge, we propose a model that combines network pruning with matrix compression. Our approach targets NVIDIA's tensor cores, which excel at efficient matrix operations: we compress the network weights to fit the tensor core structure and perform convolutions on the tensor cores using the compressed weight matrix. The model incorporates neural network pruning, mixed-precision training, and compression of the network weight tensors using the im2col algorithm and the CSR (compressed sparse row) format, with multiplication carried out by tensor kernels with a block size of 16x16. We evaluate several configurations: a pruned model, an AMP-optimized (automatic mixed precision) model, a model combining pruning and AMP, and our proposed model. The evaluation shows a significant performance improvement over a simple baseline model. We also review related work, establish the foundational concepts, present the proposed model, and report the results obtained.
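As a rough illustration of the im2col and CSR steps named in the abstract, the sketch below unrolls a small single-channel image into columns, stores a pruned filter matrix in CSR form, and multiplies the two to obtain the convolution output. It is not the authors' implementation: it runs on the CPU in plain Python, it omits the 16x16 tensor core tiling and mixed precision, and the function names (`im2col`, `to_csr`, `csr_matmul`), the stride of 1, the lack of padding, and the sample values are all assumptions made for the example.

```python
def im2col(image, kh, kw):
    """Unroll every kh x kw patch (stride 1, no padding) into one column."""
    h, w = len(image), len(image[0])
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = [[0.0] * (out_h * out_w) for _ in range(kh * kw)]
    for i in range(out_h):
        for j in range(out_w):
            col = i * out_w + j
            for di in range(kh):
                for dj in range(kw):
                    cols[di * kw + dj][col] = image[i + di][j + dj]
    return cols


def to_csr(dense):
    """Store only the nonzero weights as (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for c, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(c)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matmul(csr, dense):
    """Multiply a CSR matrix by a dense matrix, skipping pruned weights."""
    values, col_idx, row_ptr = csr
    n_cols = len(dense[0])
    out = []
    for r in range(len(row_ptr) - 1):
        acc = [0.0] * n_cols
        for k in range(row_ptr[r], row_ptr[r + 1]):
            v, c = values[k], col_idx[k]
            for j in range(n_cols):
                acc[j] += v * dense[c][j]
        out.append(acc)
    return out


# Hypothetical data: a 3x3 image and two pruned 2x2 filters, one per row,
# each flattened in row-major order.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
filters = [[1, 0, 0, -1],
           [0, 2, 0, 0]]

columns = im2col(image, 2, 2)      # 4 rows (patch elements) x 4 columns (patches)
weights = to_csr(filters)          # 3 stored values instead of 8
result = csr_matmul(weights, columns)
print(result)                      # one row of outputs per filter
```

Each row of `result` holds one filter's outputs over the 2x2 grid of patch positions, flattened in row-major order. In a GPU setting the dense multiply would instead be tiled into fixed-size blocks, which is where the 16x16 block size in the abstract would come in.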