14th International Conference on Computer and Knowledge Engineering
Improve the utility of tensor cores by compacting sparse matrix technique
Authors :
Mohammad.S Abazari 1
Mahsa Zahedi 2
Abdorreza Savadi 3
1- Ferdowsi University of Mashhad
2- Ferdowsi University of Mashhad
3- Ferdowsi University of Mashhad
Keywords :
Tensor Cores, Neural Networks, Convolution Operations, Graphics Processing Unit
Abstract :
Neural networks have demanding computational requirements, particularly in matrix multiplication operations. To address this challenge, we propose a model that combines network pruning and matrix compression techniques. Our approach leverages NVIDIA's tensor cores, which excel at efficient matrix operations. We compress the network weights based on the tensor core structure and perform convolutions using the compressed weight matrix on the tensor cores. Our model incorporates neural network pruning, mixed-precision training, and compression of network weight tensors using the im2col algorithm and CSR format. We also utilize tensor kernels with a block size of 16x16 for multiplication. We evaluate the performance of various models, including pruned, AMP-optimized, combined pruning and AMP techniques, and our proposed model. Our evaluation reveals a significant improvement in performance compared to a simple baseline model. Through an extensive analysis of related works, we establish foundational concepts, present our proposed model, and share the obtained results.
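The abstract names three steps: pruning, im2col unfolding, and CSR storage of the weights. They can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation. The function names (`im2col`, `prune`, `to_csr`, `csr_matmul`), the magnitude threshold, and the toy 3x3 input are assumptions chosen for illustration, and the 16x16 tensor-core tiling and mixed-precision arithmetic are not modeled here.

```python
def im2col(image, k):
    """Unfold every k x k patch of a 2-D image (stride 1, no padding) into a column."""
    h, w = len(image), len(image[0])
    out_h, out_w = h - k + 1, w - k + 1
    # Row r holds element r of each flattened patch; there is one column per patch.
    cols = [[0.0] * (out_h * out_w) for _ in range(k * k)]
    for i in range(out_h):
        for j in range(out_w):
            patch = i * out_w + j
            for di in range(k):
                for dj in range(k):
                    cols[di * k + dj][patch] = image[i + di][j + dj]
    return cols


def prune(weights, threshold):
    """Magnitude pruning (an assumed criterion): zero out weights below the threshold."""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def to_csr(dense):
    """Compress a dense matrix into CSR arrays: values, column indices, row pointers."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for c, v in enumerate(row):
            if v != 0.0:
                values.append(v)
                col_idx.append(c)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matmul(csr, dense, n_cols):
    """Multiply a CSR matrix by a dense matrix, touching only stored nonzeros."""
    values, col_idx, row_ptr = csr
    out = []
    for r in range(len(row_ptr) - 1):
        acc = [0.0] * n_cols
        for p in range(row_ptr[r], row_ptr[r + 1]):
            src = dense[col_idx[p]]
            for c in range(n_cols):
                acc[c] += values[p] * src[c]
        out.append(acc)
    return out


# Toy example: two 2x2 filters, each flattened to one row of the weight matrix.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
filters = [[0.5, 0.01, 0.0, -1.0], [0.02, 2.0, -0.03, 0.0]]

columns = im2col(image, 2)                # 4 x 4 patch matrix
weights_csr = to_csr(prune(filters, 0.1)) # only 3 of 8 weights survive pruning
result = csr_matmul(weights_csr, columns, len(columns[0]))
```

Each row of `result` is one filter's output feature map in flattened form. On a GPU the same product would be split into dense tiles sized for the tensor cores, which is the part this sketch leaves out.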