14th International Conference on Computer and Knowledge Engineering
Improve the utility of tensor cores by compacting sparse matrix technique
Authors :
Mohammad.S Abazari (1), Mahsa Zahedi (2), Abdorreza Savadi (3)
1- Ferdowsi University of Mashhad
2- Ferdowsi University of Mashhad
3- Ferdowsi University of Mashhad
Keywords :
Tensor Cores, Neural Networks, Convolution Operations, Graphics Processing Unit
Abstract :
Neural networks are computationally demanding, particularly in their matrix multiplication operations. To address this, we propose a model that combines network pruning with matrix compression. Our approach targets NVIDIA's tensor cores, which are specialized for efficient matrix operations: we compress the network weights to match the tensor core structure and perform convolutions on the tensor cores using the compressed weight matrix. The model incorporates neural network pruning, mixed-precision training, and compression of the network weight tensors using the im2col algorithm and the CSR format. Multiplication is carried out by tensor kernels with a block size of 16x16. We evaluate four configurations: a pruned model, a model optimized with AMP, a model combining pruning and AMP, and our proposed model. The evaluation shows a significant performance improvement over a simple baseline model. The paper reviews related work, establishes the foundational concepts, presents the proposed model, and reports the obtained results.
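As a rough illustration of the ingredients named in the abstract, the following is a minimal NumPy sketch of a convolution expressed as a matrix product via im2col, with the pruned weight matrix stored in CSR format. It is not the authors' implementation: the magnitude-pruning threshold, the tensor shapes, and the helper names `im2col`, `to_csr`, and `csr_matmul` are assumptions made for the example, and the 16x16 tensor-core tiling and mixed-precision steps are not shown.

```python
import numpy as np


def im2col(x, kh, kw):
    """Unfold a (C, H, W) input into a (C*kh*kw, out_h*out_w) matrix.

    Assumes stride 1 and no padding (a simplification for this sketch).
    """
    c, h, w = x.shape
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = np.empty((c * kh * kw, out_h * out_w), dtype=x.dtype)
    row = 0
    for ci in range(c):
        for i in range(kh):
            for j in range(kw):
                # Each row holds one kernel offset, sampled at every output position.
                cols[row] = x[ci, i:i + out_h, j:j + out_w].reshape(-1)
                row += 1
    return cols


def to_csr(dense):
    """Convert a 2-D array to CSR arrays (data, indices, indptr)."""
    data, indices, indptr = [], [], [0]
    for r in dense:
        nz = np.nonzero(r)[0]
        data.extend(r[nz])
        indices.extend(nz)
        indptr.append(len(indices))
    return (np.array(data, dtype=dense.dtype),
            np.array(indices, dtype=np.int64),
            np.array(indptr, dtype=np.int64))


def csr_matmul(data, indices, indptr, dense_rhs):
    """Multiply a CSR matrix by a dense matrix, touching only stored non-zeros."""
    n_rows = len(indptr) - 1
    out = np.zeros((n_rows, dense_rhs.shape[1]), dtype=dense_rhs.dtype)
    for r in range(n_rows):
        start, end = indptr[r], indptr[r + 1]
        out[r] = data[start:end] @ dense_rhs[indices[start:end]]
    return out


rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8)).astype(np.float32)     # hypothetical input (C, H, W)
w = rng.standard_normal((4, 3, 3, 3)).astype(np.float32)  # hypothetical filters (O, C, kh, kw)

# Assumed pruning rule for the example: zero out small-magnitude weights.
w[np.abs(w) < 0.8] = 0.0

w_mat = w.reshape(4, -1)              # one row per output channel
cols = im2col(x, 3, 3)                # (27, 36) unfolded input
data, indices, indptr = to_csr(w_mat)

y_sparse = csr_matmul(data, indices, indptr, cols).reshape(4, 6, 6)
y_dense = (w_mat @ cols).reshape(4, 6, 6)
```

Here `y_sparse` and `y_dense` should agree up to floating-point rounding; the CSR path simply skips the pruned weights. On a GPU, the compacted non-zeros would additionally need to be arranged into dense tiles before tensor cores could process them, which this sketch does not attempt.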
Papers List
List of archived papers
Mitochondrial Segmentation in Microscopy Images Using UNet-VGG19
Zerek Sediq Hossein - Rojiar Pir Mohammadiani - Saadat Izadi
A New Inter-layer Similarity metric for link prediction in multilayer networks
Alireza Abdollahpouri - Samira Rafiee
AVID: A VARIATIONAL INFERENCE DELIBERATION FOR META-LEARNING
Alireza Javaheri - Arsham Gholamzadeh Khoee - Saeed Reza Kheradpisheh - Hadi Farahani - Mohammad Ganjtabesh
Averting Mode Collapse for Generative Zero-Shot Learning
Shayan Ramazi - Setare Shabani
Farsi Text in Scene: A new dataset
Ali Salmasi - Ehsanollah Kabir
A scalable blockchain-based educational network for data storage and assessment
Maryam Fattahi Vanani - Hamidreza Shayegh Borujeni - Ali Nourollah
TCAR: Thermal and Congestion-Aware Routing Algorithm in a Partially Connected 3D Network on Chip
Majid Nezarat - Masoomeh Momeni
Divide and Conquer Approach to Long Genomic Sequence Alignment
Mahmoud Naghibzadeh - Samira Babaei - Behshid Behkmal - Mojtaba Hatami
Segmentation of Coronary Artery Stenosis in X-ray Angiography using Mamba Models
Fatemeh Fouladi - Ali Rostami - Hedieh Sajedi
Robust Learning to Learn Graph Topologies
Navid Akhavan Attar - Ali Fahim