13th International Conference on Computer and Knowledge Engineering

Pruning and Mixed Precision Techniques for Accelerating Neural Network

Authors :

Mahsa Zahedi¹, Mohammad Sediq Abazari Bozhgani², Abdorreza Savadi³

1- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
2- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
3- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran

Keywords :

Prune, Mixed Precision, Neural Network, Machine Learning, Image Processing

Abstract :

This study investigates pruning and mixed precision as techniques for accelerating neural network inference, focusing on the AlexNet model trained on the MNIST dataset. Pruning removes unnecessary components from the network, while mixed precision improves memory and computational efficiency. Structured pruning is applied to create a pruned model, which achieves a shorter inference time than the baseline. Automatic mixed precision, employed separately, also improves inference speed. Combining both techniques in a single model yields significantly faster inference than either approach alone. The results highlight the potential of this combination to reduce network size, optimize memory utilization, and accelerate computation. The findings provide practical insights for integrating these techniques into the AlexNet model and lay the groundwork for future exploration of larger and more complex models. This work holds promise for developing faster and more efficient deep learning models that meet the demands of real-world applications, particularly in resource-constrained environments.
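The two ideas named in the abstract can be illustrated with a toy sketch in plain Python. It is not the authors' implementation: the L1-norm ranking criterion, the helper names (`to_fp16`, `prune_filters`, `forward`), and the example weights are all assumptions chosen for illustration. Structured pruning is shown as removing whole filters, and mixed precision as storing values in half precision while accumulating in higher precision.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def l1_norm(filt):
    """Sum of absolute weights, used here as the (assumed) importance score."""
    return sum(abs(w) for w in filt)


def prune_filters(filters, keep_ratio):
    """Structured pruning: drop whole filters, keeping those with the largest L1 norm.

    Returns the surviving filters and their original indices.
    """
    n_keep = max(1, round(len(filters) * keep_ratio))
    ranked = sorted(range(len(filters)),
                    key=lambda i: l1_norm(filters[i]), reverse=True)
    kept = sorted(ranked[:n_keep])  # preserve the original filter order
    return [filters[i] for i in kept], kept


def forward(filters, inputs, half=False):
    """Dot product of every filter with the inputs.

    With half=True, weights and inputs are first rounded to fp16, while the
    sum is still accumulated in Python's double precision.
    """
    if half:
        filters = [[to_fp16(w) for w in f] for f in filters]
        inputs = [to_fp16(x) for x in inputs]
    return [sum(w * x for w, x in zip(f, inputs)) for f in filters]


# Hypothetical 4-filter layer with 3 weights per filter.
filters = [
    [0.5, -1.0, 0.25],
    [0.01, 0.02, -0.01],
    [2.0, 0.0, -1.5],
    [0.1, 0.1, 0.1],
]
inputs = [1.0, 2.0, 3.0]

pruned, kept = prune_filters(filters, keep_ratio=0.5)
full = forward(pruned, inputs)
mixed = forward(pruned, inputs, half=True)
print("kept filters:", kept)
print("full precision:", full)
print("mixed precision:", mixed)
```

Because entire filters are removed, the pruned layer performs fewer multiply-accumulate operations, which is where a structured approach can save inference time. In a real network the matching input channels of the following layer would also have to be removed.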