14th International Conference on Computer and Knowledge Engineering
Distilled BERT Model In Natural Language Processing
Authors :
Yazdan Zandiye Vakili (1), Avisa Fallah (2), Hedieh Sajedi (3)
1- School of Mathematics, Statistics and Computer Science, University of Tehran, Tehran, Iran
2- School of Mathematics, Statistics and Computer Science, University of Tehran, Tehran, Iran
3- School of Mathematics, Statistics and Computer Science, University of Tehran, Tehran, Iran
Keywords :
NLP, Machine Learning, Distillation, BERT, Transformers
Abstract :
This paper reviews the evolution of Natural Language Processing (NLP) models, focusing on the distillation techniques used to create efficient, compact versions of large models. Traditional NLP models laid the foundation but were limited in scalability and contextual understanding. Transformer models such as BERT revolutionized NLP but require significant computational resources. This review examines TinyBERT, DistilBERT, MobileBERT, and MiniLM, which use knowledge distillation to balance model size against performance. These distilled models maintain high performance while remaining suitable for deployment on resource-constrained devices, making advanced NLP capabilities accessible in real-world applications.