15th International Conference on Computer and Knowledge Engineering
LLM-Driven AutoML for Cross-Lingual Handwritten OCR: Closed-Loop Neural Architecture Search with GPT-5, GPT-4o, and Claude Sonnet 4
Authors :
Mobina Kashaniyan (1)
Amirhossein Ghassemi (2)
Nasser Mozayani (3)
1- Iran University of Science and Technology
2- Iran University of Science and Technology
3- Iran University of Science and Technology
Keywords :
large language models, neural architecture search, handwritten text recognition, multilingual OCR, automation, model discovery
Abstract :
Handwritten text recognition across diverse scripts remains an enduring challenge in machine learning, because each language and writing system introduces its own visual complexities. Traditional approaches depend on expert-guided model design and extensive preprocessing, which makes them difficult to scale and slow to adapt to new scripts. In this work, we introduce a fully automatic, cross-lingual pipeline in which large language models (GPT-5, GPT-4o, and Claude Sonnet 4) independently generate, evaluate, and refine neural network architectures for handwritten optical character recognition. The process requires no manual intervention, domain-specific preprocessing, or human model selection, yielding a complete end-to-end automated system. We apply this approach to Arabic, English, and Persian scripts, each with distinct character shapes and writing traditions, and conduct thirty independent trials for every language. The pipeline consistently discovers efficient models, achieving average test accuracy above 93% while maintaining inference speeds that meet the needs of real-time applications. Notably, the system automatically explores a wide range of neural architectures and adaptively selects designs that fit the requirements of each script, without explicit guidance from human experts. These results show that large language models can move beyond language processing and act as independent designers of machine learning systems. This enables a scalable, script-agnostic, and fully automatic solution for multilingual handwritten text recognition, opening the door to rapid and adaptable deployment of OCR technology across many languages and domains.
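The generate, evaluate, and refine loop described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the prompts, the search space, or the training setup, so everything here is an assumption made for illustration: `Candidate`, `stub_proposer`, `toy_score`, and `closed_loop_search` are hypothetical names, the proposer is a simple perturbation rule standing in for an LLM call, and the score is a toy function standing in for training and validating a real model.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical architecture description (not the paper's search space)."""
    conv_blocks: int
    width: int


History = List[Tuple[Candidate, float]]


def stub_proposer(history: History, rng: random.Random) -> Candidate:
    """Stand-in for the LLM call: perturbs the best candidate seen so far."""
    if not history:
        return Candidate(conv_blocks=2, width=32)
    best, _ = max(history, key=lambda item: item[1])
    return Candidate(
        conv_blocks=max(1, best.conv_blocks + rng.choice([-1, 0, 1])),
        width=max(8, best.width + rng.choice([-16, 0, 16])),
    )


def toy_score(candidate: Candidate) -> float:
    """Placeholder for train-and-validate; peaks at 4 blocks and width 64."""
    penalty = abs(candidate.conv_blocks - 4) * 0.05 + abs(candidate.width - 64) / 640
    return max(0.0, 1.0 - penalty)


def closed_loop_search(
    propose: Callable[[History, random.Random], Candidate],
    evaluate: Callable[[Candidate], float],
    rounds: int,
    seed: int = 0,
) -> Tuple[Candidate, float, History]:
    """Run the loop: propose a design, score it, feed the history back."""
    rng = random.Random(seed)
    history: History = []
    for _ in range(rounds):
        candidate = propose(history, rng)
        history.append((candidate, evaluate(candidate)))
    best, best_score = max(history, key=lambda item: item[1])
    return best, best_score, history


if __name__ == "__main__":
    best, best_score, history = closed_loop_search(stub_proposer, toy_score, rounds=20)
    print(best, round(best_score, 3), len(history))
```

In a real system, `propose` would presumably send the evaluation history to a language model and parse an architecture from its reply, and `evaluate` would train the proposed network on the target script and report validation accuracy and inference time. Keeping both behind plain callables lets the loop itself stay independent of any particular model provider or dataset.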