15th International Conference on Computer and Knowledge Engineering
LLM-Driven AutoML for Cross-Lingual Handwritten OCR: Closed-Loop Neural Architecture Search with GPT-5, GPT-4o, and Claude Sonnet 4
Authors: Mobina Kashaniyan (1), Amirhossein Ghassemi (2), Nasser Mozayani (3)
1- Iran University of Science and Technology
2- Iran University of Science and Technology
3- Iran University of Science and Technology
Keywords :
large language models, neural architecture search, handwritten text recognition, multilingual OCR, automation, model discovery
Abstract :
Handwritten text recognition across diverse scripts remains an enduring challenge in machine learning, because each language and writing system introduces its own visual complexities. Traditional approaches depend on expert-guided model design and extensive preprocessing, which makes them difficult to scale and to adapt to new scripts. In this work, we introduce a fully automatic, cross-lingual pipeline in which large language models (GPT-5, GPT-4o, and Claude Sonnet 4) independently generate, evaluate, and refine neural network architectures for handwritten optical character recognition. The process requires no manual intervention, domain-specific preprocessing, or human selection of models, resulting in a complete end-to-end automated system. We apply this approach to Arabic, English, and Persian scripts, each representing distinct character shapes and writing traditions, and conduct 30 independent trials for each language. The pipeline consistently discovers efficient models, achieving average test accuracy above 93% while maintaining inference speeds that meet the needs of real-time applications. Notably, the system automatically explores a wide range of neural architectures and adaptively selects designs that fit the requirements of each script, without explicit guidance from human experts. These results show that large language models can move beyond language processing and act as independent designers of machine learning systems. This enables a scalable, script-agnostic, and fully automatic solution for multilingual handwritten text recognition, opening the door to rapid and adaptable deployment of OCR technology across many languages and domains.
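The generate, evaluate, and refine loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Candidate` fields, the latency budget, and the `stub_propose` and `stub_evaluate` functions are hypothetical stand-ins. In a real pipeline the proposer would presumably be an LLM call that reads the trial history, and the evaluator would train and time an actual model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical architecture description (fields are assumptions)."""
    conv_layers: int
    width: int


@dataclass
class Trial:
    candidate: Candidate
    accuracy: float
    latency_ms: float


def closed_loop_search(
    propose: Callable[[List[Trial]], Candidate],
    evaluate: Callable[[Candidate], Tuple[float, float]],
    rounds: int,
    latency_budget_ms: float,
) -> Tuple[Optional[Trial], List[Trial]]:
    """Propose a candidate, score it, and feed the result back each round."""
    history: List[Trial] = []
    best: Optional[Trial] = None
    for _ in range(rounds):
        # The proposer sees every earlier trial, which closes the loop.
        candidate = propose(history)
        accuracy, latency_ms = evaluate(candidate)
        trial = Trial(candidate, accuracy, latency_ms)
        history.append(trial)
        # Keep the most accurate candidate that fits the latency budget.
        within_budget = latency_ms <= latency_budget_ms
        if within_budget and (best is None or accuracy > best.accuracy):
            best = trial
    return best, history


def stub_propose(history: List[Trial]) -> Candidate:
    """Stand-in for an LLM call: deepen and widen the previous candidate."""
    if not history:
        return Candidate(conv_layers=1, width=16)
    last = history[-1].candidate
    return Candidate(conv_layers=last.conv_layers + 1, width=last.width * 2)


def stub_evaluate(candidate: Candidate) -> Tuple[float, float]:
    """Toy scoring in place of real training; the numbers are synthetic."""
    size = candidate.conv_layers * candidate.width / 16
    accuracy = 1.0 - 1.0 / (1.0 + size)
    latency_ms = 0.5 * size
    return accuracy, latency_ms
```

Calling `closed_loop_search(stub_propose, stub_evaluate, rounds=4, latency_budget_ms=10.0)` returns the best in-budget trial together with the full history. Passing the proposer and evaluator as plain callables keeps the loop independent of any particular LLM provider or training framework.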