15th International Conference on Computer and Knowledge Engineering
LLM-Driven AutoML for Cross-Lingual Handwritten OCR: Closed-Loop Neural Architecture Search with GPT-5, GPT-4o, and Claude Sonnet 4
Authors :
Mobina Kashaniyan (1), Amirhossein Ghassemi (2), Nasser Mozayani (3)
1- Iran University of Science and Technology
2- Iran University of Science and Technology
3- Iran University of Science and Technology
Keywords :
large language models, neural architecture search, handwritten text recognition, multilingual OCR, automation, model discovery
Abstract :
Handwritten text recognition across diverse scripts remains an enduring challenge in machine learning, because each language and writing system introduces its own visual complexities. Traditional approaches depend on expert-guided model design and extensive preprocessing, which makes them difficult to scale and to adapt to new scripts. In this work, we introduce a fully automatic, cross-lingual pipeline in which large language models (GPT-5, GPT-4o, and Claude Sonnet 4) independently generate, evaluate, and refine neural network architectures for handwritten optical character recognition. The process requires no manual intervention, domain-specific preprocessing, or human model selection, resulting in a complete end-to-end automated system. We apply this approach to Arabic, English, and Persian scripts, each with distinct character shapes and writing traditions, and conduct thirty independent trials for every language. The pipeline consistently discovers efficient models with high test accuracy, averaging above 93%, while maintaining inference speeds suitable for real-time applications. Notably, the system automatically explores a wide range of neural architectures and adaptively selects designs that fit the requirements of each script, without explicit guidance from human experts. These results show that large language models can move beyond language processing and act as independent designers of machine learning systems. This enables a scalable, script-agnostic, and fully automatic solution for multilingual handwritten text recognition, opening the door to rapid and adaptable deployment of OCR technology across many languages and domains.
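The generate, evaluate, and refine loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Candidate` fields, `stub_proposer`, `toy_evaluate`, and `closed_loop_search` are all hypothetical names introduced here, the proposer is a random perturbation standing in for an LLM call, and the evaluator is a synthetic score standing in for training and testing an OCR model.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical architecture description (fields are assumptions)."""
    conv_layers: int
    hidden_units: int


# Each history entry pairs a candidate with the score it received.
History = List[Tuple[Candidate, float]]


def stub_proposer(history: History, rng: random.Random) -> Candidate:
    """Stand-in for an LLM call: perturbs the best candidate seen so far."""
    if not history:
        return Candidate(conv_layers=2, hidden_units=64)
    best, _ = max(history, key=lambda item: item[1])
    return Candidate(
        conv_layers=max(1, best.conv_layers + rng.choice([-1, 0, 1])),
        hidden_units=max(16, best.hidden_units + rng.choice([-32, 0, 32])),
    )


def toy_evaluate(candidate: Candidate) -> float:
    """Placeholder for train-and-test: a synthetic score that peaks at (4, 128)."""
    penalty = (abs(candidate.conv_layers - 4) * 0.05
               + abs(candidate.hidden_units - 128) / 1000)
    return max(0.0, 1.0 - penalty)


def closed_loop_search(
    propose: Callable[[History, random.Random], Candidate],
    evaluate: Callable[[Candidate], float],
    rounds: int,
    seed: int = 0,
) -> Tuple[Candidate, float]:
    """Propose, score, and feed the results back for a fixed number of rounds."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    rng = random.Random(seed)
    history: History = []
    for _ in range(rounds):
        candidate = propose(history, rng)   # generate (sees past feedback)
        score = evaluate(candidate)         # evaluate
        history.append((candidate, score))  # feedback used in the next round
    return max(history, key=lambda item: item[1])


best, score = closed_loop_search(stub_proposer, toy_evaluate, rounds=10)
print(best, round(score, 3))
```

In a real system the proposer would presumably send the scored history to a language model and parse its reply into an architecture, and the evaluator would train the model on a handwriting dataset; both are left as stubs here because the abstract does not specify them.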