Revolutionizing Historical Document Digitization: LSTM-Enhanced OCR for Arabic Handwritten Manuscripts

  • Safiullah Faizullah
  • Muhammad Sohaib Ayub
  • Turki Alghamdi
  • Toqeer Syed Ali
  • Muhammad Asad Khan
  • Emad Nabil

Research output: Contribution to a Journal (Peer & Non Peer), Article, peer-review

1 Citation (Scopus)

Abstract

Optical Character Recognition (OCR) has considerable practical value in handwritten document analysis, given its widespread use in human transactions. It converts documents or images into analyzable, editable, and searchable data. In this paper, we present a novel approach that combines transfer learning and Arabic OCR technology to digitize ancient handwritten scripts. Our method aims to preserve, and improve access to, extensive collections of historically significant materials, including manuscripts and rare books in delicate physical condition. After examining the challenges encountered in digitizing Arabic handwritten texts, we propose a transfer learning-based framework that leverages pre-trained models to overcome the scarcity of labeled data for training OCR systems. The experimental results demonstrate an improvement in the recognition accuracy of Arabic handwritten texts, making the framework a promising solution for the digitization of historical documents. The proposed approach is a step towards preserving our cultural heritage and facilitating advanced research in historical document analysis.

Original language: English
Pages (from-to): 1185-1194
Number of pages: 10
Journal: International Journal of Advanced Computer Science and Applications
Volume: 15
Issue number: 10
DOIs
Publication status: Published - 2024
Externally published: Yes

Keywords

  • Arabic OCR
  • classification
  • convolutional neural network
  • image processing
  • Optical character recognition
  • transfer learning
