HAL : inria-00435501, version 1

International Journal of Pattern Recognition and Artificial Intelligence 23, 8 (2009) 1599-1632
Automation of Indian Postal Documents written in Bangla and English
Szilárd Vajda (1), Kaushik Roy (2, 3, 4), Umapada Pal (3), Bidyut B. Chaudhuri (3), Abdel Belaïd (1)

In this paper, we present a system for Indian postal automation based on pin-code and city name recognition. First, non-text blocks (postal stamp, postal seal, etc.) are detected using the Run Length Smoothing Approach (RLSA), and the Destination Address Block (DAB) of the postal document is identified from positional information. Next, the lines and words of the DAB are segmented. In India, the address part of a postal document may be written in a combination of two scripts: Latin (English) and a local (state/region) script. It is very difficult to identify the script in which the pin-code is written. To overcome this problem, we use a general two-stage scheme based on artificial neural networks to recognize pin-code digits written in either of the two scripts. To identify the script in which a word or city name is written, we propose a feature based on the water reservoir concept. For city name recognition, we propose a technique based on the NSHP-HMM (Non-Symmetric Half Plane Hidden Markov Model). At present, the accuracy of the proposed digit recognition module is 93.14%, while that of the city name recognition scheme is 86.44%.
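As an illustration of the run-length smoothing step mentioned in the abstract, the following is a minimal sketch of horizontal smoothing on a binary image. It is not the authors' implementation: the function names `rlsa_row` and `rlsa_horizontal`, the convention that 1 is ink and 0 is background, and the choice to fill only gaps bounded by ink on both sides are assumptions made for this example.

```python
def rlsa_row(row, threshold):
    """Smooth one binary row: fill short background gaps between ink pixels.

    Assumption for this sketch: 1 = ink, 0 = background, and only gaps
    bounded by ink on both sides are filled.
    """
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen so far
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_ink is not None:
                gap = i - last_ink - 1  # number of background pixels in between
                if 0 < gap <= threshold:
                    for j in range(last_ink + 1, i):
                        out[j] = 1
            last_ink = i
    return out


def rlsa_horizontal(image, threshold):
    """Apply the row smoothing to every row of a binary image (list of lists)."""
    return [rlsa_row(row, threshold) for row in image]


if __name__ == "__main__":
    sample = [
        [1, 0, 0, 1, 0, 0, 0, 0, 1, 0],
        [0, 1, 0, 1, 0, 0, 0, 0, 0, 0],
    ]
    for smoothed_row in rlsa_horizontal(sample, threshold=2):
        print(smoothed_row)
```

With a small threshold, nearby strokes merge into blocks while widely separated regions stay apart; the resulting blocks can then be classified as text or non-text by size and position. The threshold value here is arbitrary and would need tuning for real postal images.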
2 :  Dept. of ECE
3 :  CVPR - Computer Vision and Pattern Recognition Unit
4 :  IPNL - Institut de Physique Nucléaire de Lyon
Computer Science/Document and Text Processing
List of files attached to this document:
ijprai-final-12-11-08.pdf(752.2 KB)