International Journal of Innovation and Scientific Research
ISSN: 2351-8014
 
 


Degraded Document Image Binarization Using Optical Character Recognition


Volume 22, Issue 2, April 2016, Pages 304–311


M. Manimaraboopathy, M. Anto Bennet, M. Kalpana, S. Premalatha, and G. Gayathri

Original language: English

Received 31 March 2016

Copyright © 2016 ISSR Journals. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract


This paper proposes an OCR-based algorithm for retrieving text from scanned document images. Text detection is based on two machine learning classifiers: one generates candidate word regions and the other filters out non-text regions. Connected components (CCs) are extracted from the image using the maximally stable extremal regions (MSER) algorithm. During CC clustering, AdaBoost classifiers determine whether a region contains text. A binarization method then converts the gray image into a binary image. The binarization outcomes are passed to OCR, and the result is evaluated with respect to character and word accuracy. As more and more text documents are scanned, retrieval needs to be fast and accurate. Additional performance metrics are the percentage rates of broken and missed text, false alarms, background noise, character enlargement, and merging. The effectiveness of the proposed method is also confirmed by tests carried out on realistic document images. The proposed algorithm is implemented in MATLAB version 13.
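
The binarization and evaluation steps mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' MATLAB implementation. The abstract does not name the binarization method, so Otsu's global threshold is used here purely as an assumption, and character and word accuracy are computed with a common edit-distance definition, (n - errors) / n, which is also assumed. All function names are hypothetical.

```python
from typing import List, Sequence


def otsu_threshold(gray: Sequence[Sequence[int]]) -> int:
    """Return the 8-bit level that maximizes between-class variance (Otsu).

    Assumption: Otsu stands in for the unspecified binarization method.
    """
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]          # pixels with value <= t (dark class)
        sum0 += t * hist[t]
        w1 = total - w0        # pixels with value > t (light class)
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(gray: Sequence[Sequence[int]]) -> List[List[int]]:
    """Map a gray image to 0 (dark, assumed text) and 1 (light, assumed background)."""
    t = otsu_threshold(gray)
    return [[1 if value > t else 0 for value in row] for row in gray]


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (characters or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        prev = cur
    return prev[-1]


def accuracy(ref: Sequence, hyp: Sequence) -> float:
    """Assumed definition: (n - edit errors) / n, clipped at zero."""
    if not ref:
        raise ValueError("reference must not be empty")
    return max(0.0, (len(ref) - edit_distance(ref, hyp)) / len(ref))


def character_accuracy(ref_text: str, ocr_text: str) -> float:
    return accuracy(ref_text, ocr_text)


def word_accuracy(ref_text: str, ocr_text: str) -> float:
    return accuracy(ref_text.split(), ocr_text.split())
```

For example, calling `binarize` on a small gray patch such as `[[20, 30, 200], [25, 210, 220]]` separates the three dark pixels from the three light ones, and `word_accuracy("degraded document image", "degradcd document imagc")` counts two of the three words as errors. A locally adaptive threshold would usually suit degraded documents better than a single global one; the global version is shown only because it is short.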

Author Keywords: Maximally Stable Extremal Regions (MSER), optical character recognition (OCR).


How to Cite this Article


M. Manimaraboopathy, M. Anto Bennet, M. Kalpana, S. Premalatha, and G. Gayathri, “Degraded Document Image Binarization Using Optical Character Recognition,” International Journal of Innovation and Scientific Research, vol. 22, no. 2, pp. 304–311, April 2016.