The proposed OCR algorithm to retrieve the text in the scanned document images. Here the text detection algorithm based on two machine learning classifiers: one allows generating candidate word regions and the other filters out non-text ones. The extract connected components (CCs) in images by using the maximally stable extremal region algorithm. In CC clustering adaboost classifiers are used to determine whether the region contains text or not. Then using binarization method, the gray image is converted into binary image. The binarization outcomes are subject to OCR and the corresponding result is evaluated with respect to character and word accuracy. As more and more text documents are scanned fast and accurate. Additional performance metrics of the percentage rates of broken and missed text, false alarms, background noise, character enlargement and merging. This effectiveness of the proposed method is also confirmed by tests carried on realistic document images. For proposed algorithm MATLAB version 13 software is used.