Table 10 Text recognition from images

From: An analytical study of information extraction from unstructured and multidimensional big data

| Ref. | Purpose | Technique | Dataset | Results | Limitations/benefits |
| --- | --- | --- | --- | --- | --- |
| [51] | To possess high learning capacity; to handle high-dimensional data | CNN-based OCR | Scanned Sanskrit document images (11,230) | Proposed approach outperformed existing approaches; accuracy was 93.32% | Training time was 1 h with GPU |
| [52] | To automatically recognize handwritten text from images | CNN-based OCR | MNIST | 98.11% accuracy rate | DL should be applied to large datasets |
| [53] | To compare the results of proposed DBN and CNN ECR | Unsupervised feature learning with DBN | HACDB dataset containing 6600 images | Experiments showed 3.64% and 14.71% for DBN and CNN, respectively | DBN with unsupervised feature learning outperforms CNN for high-dimensional data |
| [54] | To develop an end-to-end mechanism for scene TR | FANet using ResNet as encoder and seq2seq attention mechanism as decoder | Authentic seal dataset (5000); real-time train ticket dataset (3660) | The proposed approach did not outperform others, but angular and horizontal TR improved | Full attention mechanism proposed to replace the detect, slice, and recognize process with end-to-end recognition. Ineffective for long text recognition |
| [55] | To recognize text from handwritten and printed text images | TMIXT: Tesseract for machine-printed text recognition and LSTM for handwritten text recognition | IAM handwriting database | Achieved 80% average transcription accuracy | Heavy preprocessing is required for combined text recognition with the proposed solution |
| [56] | To recognize text using an attention mechanism | CAN (Convolutional Attention Network): 2D CNN as encoder and one-dimensional CNN as decoder | Street View Text (SVT), IIIT5K, ICDAR 03, and ICDAR 13 datasets | The proposed model performed better than others on the SVT and ICDAR 03 datasets | The proposed method needs improvement to give promising results |
| [57] | Semantic-based text recognition to extract useful information from images | CNN and bidirectional LSTM, where the convolutional part uses VGG and the recurrent part uses bidirectional LSTM | Interior Design Dataset with 7708 images | Achieved 90% accuracy in word recognition | Generality improved, but text recognition from protest images is a relatively easy task. Evaluation of the system with complex and diverse datasets should be promising |
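
Several rows above report word recognition or transcription accuracy. As a rough illustration of how such figures are commonly computed, the following is a minimal sketch, assuming exact-match word accuracy and a character error rate based on Levenshtein distance. The function names and the sample words are hypothetical, and the cited studies may define their metrics differently.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def word_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of recognized words that exactly match the ground truth."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def char_error_rate(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Total character edits divided by total ground-truth characters."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    total_edits = sum(edit_distance(p, r) for p, r in zip(predictions, references))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars if total_chars else 0.0


# Hypothetical recognizer output versus ground truth (illustrative only).
predicted = ["seal", "ticket", "tex"]
truth = ["seal", "ticket", "text"]
print(f"word accuracy: {word_accuracy(predicted, truth):.3f}")
print(f"character error rate: {char_error_rate(predicted, truth):.3f}")
```

Word accuracy penalizes a whole word for a single wrong character, whereas character error rate gives partial credit, so the two can diverge noticeably on long words.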