Table 10 Text recognition from images

From: An analytical study of information extraction from unstructured and multidimensional big data

Reference

Purpose

Technique

Dataset

Results

Limitations/benefits

[51]

To achieve high learning capacity and handle high-dimensional data

CNN-based OCR

Scanned Sanskrit document images (11,230)

The proposed approach outperformed existing approaches, with an accuracy of 93.32%

Training time was 1 h on a GPU

[52]

To automatically recognize handwritten text from images

CNN-based OCR

MNIST

98.11% accuracy rate

Deep learning (DL) should be applied to large datasets

[53]

To compare the ECR of the proposed DBN with that of a CNN

Unsupervised feature learning with DBN

HACDB dataset containing 6600 images

Experiments showed ECRs of 3.64% and 14.71% for DBN and CNN, respectively

DBN with unsupervised feature learning outperforms CNN on high-dimensional data

[54]

To develop an end-to-end mechanism for scene text recognition (TR)

FANet, using ResNet as the encoder and a seq2seq attention mechanism as the decoder

Authentic seal dataset (5000) and real-time train ticket dataset (3660)

The proposed approach did not outperform existing methods, but angular and horizontal TR improved

A full attention mechanism was proposed to replace the detect, slice, and recognize process with end-to-end recognition

Ineffective for long text recognition

[55]

To recognize text from handwritten and printed text images

TMIXT: Tesseract for machine-printed text recognition and LSTM for handwritten text recognition

IAM handwriting database

Achieved 80% average transcription accuracy

Heavy preprocessing is required for combined text recognition with the proposed solution

[56]

To recognize text using attention mechanism

CAN (Convolutional Attention Network), with a 2D CNN as the encoder and a 1D CNN as the decoder

Street View Text (SVT), IIIT5K, ICDAR 03, and ICDAR 13 datasets

The proposed model performed better than others on SVT and ICDAR 03 datasets

The proposed method requires further improvement to achieve promising results

[57]

To perform semantic-based text recognition for extracting useful information from images

CNN with bidirectional LSTM, where the convolutional part uses VGG and the recurrent part uses a bidirectional LSTM

Interior Design Dataset with 7708 images

Achieved 90% accuracy in word recognition

Generality improved, but text recognition from protest images is a relatively easy task. Evaluating the system on complex and diverse datasets is a promising next step
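
The Results column mixes two kinds of figures: accuracy (for example 93.32% and 98.11%) and the ECR reported for [53]. As a minimal sketch, assuming ECR denotes the percentage of misclassified samples (so that ECR = 100% minus accuracy), the two can be computed from predicted and true labels as shown here. The function names and toy labels are hypothetical and are not taken from any of the cited works.

```python
def accuracy(predicted, expected):
    """Percentage of predictions that exactly match the expected labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("at least one sample is required")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return 100.0 * correct / len(expected)


def error_rate(predicted, expected):
    """Percentage of misclassified samples (the assumed meaning of ECR)."""
    return 100.0 - accuracy(predicted, expected)


# Toy, made-up labels for illustration only.
expected = ["a", "b", "c", "d"]
predicted = ["a", "b", "x", "d"]
print(accuracy(predicted, expected))    # 75.0
print(error_rate(predicted, expected))  # 25.0
```

Under this assumption, the 3.64% reported for the DBN in [53] would correspond to 96.36% accuracy, against 85.29% for the CNN, which puts that row on the same scale as the accuracy figures in the other rows.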