
Table 5 Relation extraction

From: An analytical study of information extraction from unstructured and multidimensional big data

 

| Ref. | Technique | Purpose | Domain | Dataset | Results | P% | R% | F% |
|---|---|---|---|---|---|---|---|---|
| [17] | CRF | To generate a relationship knowledge base and annotation; lexical, POS and semantic features used | Chinese encyclopedia | 52,975 web pages | Model trained for 9 attributes; global training was more accurate than local training, but recall was low | | | |
| [18] | Knowledge-oriented CNN with clustering using word filters (WordNet) | To overcome the limitations of RBM and LBM and to reduce the dimensionality | Text | 3 datasets: SemEval-2010 Task 8 (10,717 annotated samples), Causal-TimeBank (Causal-TB) and Event StoryLine (Event-SL) | With max clustering, macro-averaged F1 of 91.34%, 76.21% and 81.84% on SemEval, Causal-TB and Event-SL, respectively; with average clustering, 91.20%, 75.43% and 81.96% | | | |
| [19] | Pattern-based method to build an information network | To extract large-scale treatment drug-disease pairs and inducement drug-disease pairs | Medical literature for drug repurposing | 27M abstracts and titles from PubMed | The algorithm showed high precision but low recall | | | |
| [20] | Weakly supervised method without manual annotation, with an SVM to train the model | To reduce the manual annotation effort and expand the relation types using semantic and syntactic features | News text | Baidu encyclopedia, 50,000 entry pages (10 GB) | Entity ambiguity and poor universality affected the results | 83.61 | 82.63 | 83.12 |
| [21] | Multi-class SVM and syntactic model development | To detect semantic relations; the architecture has a preprocessing phase that builds a feature vector from lexical, semantic and syntactic features, a training phase and an RE phase | News text | ReACE | | 80.18 | 70.89 | 75.25 |
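
The P%, R% and F% values reported for [20] and [21] can be cross-checked with a short calculation. The sketch below assumes that F% is the balanced F-measure (F1), the harmonic mean of precision and recall; the table does not state this explicitly, and the function name `f_measure` is illustrative only.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the table's F% column is F1 (equal weight on P and R).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P%, R%, F%) triples as reported in the table for [20] and [21].
reported = {
    "[20]": (83.61, 82.63, 83.12),
    "[21]": (80.18, 70.89, 75.25),
}

for ref, (p, r, f_reported) in reported.items():
    f_computed = round(f_measure(p, r), 2)
    print(ref, "computed F:", f_computed, "reported F:", f_reported)
```

Under that assumption, the computed values agree with the reported F% for both rows to two decimal places.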