Table 5 Relation extraction

From: An analytical study of information extraction from unstructured and multidimensional big data

| Ref. | Technique | Purpose | Domain | Dataset | Results (P%, R%, F% where reported) |
|---|---|---|---|---|---|
| [17] | CRF | To generate a relationship knowledge base and annotation; lexical, POS and semantic features used | Chinese encyclopedia | 52,975 web pages | Model trained for 9 attributes; global training was more accurate than local training, but recall was low |
| [18] | Knowledge-oriented CNN with clustering using word filters (WordNet) | To overcome the limitations of RBM and LBM and to reduce dimensionality | Text | 3 datasets: SemEval-2010 Task 8 (10,717 annotated samples), Causal-TimeBank, Event StoryLine | Max clustering: macro-averaged F1 of 91.34, 76.21 and 81.84% on SemEval, Causal-TB and Event-SL respectively; average clustering: F1 of 91.20, 75.43 and 81.96% respectively |
| [19] | Pattern-based method to build an information network | To extract large-scale treatment drug-disease pairs and inducement drug-disease pairs | Medical literature for drug repurposing | 27M abstracts and titles from PubMed | High precision but low recall |
| [20] | Weakly supervised method without manual annotation; SVM to train the model | To reduce the manual annotation effort and expand the relation types using semantic and syntactic features | News text | Baidu encyclopedia, 50,000 entry pages (10 GB) | P 83.61, R 82.63, F 83.12; entity ambiguity and poor universality were found to affect the results |
| [21] | Multi-class SVM and syntactic model development | To detect semantic relations; the architecture has a preprocessing phase that builds feature vectors from lexical, semantic and syntactic features, a training phase and an RE phase | News text | ReACE | P 80.18, R 70.89, F 75.25 |
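
The P%, R% and F% columns can be cross-checked with a short calculation. The sketch below assumes that F% is the balanced F-score, i.e. the harmonic mean of precision and recall; the table itself does not state which F variant was used, and the function name `f_score` is only illustrative. Under that assumption, the values reported for [20] and [21] are reproduced to two decimal places.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (assumed definition of F%).

    Inputs and output are in the same units (here, percentages).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P%, R%, reported F%) copied from the rows for [20] and [21].
reported = {
    "[20]": (83.61, 82.63, 83.12),
    "[21]": (80.18, 70.89, 75.25),
}

for ref, (p, r, f_reported) in reported.items():
    f_computed = round(f_score(p, r), 2)
    print(ref, "computed:", f_computed, "reported:", f_reported)
```

The same check cannot be run for [17] to [19], since those rows report only qualitative results or F1 without the underlying precision and recall.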