
Table 7 Event extraction and salient fact extraction

From: An analytical study of information extraction from unstructured and multidimensional big data

Task

Approach

Dataset

Results

Remarks

Term context understanding to deal with homonyms [31]

Semi-automated approach combining automated content analysis and ANN

26,259 research articles from Web of Science

The proposed solution was evaluated with different sparsity parameters. The results showed that different modeling terms affect the error rate differently

The proposed solution performed better with manual classification for some instances that could not be classified automatically. Hence, improvement is required to automate the sifting step of homonym context identification

IE from heterogeneous unstructured big data [32]

Unsupervised deep learning (multiple Kernel)

13 different datasets from UCI Machine Learning Repository

The proposed system was faster than competing approaches and matched them in accuracy

Accuracy on heterogeneous data can be improved with unsupervised learning, but the approach needs further development to handle the dynamic nature of such data

Deep semantic IE for big data mining from geoscience data [33]

Convolutional neural networks (CNN) for classification and TF-IDF for word statistics

Multivariate and heterogeneous data of 16,098 PDN, 130 LAN’s

Classification accuracy of 99.9% and 99.8% at the sentence and paragraph levels, respectively

Insufficient comprehensiveness, poor correlation and inconsistent formats are problems of heterogeneous data

Open domain event extraction [34]

Schema discovery based on probabilistic generative models, i.e., LinkLDA

Set of events generated and extracted from Twitter

Unlike related work, the proposed approach can handle complex queries and structured data browsing

The sparsity of unstructured big data can reduce the performance and scalability of a solution, so these are important factors when investigating the effectiveness of an approach

Biomedical event extraction [35]

Syntactic and semantic features to identify event triggers + Phrase Structure Tree

BioNLP-ST 2013

In evaluation, the solution achieved 52.23% precision, 26.38% recall, and a 35.06% F1-score

The proposed approach uses ML features and therefore inherits the limitations of ML feature-based techniques
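
Note on the F1-score in the biomedical event extraction row [35]: the F1-score is the harmonic mean of precision and recall, so it sits closer to the lower of the two values. The following is a minimal sketch for checking the reported figures, not code from the cited work; the function name `f1_score` is a hypothetical helper introduced here for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale, e.g. percent)."""
    if precision + recall == 0:
        # Convention assumed here: F1 is 0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the table for BioNLP-ST 2013.
f1 = f1_score(52.23, 26.38)
print(f"F1 = {f1:.2f}%")
```

With the rounded inputs from the table this gives approximately 35.05%, which agrees with the reported 35.06% up to rounding of the published precision and recall. The low recall (26.38%) dominates the score, which is why the F1-score falls well below the precision.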