Ref. | KG Specific Functionality | Knowledge Extraction Techniques (entity-level; relation-level) | Type of KB | KG Resource(s) | KG Stats | Evaluation Measure(s) | Shortcoming(s)
---|---|---|---|---|---|---|---
[100] | Generic biomedicine | Manual integration and mapping of entities and relationships | Schema-based | OMIM, DrugBank, PharmGKB, Therapeutic Target Database, SIDER, and HumanNet | #n: 7,603 #e: 500,958 | Hits@N and downstream tasks | • The quality and integrity of the metadata cannot be fully assured. • The final graph has fewer entities than state-of-the-art KGs. • The adopted ontology is not discussed.
[101] | Generic biomedicine | Entity-level: PubTator (footnote 36) and manual annotation (EBC); relation-level: Stanford Dependency Parser (footnote 37) | Schema-free | Biomedical literature (Medline abstracts, footnote 38) | #n: N/A #e: 2,236,307 | Benchmark comparison | • Relies heavily on the co-occurrence of paths to map scarcer paths to themes. • Does not handle complex relations. • Susceptible to parser errors.
[102] | Translational biomedicine | Manual and automatic, using Snakemake (footnote 39) | Schema-based | 70 knowledge sources, including SemMedDB, ChEMBL, etc. | #n: 6.4M #e: 39.3M | Benchmark comparison | • The automated process used to construct the KG is not detailed. • The comparison with other KGs is neither well discussed nor formulated.
[103] | Biomedical causal discovery | Manual and rule-based approach | Schema-free | PubMed | #n: N/A #e: N/A | Accuracy | • Does not extract implicit causality. • The process for identifying concepts and the relationships between them is not detailed.
[82] | Marine Chinese medicine | Manual mapping between the ontology and the KG | Schema-based | Medical literature | #n: N/A #e: N/A | N/A | • The construction and evaluation of the proposed KG are inadequately described.
[104] | Generic biomedicine | Entity-level: BioDBLinker; relation-level: automatic mapping | Schema-free | UniProt (footnote 40), REACTOME (footnote 41), KEGG (footnote 42), DrugBank, SIDER, and Human Protein Atlas (HPA) (footnote 43) | #n: N/A #e: N/A | Benchmark comparison | • Suffers from data sparsity. • Risk of train-test data leakage if used without careful review.
[105] | Intestinal cells | Manual, based on the conceptual model | Schema-based | PubMed | #n: 2,443 #e: 160,253 | Case study | • Poor entity and relation extraction approaches. • The data source is static and limited to medical literature, although facts about intestinal cells could also be obtained from future experiments.
[112] | Microbiology | NER and NLP techniques | Schema-based | KG Hub – COVID19 (footnote 44) | #n: 266,000 #e: 432,000 | N/A | • Little discussion of the mechanisms used to construct and validate the KG.
[113] | Gut microbiota | Manual annotation and mapping | Schema-based | Google Scholar, PubMed, UMLS, MeSH, SNOMED CT, and KEGG | #f: 31,268,998 | Case studies | • Poor extraction of entities and relations. • The correctness and completeness of the extracted relations limit the precision and reliability of semantic search.
[114] | Microbe-disease associations | Kindred entity and relation classifier (footnote 45) | Schema-free | Wikidata, UMLS, NCBI | #n: 9,832 #e: 21,905 | Hits@N | • The KG could be expanded using a bacterial attribute mining tool. • Lacks a discussion of interactions between bacteria and antibiotics or viruses.
[115] | Coronavirus | Manual extraction and mapping | Schema-free | Analytical Graph (AG) and CORD-19 (footnote 46) | #n: 588,820 #e: N/A | Case study | • Limited data sources. • Static KG.
[116] | Coronavirus | BioBERT | Schema-free | PubMed and CORD-19 | #n: N/A #e: N/A | P, R, and F1-score | • The KG could be expanded to other biomedical datasets. • Other biomedical NLP models for NER, e.g., BlueBERT, could be tried to verify the validity of the extracted knowledge.
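
Several rows above report Hits@N as the evaluation measure without defining it. As a minimal sketch, assuming the usual link-prediction reading (the fraction of test triples whose correct entity is ranked within the top N candidates), it could be computed as follows. The function name `hits_at_n` and the example ranks are hypothetical and are not taken from any of the cited works.

```python
from typing import Sequence


def hits_at_n(ranks: Sequence[int], n: int) -> float:
    """Return the fraction of 1-based ranks that fall within the top n.

    Each rank is the position of the correct entity in a model's sorted
    candidate list for one test triple (assumed convention: 1 = best).
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= n) / len(ranks)


# Hypothetical ranks of the correct tail entity for five test triples.
example_ranks = [1, 3, 12, 2, 7]

hits_at_1 = hits_at_n(example_ranks, 1)    # only rank 1 counts
hits_at_10 = hits_at_n(example_ranks, 10)  # every rank except 12 counts
```

Because the measure depends only on rank positions, values from different works in the table are comparable only when the candidate set and any filtering of known triples are the same.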