Table 4 Comparison of legal question answering methods in terms of approach, dataset, key contributions, and accuracy

From: Exploring the state of the art in legal QA systems

| Method | Approach | Dataset | Key contributions | Accuracy |
|---|---|---|---|---|
| Kim et al. [43] | QA for legal bar exams | 247 questions | Hybrid method combining simple rules and unsupervised learning using deep linguistic features | 61.13% |
| Taniguchi et al. [83] | Legal yes/no QA system | COLIEE 2016 [45] | Case-role analysis with an alignment-based approach | Shared first place in Phase Two; third place in Phase Three |
| Sovrano et al. [79] | Extracting and making sense of complex information in legal documents | PIL Sovrano et al. [80] | KG extraction, taxonomy construction, legal ontology design pattern alignment, and KG question answering | Top-5 recall of 34.91% |
| McElvain et al. [60] | Non-factoid QA for the legal domain | Large corpus of question–answer pairs | Combines machine learning algorithms, gazetteer lookup taggers, statistical taggers, and word embedding models | 90% Answered@3 for correct answers; 1.5% Answered@3 for incorrect answers |
| Taniguchi et al. [84] | Legal QA system using FrameNet | COLIEE 2018 [45] | Semantic database based on FrameNet and a predicate–argument structure analyzer | 67% average accuracy |
| McElvain et al. [60] | Legal QA system using pre-trained models | 22M documents classified into over 120K legal topics | Use of pre-trained models and fine-tuning on a legal dataset | 90% Answered@3 for correct answers; 1.5% Answered@3 for incorrect answers |
| Kim et al. [48] | Legal QA using a CNN | COLIEE [45] | Exploits legal information retrieval and textual entailment using a CNN | 63.87% |
| Kim et al. [48] | Legal information retrieval and QA | Japanese civil law articles and legal bar exams | Combination of a tf-idf and SVM re-ranking model with lemmatization and dependency parsing | 62.14% on the dry-run dataset; 55.71% and 55.79% for Phases 2 and 3, respectively |
| Duong et al. [21] | Vietnamese QA system | Vietnam's legal documents | Uses Vietnamese resources and tools, a similarity-based model, and a combination of rule-based and machine learning methods | 70% precision |
| Kim et al. [50] | Textual entailment for legal question answering | COLIEE 2014 training data for training; COLIEE 2015 test data for validation | Legal QA system using Siamese CNNs; preprocessing by stop-word removal and stemming; a three-layer CNN extracts word features followed by max pooling; dropout prevents overfitting | 64.25% |
| Martinez-Gil et al. [59] | Analyzing co-occurrence patterns in unstructured text corpora | Legal questions randomly selected from Oxford University Press books | New method for automatically answering multiple-choice questions in the legal domain; reduces workload for legal professionals; can be extrapolated to other specialized domains | 65% |
| Hoshino et al. [37] | Predicate–argument structure analysis | COLIEE 2018 [45] | Legal term dictionary, synonym dictionary for predicates, person-estimation feature, and four types of question answering modules | 70% |
| Alotaibi et al. [6] | Combination of retrieval-based and generative-based techniques incorporating prior knowledge sources (previous questions, question categories, and Islamic jurisprudential reference books) | Custom dataset | Reduces the workload of human experts by providing relevant, high-quality answers to support Muslims' daily-life decisions | 0.60 precision, 0.40 recall, 0.48 F1, 0.037 METEOR |
| Hoppe et al. [36] | Intelligent legal advisor | German legal documents | Semantic document retrieval and QA using state-of-the-art NLP, semantic search, and knowledge engineering | 0.84 recall, 0.73 MAP |
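
Several entries above (e.g. the tf-idf and SVM re-ranking model, and the Top-5 recall and MAP figures) share the same underlying building block: ranking candidate legal articles by tf-idf similarity to the question. The following is a minimal, self-contained sketch of that retrieval step only; the toy articles, the query, and all function names are illustrative assumptions, not taken from any of the cited systems.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse tf-idf vectors (dicts) for a list of tokenized documents."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))      # document frequency
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}       # smoothed idf
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs], idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def top_k(query_tokens, docs, k=2):
    """Return indices of the k documents most similar to the query."""
    vecs, idf = tfidf_vectors(docs)
    qtf = Counter(query_tokens)
    qv = {t: qtf[t] * idf.get(t, 0.0) for t in qtf}        # unseen terms get 0
    ranked = sorted(range(len(docs)), key=lambda i: cosine(qv, vecs[i]), reverse=True)
    return ranked[:k]

# Toy pre-tokenized "articles" and a question (hypothetical examples)
articles = [
    "a contract requires offer acceptance and consideration".split(),
    "a tort is a civil wrong causing harm".split(),
    "property law governs ownership of land".split(),
]
query = "what makes a contract valid offer acceptance".split()
print(top_k(query, articles, k=2))  # → [0, 1]: the contract article ranks first
```

In the systems in the table, a ranked list like this is typically the input to a second stage (an SVM re-ranker or an entailment classifier), and metrics such as Top-5 recall or MAP are computed over the ranked indices.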