Method | Approach | Dataset | Key contributions | Accuracy |
---|---|---|---|---|
Kim et al. [43] | QA for legal bar exams | 247 questions | Hybrid method combining simple rules and unsupervised learning using deep linguistic features | 61.13% |
Taniguchi et al. [83] | Legal yes/no QA system | COLIEE 2016 [45] | Case-role analysis and an alignment-based approach to matching questions against statutes | Shared first place in Phase Two and achieved third place in Phase Three
Sovrano et al. [79] | Extracting and making sense of complex information in legal documents | PIL dataset (Sovrano et al. [80]) | KG extraction, taxonomy construction, legal ontology design pattern alignment, and KG question answering | Top-5 recall of 34.91%
McElvain et al. [60] | Non-factoid QA for legal domain | Large corpus of question-answer pairs | Machine-learning pipeline combining gazetteer lookup taggers, statistical taggers, and word embedding models | 90% Answered at 3 for correct answers, 1.5% Answered at 3 for incorrect answers
Taniguchi et al. [84] | Legal QA system using FrameNet | COLIEE 2018 [45] | Semantic database based on FrameNet and predicate-argument structure analyzer | 67% average accuracy |
McElvain et al. [60] | Legal QA system using pre-trained models | 22M documents classified into over 120K legal topics | Use of pre-trained models fine-tuned on a legal dataset | 90% Answered at 3 for correct answers, 1.5% Answered at 3 for incorrect answers
Kim et al. [48] | Legal QA using CNN | COLIEE [45] | Exploiting legal information retrieval and textual entailment using CNN | 63.87% |
Kim et al. [48] | Legal information retrieval and QA | Japanese civil law articles and legal bar exams | Combination of a tf-idf ranking and an SVM re-ranking model, with lemmatization and dependency parsing | 62.14% on the dry-run dataset; 55.71% and 55.79% for Phases 2 and 3, respectively
Duong et al. [21] | Vietnamese QA system | Vietnam’s legal documents | Use of Vietnamese resources and tools, a similarity-based model, and a combination of rule-based and machine learning methods | 70% precision
Kim et al. [50] | Textual entailment for legal question answering | COLIEE 2014 training data for training, COLIEE 2015 test data for validation | Legal QA system based on Siamese CNNs: stop-word removal and stemming for preprocessing, a three-layer CNN with max pooling to extract word features, and dropout to prevent overfitting | 64.25%
Martinez-Gil et al. [59] | Analyzing co-occurrence patterns in unstructured text corpora | Legal questions randomly selected from Oxford University Press books | A new method for automatically answering multiple-choice questions in the legal domain; reduces workload for legal professionals and can be extrapolated to other specific domains | 65%
Hoshino et al. [37] | Predicate-argument structure analysis | COLIEE 2018 [45] | Created a legal term dictionary, a synonym dictionary for predicates, a person-estimation feature, and four types of question answering modules | 70%
Alotaibi et al. [6] | Combination of retrieval-based and generative-based techniques, incorporating prior knowledge sources such as previous questions, question categories, and Islamic jurisprudential reference books | Custom dataset | Reduced workload on human experts by providing relevant, high-quality answers to aid Muslims’ daily-life decisions | 0.60 precision, 0.40 recall, 0.48 F1, 0.037 METEOR
Hoppe et al. [36] | Intelligent legal advisor | German legal documents | Semantic document retrieval and QA using state-of-the-art technologies in NLP, semantic search, and knowledge engineering | 0.84 recall, 0.73 MAP
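Several of the systems above (e.g. the tf-idf ranking stage in Kim et al. [48] and the similarity-based model in Duong et al. [21]) share a common first step: score candidate law articles against a question with tf-idf weighted cosine similarity, then pass the top candidates to a heavier re-ranker. The sketch below illustrates only that shared retrieval pattern on a toy corpus — it is not any cited system's actual implementation, and the smoothed idf formula and example articles are illustrative assumptions.

```python
import math
from collections import Counter

def tfidf(tokens, doc_freq, n_docs):
    """Smoothed tf-idf weights for one token list (sklearn-style smoothing)."""
    tf = Counter(tokens)
    return {
        term: (count / len(tokens))
        * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
        for term, count in tf.items()
    }

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_articles(question, articles):
    """Return article indices sorted by similarity to the question."""
    docs = [a.lower().split() for a in articles]
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    n = len(docs)
    vecs = [tfidf(tokens, doc_freq, n) for tokens in docs]
    q_vec = tfidf(question.lower().split(), doc_freq, n)
    scores = [cosine(q_vec, v) for v in vecs]
    return sorted(range(n), key=lambda i: scores[i], reverse=True)

# Toy example: three invented "articles" and one question.
articles = [
    "a contract of sale requires offer acceptance and consideration",
    "a lease transfers possession of property for a fixed term",
    "negligence requires duty breach causation and damages",
]
question = "what elements does a contract of sale require"
ranking = rank_articles(question, articles)
print(ranking)  # the contract article (index 0) ranks first
```

In the surveyed systems this ranking is only the recall stage; an SVM or neural re-ranker then re-scores the top candidates using richer features such as lemmas and dependency relations.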