Please use this identifier to cite or link to this item:
http://dx.doi.org/10.25673/86231
Title: | Legal norm retrieval with variations of the BERT model combined with TF-IDF vectorization |
Author(s): | Wehnert, Sabine; Sudhi, Viju; Dureja, Shipra; Kutty, Libin; Shahania, Saijal; De Luca, Ernesto William |
Issue Date: | 2021 |
Type: | Conference object |
Language: | English |
URN: | urn:nbn:de:gbv:ma9:1-1981185920-881832 |
Subjects: | Applied computing; Law; Information systems; Document representation; Computing methodologies |
Abstract: | In this work, we examine variations of the BERT model on the statute law retrieval task of the COLIEE competition. This includes approaches to leverage BERT’s contextual word embeddings, fine-tuning the model, combining it with TF-IDF vectorization, adding external knowledge to the statutes, and data augmentation. Our ensemble of Sentence-BERT with two different TF-IDF representations and document enrichment exhibits the best performance on this task regarding the F2 score. This is followed by a fine-tuned LEGAL-BERT with TF-IDF and data augmentation, and by our third approach, which uses BERTScore. As a result, we show that there are significant differences between the chosen BERT approaches and discuss several design decisions in the context of statute law retrieval. |
URI: | https://opendata.uni-halle.de//handle/1981185920/88183 http://dx.doi.org/10.25673/86231 |
Open Access: | Open access publication |
License: | (CC BY-SA 4.0) Creative Commons Attribution ShareAlike 4.0 |
Sponsor/Funder: | Transformative agreement (Transformationsvertrag) |
Appears in Collections: | Fakultät für Informatik (OA) |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Wehnert et al._Legal norm_2021.pdf | Secondary publication | 671.85 kB | Adobe PDF |