Please use this identifier to cite or link to this item: http://dx.doi.org/10.25673/86231
Title: Legal norm retrieval with variations of the BERT model combined with TF-IDF vectorization
Author(s): Wehnert, Sabine
Sudhi, Viju
Dureja, Shipra
Kutty, Libin
Shahania, Saijal
De Luca, Ernesto William
Issue Date: 2021
Type: Conference object
Language: English
URN: urn:nbn:de:gbv:ma9:1-1981185920-881832
Subjects: Applied computing
Law
Information systems
Document representation
Computing methodologies
Abstract: In this work, we examine variations of the BERT model on the statute law retrieval task of the COLIEE competition. This includes approaches to leverage BERT’s contextual word embeddings, fine-tuning the model, combining it with TF-IDF vectorization, adding external knowledge to the statutes, and data augmentation. Our ensemble of Sentence-BERT with two different TF-IDF representations and document enrichment exhibits the best performance on this task regarding the F2 score. This is followed by a fine-tuned LEGAL-BERT with TF-IDF and data augmentation and our third approach with the BERTScore. As a result, we show that there are significant differences between the chosen BERT approaches and discuss several design decisions in the context of statute law retrieval.
URI: https://opendata.uni-halle.de//handle/1981185920/88183
http://dx.doi.org/10.25673/86231
Open Access: Open access publication
License: (CC BY-SA 4.0) Creative Commons Attribution ShareAlike 4.0
Sponsor/Funder: Transformationsvertrag (transformative agreement)
Appears in Collections: Fakultät für Informatik (OA)

Files in This Item:
File: Wehnert et al._Legal norm_2021.pdf
Description: Zweitveröffentlichung (secondary publication)
Size: 671.85 kB
Format: Adobe PDF