Please use this identifier to cite or link to this item:
http://dx.doi.org/10.25673/86235
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Heumüller, Robert | - |
dc.contributor.author | Nielebock, Sebastian | - |
dc.contributor.author | Ortmeier, Frank | - |
dc.date.accessioned | 2022-06-16T11:51:11Z | - |
dc.date.available | 2022-06-16T11:51:11Z | - |
dc.date.issued | 2021 | - |
dc.date.submitted | 2021 | - |
dc.identifier.uri | https://opendata.uni-halle.de//handle/1981185920/88187 | - |
dc.identifier.uri | http://dx.doi.org/10.25673/86235 | - |
dc.description.abstract | Modern code review (MCR) processes are prevalent in most organizations that develop software due to benefits in quality assurance and knowledge transfer. With the rise of collaborative software development platforms like GitHub and Bitbucket, today, millions of projects share not only their code but also their review data. Although researchers have tried to exploit this data for more than a decade, most of that knowledge remains a buried treasure. A crucial catalyst for many advances in deep learning, however, is the accessibility of large-scale standard datasets for different learning tasks. This paper presents the ETCR (Exploit Those Code Reviews!) infrastructure for mining MCR datasets from any GitHub project practicing pull-request-based development. We demonstrate its effectiveness with ETCR-Elasticsearch, a dataset of >231k review comments for >47k Java file revisions in >40k pull-requests from the Elasticsearch project. ETCR is designed with the challenge of deep learning in mind. Compared to previous datasets, ETCR datasets include all information for linking review comments to nodes in the respective program’s Abstract Syntax Tree. | eng |
dc.description.sponsorship | Transformationsvertrag (transformative agreement) | - |
dc.language.iso | eng | - |
dc.relation.ispartof | 10.1145/3468264 | - |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | - |
dc.subject | Computing methodologies | eng |
dc.subject | General and reference | eng |
dc.subject | Experimentation | eng |
dc.subject | Software and its engineering | eng |
dc.subject | Collaboration in software development. | - |
dc.subject.ddc | 000 | - |
dc.title | Exploit those code reviews! : bigger data for deeper learning | eng |
dc.type | Conference Object | - |
dc.identifier.urn | urn:nbn:de:gbv:ma9:1-1981185920-881876 | - |
local.versionType | publishedVersion | - |
local.openaccess | true | - |
dc.identifier.ppn | 1775624145 | - |
local.bibliographicCitation.year | 2021 | - |
cbs.sru.importDate | 2022-06-16T11:46:56Z | - |
local.bibliographicCitation | Contained in: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering - New York, NY, United States : Association for Computing Machinery, 2021 | - |
local.accessrights.dnb | free | - |
Appears in Collections: | Fakultät für Informatik (OA) |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Heumueller et al._Exploit those_2021.pdf | Secondary publication | 693.02 kB | Adobe PDF | View/Open |