Assessing the difficulty of annotating medical data in crowdworking with help of experiments

Rother, Anne; Niemann, Uli; Hielscher, Tommy; Völzke, Henry; Ittermann, Till; Spiliopoulou, Myra

Please use this identifier to cite or link to this item: http://dx.doi.org/10.25673/78166

Full metadata record

DC Field	Value	Language
dc.contributor.author	Rother, Anne	-
dc.contributor.author	Niemann, Uli	-
dc.contributor.author	Hielscher, Tommy	-
dc.contributor.author	Völzke, Henry	-
dc.contributor.author	Ittermann, Till	-
dc.contributor.author	Spiliopoulou, Myra	-
dc.date.accessioned	2022-03-22T13:54:35Z	-
dc.date.available	2022-03-22T13:54:35Z	-
dc.date.issued	2021	-
dc.date.submitted	2021	-
dc.identifier.uri	https://opendata.uni-halle.de//handle/1981185920/80120	-
dc.identifier.uri	http://dx.doi.org/10.25673/78166	-
dc.description.abstract	Background As healthcare-related data proliferate, there is a need to annotate them expertly for the purposes of personalized medicine. Crowdworking is an alternative to expensive expert labour. Annotation corresponds to diagnosis, so comparing unlabeled records to labeled ones seems more appropriate for crowdworkers without medical expertise. We modeled the comparison of a record to two other records as a triplet annotation task, and we conducted an experiment to investigate to what extent sensor-measured stress, task duration, uncertainty of the annotators and agreement among the annotators could predict annotation correctness. Materials and methods We conducted an annotation experiment on health data from a population-based study. The triplet annotation task was to decide whether an individual was more similar to a healthy one or to one with a given disorder. We used hepatic steatosis as an example disorder, and described the individuals with 10 pre-selected characteristics related to this disorder. We recorded task duration, electro-dermal activity as a stress indicator, and uncertainty as stated by the experiment participants (n = 29 non-experts and three experts) for 30 triplets. We built an Artificial Similarity-Based Annotator (ASBA) and compared its correctness and uncertainty to those of the experiment participants. Results We found no correlation between correctness and any of stated uncertainty, stress and task duration. Annotator agreement was not predictive either. Notably, for some tasks, annotators agreed unanimously on an incorrect annotation. When controlling for Triplet ID, we identified significant correlations, indicating that correctness, stress levels and annotation duration depend on the task itself. Average correctness among the experiment participants was slightly lower than that achieved by ASBA. Triplet annotation turned out to be similarly difficult for experts as for non-experts.
Conclusion Our lab experiment indicates that the task of triplet annotation must be prepared cautiously if delegated to crowdworkers. Neither certainty nor agreement among annotators should be assumed to imply correct annotation, because annotators may misjudge difficult tasks as easy and agree on incorrect annotations. Further research is needed to improve visualizations for complex tasks and to judiciously decide how much information to provide. Out-of-the-lab experiments in a crowdworker setting are needed to identify appropriate designs of a human-annotation task, and to assess under what circumstances non-human annotation should be preferred.	eng
dc.description.sponsorship	OVGU-Publikationsfonds 2021	-
dc.language.iso	eng	-
dc.relation.ispartof	https://journals.plos.org/plosone/	-
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/	-
dc.subject	Crowdworking	eng
dc.subject	Healthcare-related data	eng
dc.subject	Annotation	eng
dc.subject.ddc	000	-
dc.title	Assessing the difficulty of annotating medical data in crowdworking with help of experiments	eng
dc.type	Article	-
dc.identifier.urn	urn:nbn:de:gbv:ma9:1-1981185920-801201	-
local.versionType	publishedVersion	-
local.bibliographicCitation.journaltitle	PLOS ONE	-
local.bibliographicCitation.volume	16	-
local.bibliographicCitation.issue	7	-
local.bibliographicCitation.pagestart	1	-
local.bibliographicCitation.pageend	26	-
local.bibliographicCitation.publishername	PLOS	-
local.bibliographicCitation.publisherplace	San Francisco, California, US	-
local.bibliographicCitation.doi	10.1371/journal.pone.0254764	-
local.openaccess	true	-
dc.identifier.ppn	1767653190	-
local.bibliographicCitation.year	2021	-
cbs.sru.importDate	2022-03-22T13:46:15Z	-
local.bibliographicCitation	Contained in PLOS ONE - San Francisco, California, US : PLOS, 2006	-
local.accessrights.dnb	free	-
Appears in Collections:	Fakultät für Informatik (OA)

Files in This Item:

File	Description	Size	Format
Rother et al._Assessing_2021.pdf	Secondary publication	1.87 MB	Adobe PDF
