openlib

Authors
- Chegini, Mohammad
- Bernard, Jürgen
- Cui, Jian
- Chegini, Fatemeh
- Sourin, Alexei
- Andrews, Keith
- Schreck, Tobias
Title: Interactive Visual Labelling versus Active Learning: An Experimental Comparison
DOI: 10.1631/FITEE.1900549
Persistent Identifier
- https://doi.org/10.1631/FITEE.1900549
Published in: Frontiers of Information Technology & Electronic Engineering
Volume: 21
Year of publication: 2020
Issue: 4
Pages: 524-535
Licence: CC BY
ISSN: 2095-9230
Download statistics: 1942
Peer reviewed: Yes
Abstract: Methods from supervised machine learning allow the classification of new data automatically and are tremendously helpful for data analysis. The quality of supervised machine learning depends not only on the type of algorithm used, but also on the quality of the labelled dataset used to train the classifier. Labelling instances in a training dataset is often done manually, relying on selections and annotations by expert analysts, and is often a tedious and time-consuming process. Active learning algorithms can automatically determine a subset of data instances for which labels would provide useful input to the learning process. Interactive visual labelling techniques are a promising alternative, providing effective visual overviews from which an analyst can simultaneously explore data records and select items to label. By putting the analyst in the loop, higher accuracy can be achieved in the resulting classifier. While initial results of interactive visual labelling techniques are promising in the sense that user labelling can improve supervised learning, many aspects of these techniques are still largely unexplored. This paper presents a study conducted using the mVis tool to compare three interactive visualisations, similarity map, scatterplot matrix (SPLOM), and parallel coordinates, with each other and with active learning for the purpose of labelling a multivariate dataset. The results show that all three interactive visual labelling techniques surpass active learning algorithms in terms of classifier accuracy, and that users subjectively prefer the similarity map over SPLOM and parallel coordinates for labelling. Users also employ different labelling strategies depending on the visualisation used.
