Result: cognitivefactory/interactive-clustering-comparative-study

Title:
cognitivefactory/interactive-clustering-comparative-study
Publisher Information:
Zenodo
Publication Year:
2023
Collection:
Zenodo
Document Type:
E-Resource, software
Language:
unknown
DOI:
10.5281/zenodo.10119439
Rights:
CeCILL-C Free Software License Agreement ; cecill-c ; http://www.cecill.info/licences/Licence_CeCILL-C_V1-en.html
Accession Number:
edsbas.8B107970
Database:
BASE

Further Information

Interactive Clustering: Comparative Studies

Several comparative studies of cognitivefactory-interactive-clustering functionalities on NLP datasets.
GitHub repository: https://github.com/cognitivefactory/interactive-clustering-comparative-study/tree/1.0.0

Quick description of Interactive Clustering

Interactive clustering is a method intended to assist in the design of a training data set. This iterative process begins with an unlabeled dataset and alternates two substeps: the user defines constraints on data sampled by the machine; the machine partitions the data with a constrained clustering algorithm. Thus, at each step of the process, the user corrects the clustering of the previous steps by adding constraints, and the machine returns a corrected, more relevant partition of the data for the next step (see the illustrative sketch after this record).

Description of studies

Several studies are provided here:
- efficience: aims to confirm the technical efficiency of the method by verifying its convergence to a ground truth and by finding the implementation that maximizes convergence speed.
- computation time: aims to estimate the time the algorithms need to reach their objectives.
- annotation time: aims to estimate the time needed to annotate constraints.
- constraints number: aims to estimate the number of constraints needed to obtain a relevant annotated dataset.
- relevance: aims to confirm the relevance of the clustering results.
- rentability: aims to predict the profitability of one more iteration.
- inter annotator: aims to estimate the inter-annotator agreement score during constraint annotation.
- annotation errors and conflicts fix: aims to evaluate the impact of annotation errors and verify the importance of fixing constraint conflicts during labeling.
- annotation subjectivity: aims to estimate the impact of labeling differences between annotators on the clustering results.

Results

All results are zipped in .tar.gz files and versioned on Zenodo: Schild, E. (2021). cognitivefactory/interactive-clustering-comparative-study. Zenodo. https://doi.org/10.5281/zenodo.5648255.

Warning! These experiments can use a huge amount of disk space and ...
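The interactive loop described above can be summarized in a few lines of code. The following is a minimal, illustrative Python sketch only: it does not use the cognitivefactory-interactive-clustering package or its API, and the sampling, annotation, and constrained-clustering helpers are simplified stand-ins for the components the studies actually benchmark.

```python
# Minimal sketch of the interactive clustering loop (illustration only).
# The helper functions below are hypothetical stand-ins, not the package's API.

import numpy as np
from sklearn.cluster import KMeans


def sample_pairs(X, rng, n_pairs=5):
    """Pick random pairs of data points for the annotator to review
    (stand-in for the package's sampling strategies)."""
    n = len(X)
    return [tuple(rng.choice(n, size=2, replace=False)) for _ in range(n_pairs)]


def annotate(pair, ground_truth):
    """Simulate the annotator: derive a MUST_LINK / CANNOT_LINK constraint
    from known reference labels."""
    i, j = pair
    return "MUST_LINK" if ground_truth[i] == ground_truth[j] else "CANNOT_LINK"


def constrained_clustering(X, constraints, n_clusters):
    """Very rough constrained clustering: run KMeans, then force must-linked
    points into the same cluster. Real constrained algorithms (e.g. COP-KMeans)
    enforce must-link and cannot-link constraints during assignment instead."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)
    for (i, j), kind in constraints.items():
        if kind == "MUST_LINK":
            labels[j] = labels[i]
    return labels


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                # toy stand-in for vectorized texts
ground_truth = rng.integers(0, 3, size=100)  # hidden reference labels

constraints = {}
labels = constrained_clustering(X, constraints, n_clusters=3)
for step in range(5):
    # 1) the user annotates constraints on data sampled by the machine...
    for pair in sample_pairs(X, rng):
        constraints[pair] = annotate(pair, ground_truth)
    # 2) ...then the machine recomputes a constrained partition for the next step.
    labels = constrained_clustering(X, constraints, n_clusters=3)
    print(f"iteration {step}: {len(constraints)} constraints annotated")
```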