Result: Histogram based Hierarchical Data Representation for Microarray Classification ; Histograma basado en Representación jerárquica para datos de Microarray Clasificación ; Histograma basat en Representació jeràrquica per dades de Microarray Classificació

Title:
Histogram based Hierarchical Data Representation for Microarray Classification ; Histograma basado en Representación jerárquica para datos de Microarray Clasificación ; Histograma basat en Representació jeràrquica per dades de Microarray Classificació
Authors:
Contributors:
Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, Salembier Clairon, Philippe Jean
Publisher Information:
Universitat Politècnica de Catalunya
Publication Year:
2012
Collection:
Universitat Politècnica de Catalunya, BarcelonaTech: UPCommons - Global access to UPC knowledge
Document Type:
Dissertation master thesis
File Description:
application/pdf
Language:
English
Rights:
Attribution-NonCommercial-NoDerivs 3.0 Spain ; http://creativecommons.org/licenses/by-nc-nd/3.0/es/ ; Open Access
Accession Number:
edsbas.D98E4F07
Database:
BASE

Further Information

[ENGLISH] This work proposes a general framework for microarray classification that relies on histogram-based hierarchical clustering. It produces precise and reliable classifiers using a two-step approach. In the first step, the feature set is enhanced with histogram-based features corresponding to each cluster produced by hierarchical clustering; a parameter (the maximum number of dominant genes) can be tuned to the characteristics of the dataset. In the second step, a reliable classifier is built with a wrapper feature selection process called Improved Sequential Floating Forward Selection (IFFS), which chooses a small feature set for the classification task. Because samples are scarce in microarray datasets, a reliability parameter is considered alongside the classification error rate to improve the feature selection process. Different combinations of error rate and reliability have been used as the scoring rule. Linear Discriminant Analysis (LDA) and K-Nearest Neighbour (KNN) classifiers have been used, and their performance has been compared. The potential of the proposed framework has been evaluated on three publicly available datasets: colon, lymphoma and leukaemia. The experimental results have confirmed the usefulness of the histogram-based hierarchical clustering and the new representative feature generation algorithm. A gene-level analysis has revealed that the best features chosen by the feature selection algorithm involve only very few basic constituent genes. The comparative results showed that the proposed framework can compete with state-of-the-art alternatives. ; [SPANISH] A general framework for microarray classification is proposed in this work. It produces precise and reliable classifiers based on a two-step approach.
In the first step, the feature set is enhanced by a series of histogram-based features corresponding to each cluster produced by hierarchical clustering, where a ...
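
The wrapper feature selection step described in the abstract can be illustrated with a minimal sketch. It implements plain Sequential Floating Forward Selection (SFFS), not the Improved variant (IFFS) used in the thesis, and it scores subsets with the leave-one-out error of a 1-nearest-neighbour classifier only, without the reliability term. The function names (`loo_error`, `sffs`) and the toy data are assumptions made for illustration and do not come from the thesis.

```python
from math import dist


def loo_error(samples, labels, features):
    """Leave-one-out error rate of a 1-nearest-neighbour classifier
    restricted to the given feature indices (hypothetical helper)."""
    if not features:
        return 1.0
    errors = 0
    for i, x in enumerate(samples):
        xi = [x[f] for f in features]
        # Nearest other sample in the selected feature subspace.
        nearest = min(
            (j for j in range(len(samples)) if j != i),
            key=lambda j: dist(xi, [samples[j][f] for f in features]),
        )
        errors += labels[nearest] != labels[i]
    return errors / len(samples)


def sffs(samples, labels, max_features):
    """Plain SFFS: add the best feature, then drop features for as long
    as that strictly improves the best subset known for the smaller size.
    Returns (error, subset) for the best subset found."""
    n_features = len(samples[0])
    selected = []
    best_by_size = {}  # subset size -> (error, subset)
    while len(selected) < max_features:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        # Forward step: add the feature that gives the lowest error.
        add = min(candidates,
                  key=lambda f: loo_error(samples, labels, selected + [f]))
        selected = selected + [add]
        err = loo_error(samples, labels, selected)
        size = len(selected)
        if size not in best_by_size or err < best_by_size[size][0]:
            best_by_size[size] = (err, list(selected))
        # Floating backward step: conditional exclusion.
        while len(selected) > 2:
            drop = min(selected, key=lambda f: loo_error(
                samples, labels, [g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            err = loo_error(samples, labels, reduced)
            if err < best_by_size[len(reduced)][0]:
                selected = reduced
                best_by_size[len(reduced)] = (err, list(reduced))
            else:
                break
    # Prefer the lowest error, then the smallest subset.
    return min(best_by_size.values(), key=lambda t: (t[0], len(t[1])))


# Toy data (made up): feature 0 separates the classes, features 1 and 2 are noise.
toy_samples = [
    [0.10, 5.0, 0.3], [0.20, 1.0, 0.9], [0.15, 3.0, 0.1],
    [0.90, 4.8, 0.2], [1.00, 1.2, 0.8], [0.95, 3.1, 0.5],
]
toy_labels = [0, 0, 0, 1, 1, 1]
toy_error, toy_subset = sffs(toy_samples, toy_labels, max_features=2)
print(toy_error, toy_subset)
```

A combined score in the spirit of the abstract would replace `loo_error` with a function that also rewards reliability; how the thesis defines and weights that term is not stated in this record.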