Result: Study of gene expression representation with Treelets and hierarchical clustering algorithms

Title:
Study of gene expression representation with Treelets and hierarchical clustering algorithms
Contributors:
Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, Salembier Clairon, Philippe Jean
Publisher Information:
Universitat Politècnica de Catalunya
Publication Year:
2011
Collection:
Universitat Politècnica de Catalunya, BarcelonaTech: UPCommons - Global access to UPC knowledge
Document Type:
Dissertation master thesis
File Description:
application/pdf
Language:
English
Rights:
Attribution-NonCommercial-NoDerivs 3.0 Spain ; http://creativecommons.org/licenses/by-nc-nd/3.0/es/ ; Open Access
Accession Number:
edsbas.EE50BF35
Database:
BASE

Further information

English: Since the mid-1990s, the field of genomic signal processing has grown rapidly owing to the development of DNA microarray technology, which made it possible to measure the mRNA expression of thousands of genes in parallel. Researchers have developed a vast body of knowledge on classification methods. However, microarray data are characterized by extremely high dimensionality and a comparatively small number of data points, which makes microarray data analysis distinctive. In this work we have developed various hierarchical clustering algorithms in order to improve the microarray classification task. First, the original feature set of gene expression values is enriched with new features that are linear combinations of the original ones. These new features are called metagenes and are produced by the different proposed hierarchical clustering algorithms. To demonstrate the utility of this methodology for classifying microarray datasets, a reliable classifier is built via a feature selection process. This methodology has been tested on three public cancer datasets: Colon, Leukemia and Lymphoma. The proposed method obtained better classification results than those obtained without this enrichment, confirming the utility of metagene generation for improving the final classifier. Second, a new technique has been developed that uses hierarchical clustering to reduce the huge microarray datasets by removing the initial genes that will not be relevant for the cancer classification task. The experimental results of this method, applied to one public database, are also presented and analyzed, demonstrating the utility of this new approach. ; Spanish: Since the late 1990s, the field of genomics has been revolutionized by the development of DNA microarray technology. With this technique it is possible to measure the mRNA expression of thousands of genes in parallel. Researchers have developed a vast body of knowledge in the methods of ...
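
The metagene idea in the abstract (new features built as linear combinations of existing gene expression profiles, generated by a hierarchical clustering procedure) can be illustrated with a minimal sketch. This is not the thesis's algorithm: the function names (`pearson`, `build_metagenes`, `enrich`) are hypothetical, the merge criterion is assumed to be Pearson correlation, and the metagene is taken as the plain mean of the merged pair, a simpler stand-in for the local PCA rotation used by Treelets.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def build_metagenes(genes, n_merges):
    """Greedy agglomerative merging of expression profiles (illustrative only).

    genes: dict mapping a gene name to its expression values, one per sample.
    Returns a dict of metagene profiles keyed by a name built from the merged pair.
    """
    active = dict(genes)  # profiles still available for merging
    metagenes = {}
    for _ in range(n_merges):
        if len(active) < 2:
            break
        # Pick the most correlated pair among the active profiles.
        a, b = max(
            combinations(active, 2),
            key=lambda pair: pearson(active[pair[0]], active[pair[1]]),
        )
        # Assumption: the metagene is the mean of the pair (a linear combination).
        merged = [(u + v) / 2 for u, v in zip(active[a], active[b])]
        name = f"({a}+{b})"
        metagenes[name] = merged
        # The pair is replaced by its metagene, which builds the hierarchy.
        del active[a], active[b]
        active[name] = merged
    return metagenes


def enrich(genes, n_merges):
    """Return the original features together with the generated metagenes."""
    enriched = dict(genes)
    enriched.update(build_metagenes(genes, n_merges))
    return enriched
```

Calling `enrich(genes, n_merges)` on a small dictionary of profiles returns the original genes plus up to `n_merges` metagenes. A feature selection step and a classifier, as mentioned in the abstract, would then operate on this enlarged feature set.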