LDmat: efficiently queryable compression of linkage disequilibrium matrices.
Motivation: Linkage disequilibrium (LD) matrices derived from large populations are widely used in population genetics for fine-mapping, LD score regression, and linear mixed models in genome-wide association studies (GWAS). However, these matrices can become very large when they are derived from millions of individuals, so moving, sharing, and extracting granular information from them can be cumbersome.
Results: To address the need to compress and easily query large LD matrices, we developed LDmat, a standalone tool that compresses large LD matrices into an HDF5 file format and queries the compressed matrices. It can extract submatrices corresponding to a sub-region of the genome, a list of selected loci, or loci within a minor allele frequency range. LDmat can also rebuild the original file formats from the compressed files.
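The three query types described above can be illustrated with a minimal, standard-library-only sketch. It is not LDmat's API or its HDF5 layout: the class `ToyLDStore`, its method names, the upper-triangle dictionary storage, and the `min_abs` and `decimals` parameters are all hypothetical, chosen only to show how a symmetric LD matrix might be stored compactly and then sliced by region, by locus list, or by minor allele frequency (MAF).

```python
from bisect import bisect_left, bisect_right


class ToyLDStore:
    """Toy in-memory stand-in for a compressed LD matrix (hypothetical, not LDmat)."""

    def __init__(self, positions, mafs, dense, min_abs=0.0, decimals=2):
        # positions: sorted base-pair positions, one per locus
        # mafs: minor allele frequency of each locus, same order as positions
        self.positions = list(positions)
        self.mafs = list(mafs)
        self.upper = {}
        n = len(self.positions)
        # Assumption for illustration: keep only the upper triangle (the matrix
        # is symmetric), round values, and drop entries that are zero or below
        # the min_abs threshold.
        for i in range(n):
            for j in range(i, n):
                value = round(dense[i][j], decimals)
                if value != 0.0 and abs(value) >= min_abs:
                    self.upper[(i, j)] = value

    def _get(self, i, j):
        # Mirror lower-triangle lookups onto the stored upper triangle.
        if i > j:
            i, j = j, i
        return self.upper.get((i, j), 0.0)

    def submatrix(self, indices):
        indices = list(indices)
        return [[self._get(i, j) for j in indices] for i in indices]

    def by_region(self, start, end):
        # Loci whose position lies in the closed interval [start, end].
        lo = bisect_left(self.positions, start)
        hi = bisect_right(self.positions, end)
        return self.submatrix(range(lo, hi))

    def by_loci(self, wanted_positions):
        # Loci named explicitly by position; unknown positions are skipped.
        index_of = {pos: i for i, pos in enumerate(self.positions)}
        return self.submatrix(index_of[p] for p in wanted_positions if p in index_of)

    def by_maf(self, maf_min, maf_max):
        # Loci whose MAF lies in the closed interval [maf_min, maf_max].
        keep = [i for i, maf in enumerate(self.mafs) if maf_min <= maf <= maf_max]
        return self.submatrix(keep)


# Made-up example data: four loci with an arbitrary symmetric LD matrix.
positions = [100, 250, 400, 900]
mafs = [0.05, 0.20, 0.45, 0.01]
dense = [
    [1.0, 0.8, 0.3, 0.0],
    [0.8, 1.0, 0.5, 0.04],
    [0.3, 0.5, 1.0, 0.0],
    [0.0, 0.04, 0.0, 1.0],
]
store = ToyLDStore(positions, mafs, dense, min_abs=0.1)

region = store.by_region(200, 500)   # loci at positions 250 and 400
selected = store.by_loci([100, 900]) # loci at positions 100 and 900
common = store.by_maf(0.1, 0.5)      # loci with MAF between 0.1 and 0.5
```

In this sketch, entries dropped by the threshold read back as 0.0, so the compression is lossy; whether and how a real tool applies thresholds or reduced precision is a design choice not stated in the abstract.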
Availability and Implementation: LDmat is implemented in Python and can be installed on Unix systems with the command 'pip install ldmat'. It is also available at https://github.com/G2Lab/ldmat and https://pypi.org/project/ldmat/.
Supplementary Information: Supplementary data are available at Bioinformatics online.
(© The Author(s) 2023. Published by Oxford University Press.)