Result: The Essential Toolbox of Data Science: Python, R, Git, and Docker.

Title:
The Essential Toolbox of Data Science: Python, R, Git, and Docker.
Authors:
Pittard WS; Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA, USA., Li S; Department of Medicine, Emory University School of Medicine, Atlanta, GA, USA. shuzhao.li@gmail.com.
Source:
Methods in molecular biology (Clifton, N.J.) [Methods Mol Biol] 2020; Vol. 2104, pp. 265-311.
Publication Type:
Journal Article; Research Support, N.I.H., Extramural; Research Support, U.S. Gov't, Non-P.H.S.
Language:
English
Journal Info:
Publisher: Humana Press Country of Publication: United States NLM ID: 9214969 Publication Model: Print Cited Medium: Internet ISSN: 1940-6029 (Electronic) Linking ISSN: 1064-3745 NLM ISO Abbreviation: Methods Mol Biol Subsets: MEDLINE
Imprint Name(s):
Publication: Totowa, NJ : Humana Press
Original Publication: Clifton, N.J. : Humana Press,
Grant Information:
U2C ES030163 United States ES NIEHS NIH HHS; U2C ES026560 United States ES NIEHS NIH HHS; P50 ES026071 United States ES NIEHS NIH HHS; P30 ES019776 United States ES NIEHS NIH HHS; UH2 AI132345 United States AI NIAID NIH HHS; U01 CA235493 United States CA NCI NIH HHS
Contributed Indexing:
Keywords: Bioinformatics; Data science; Docker; Git; Python; R; Version control; Virtualization
Entry Date(s):
Date Created: 20200119 Date Completed: 20210118 Latest Revision: 20210118
Update Code:
20250114
DOI:
10.1007/978-1-0716-0239-3_15
PMID:
31953823
Database:
MEDLINE

Further Information

Daily work in data science relies on a set of essential tools: the programming languages Python and R, the version control tool Git, and the virtualization tool Docker. Proficiency in at least one programming language is required for data science. R is tied to a computing environment that focuses on statistics, and many new algorithms in genomics and biomedicine are first published in it. Python has its roots in system administration and is a superb language for general programming. Version control is critical to managing complex projects, even when no software development is involved. Docker containers are becoming a key tool for deployment, portability, and reproducibility. This chapter provides a self-contained, practical guide to these topics that readers can use as a reference and to plan their training.