Result: A teaching proposal for a short course on biomedical data science.
Further information
As the availability of big biomedical data grows, there is an increasing need for university students who are professionally trained to analyze these data and to interpret the results correctly. We propose here a study plan for a master's degree course on biomedical data science, based on our experience during the last academic year. In our university course, we explained how to find an open biomedical dataset, how to clean it correctly, and how to prepare it for a computational statistics or machine learning phase. In doing so, we introduced common health data science terms and explained how to avoid common mistakes in the process. Moreover, we clarified how to perform an exploratory data analysis (EDA) and how to interpret its results reasonably. We also described how to properly execute a supervised or unsupervised machine learning analysis, and how to understand and interpret its outcomes. Finally, we explained how to validate the findings obtained. We illustrated all these steps in the context of open science principles, by encouraging the students to use only open source programming languages (R or Python in particular), open biomedical data (if available), and open access scientific articles (if possible). We believe our teaching proposal can be useful and of interest to anyone starting to prepare a course on biomedical data science.
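The workflow outlined in the abstract (cleaning, exploratory analysis, supervised learning, validation) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy records are invented rather than taken from any real dataset, the function names are hypothetical, and the nearest-centroid classifier is a deliberately simple stand-in, not the method taught in the course.

```python
import random
import statistics

# Hypothetical toy records, invented for illustration (not real patient data):
# (age, biomarker, label); None marks a missing value.
RAW = [
    (54, 1.2, 0), (61, 2.9, 1), (47, 0.8, 0), (70, 3.4, 1),
    (58, None, 1), (66, 3.1, 1), (50, 1.0, 0), (61, 2.9, 1),
    (45, 0.9, 0), (73, 3.8, 1), (52, 1.4, 0), (68, 2.7, 1),
]


def clean(rows):
    """Drop rows with missing values and exact duplicates, keeping order."""
    seen, out = set(), []
    for row in rows:
        if None in row or row in seen:
            continue
        seen.add(row)
        out.append(row)
    return out


def summarize(rows):
    """Exploratory summary: (mean, standard deviation) for each feature."""
    ages, biomarkers, _labels = zip(*rows)
    return {
        "age": (statistics.mean(ages), statistics.stdev(ages)),
        "biomarker": (statistics.mean(biomarkers), statistics.stdev(biomarkers)),
    }


def fit_centroids(train):
    """Supervised step: compute the mean feature vector of each class."""
    centroids = {}
    for label in {row[2] for row in train}:
        members = [row[:2] for row in train if row[2] == label]
        centroids[label] = tuple(statistics.mean(col) for col in zip(*members))
    return centroids


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def holdout_accuracy(rows, test_fraction=0.3, seed=0):
    """Validation step: fit on a training split, score on held-out rows."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    centroids = fit_centroids(train)
    hits = sum(predict(centroids, row[:2]) == row[2] for row in test)
    return hits / n_test


if __name__ == "__main__":
    cleaned = clean(RAW)
    print("rows kept:", len(cleaned))
    print("summary:", summarize(cleaned))
    print("hold-out accuracy:", holdout_accuracy(cleaned))
```

A single hold-out split on so few rows says little about generalization. With a real dataset, the features would usually be rescaled first and the evaluation repeated, for example with cross-validation.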
(Copyright: © 2025 Chicco and Coelho. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
The authors have declared that no competing interests exist.