diff --git a/docs/index.html b/docs/index.html
index 50cdc06..97fb38b 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -139,7 +139,6 @@
This course introduces the following:
We also demonstrate how the following concepts are applied in data analysis:
We do not cover the theory and details of these methods as they are covered in other courses.
Throughout the course, we use motivating case studies and data analysis problem sets based on challenges similar to those you encounter in scientific research.
@@ -282,6 +281,10 @@Date | @@ -290,30 +293,6 @@Sep 10 | -Problem Set 1 due | -|||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Sep 13 | -Problem Set 2 due | -||||||||||||||||||
Sep 19 | -Problem Set 3 due | -||||||||||||||||||
Sep 26 | -Problem Set 4 due | -||||||||||||||||||
Sep 26 | -Problem Set 5 due | -||||||||||||||||||
Oct 11 | -Problem Set 6 due | -||||||||||||||||||
Oct 14 | No class: Indigenous Peoples Day | Oct 23 | -Start final project, obtain approval for personal project. | +Start final project. Obtain approval if you want to do personal project instead. | |||||||||||||||
Nov 01 | -Problem Set 7 due | -||||||||||||||||||
Nov 11 | No class: Veterans’ Day | ||||||||||||||||||
Nov 01 | -Problem Set 7 due | -||||||||||||||||||
Nov 22 | -Problem Set 8 due | -||||||||||||||||||
Nov 25 | Midterm 2 | ||||||||||||||||||
Nov 27 | No class: Thanksgiving Recess Begins | ||||||||||||||||||
Dec 06 | -Problem Set 9 due | -||||||||||||||||||
Dec 13 | -Problem Set 10 due | -||||||||||||||||||
Dec 20 | Final Project due | ||||||||||||||||||
Sep 04 | Productivity Tools | -Getting Started, Installing R and RStudio, Unix | +Installing R and RStudio on Windows or Mac, Getting Started Unix | ||||||||||||||||
Sep 09 | @@ -426,7 +385,7 @@Distributions, Dataviz Principles | ||||||||||||||||||
Sep 26 | +Oct 04 | Problem Set 5 due | Difficulty: medium | ||||||||||||||||
Oct 16 | Midterm 1 | -Cover material from Sep 04-Oct 09 | +Covers material from Sep 04-Oct 11 | ||||||||||||||||
Oct 21, Oct 23 | @@ -481,8 +440,8 @@Measurement Error Models, Treatment Effect Models, Association Tests, Association Not Causation | ||||||||||||||||||
Nov 01 | -Problem Set 7 due | +Nov 15 | +Problem Set 8 due | Difficulty: hard | |||||||||||||||
Nov 22 | -Problem Set 8 due | +Problem Set 9 due | Difficulty: easy | ||||||||||||||||
Nov 25 | Midterm 2 | -Cover material from Sep 04-Nov 20 | +Covers material from Sep 04-Nov 22 | ||||||||||||||||
Nov 27 | @@ -508,29 +467,24 @@|||||||||||||||||||
Dec 02, Dec 04 | Machine Learning | -Notation and terminology, [Evaluation Metrics | +Notation and terminology, Evaluation Metrics, conditional probabilities, smoothing | ||||||||||||||||
Dec 06 | -Problem Set 9 due | -Difficulty: easy | -|||||||||||||||||
Dec 09, Dec 11 | Machine Learning | Resampling methods, ML algorithms, ML in practice | |||||||||||||||||
Dec 13 | Problem Set 10 due | Difficulty: hard | |||||||||||||||||
Dec 16, Dec 18 | Other topics | ||||||||||||||||||
Dec 20 | Final Project due |
diff --git a/index.qmd b/index.qmd
index 2e92318..66e70db 100644
--- a/index.qmd
+++ b/index.qmd
@@ -20,5 +20,3 @@
-#
-
diff --git a/syllabus.qmd b/syllabus.qmd
index df7f013..6cdc0ac 100644
--- a/syllabus.qmd
+++ b/syllabus.qmd
@@ -2,10 +2,10 @@
 ## Course Information
-- **BST 260 Introduction to Data Science**
-- **Kresge 202A and 202B (HSPH)**
-- **Monday 09:45 AM - 11:15 AM; Wednesday 09:45 AM - 11:15 AM**
-- **Lecture notes: [https://datasciencelabs.github.io/2024/](https://datasciencelabs.github.io/2024/)**
+- BST 260 Introduction to Data Science
+- Kresge 202A and 202B (HSPH)
+- Monday 09:45 AM - 11:15 AM; Wednesday 09:45 AM - 11:15 AM
+- Lecture notes: [https://datasciencelabs.github.io/2024/](https://datasciencelabs.github.io/2024/)
 ## Prerequisites
@@ -23,18 +23,18 @@ Students not matriculated in an HSPH Biostatistics graduate program (HDS SM60, B
 This course introduces the following:
 * UNIX/Linux shell.
-* Reproducible document preparation with RStudio, knitr, and markdown.
-* Version control with git and GitHub.
-* R programming,
-* Data wrangling with dplyr and data.table.
-* Data visualization with ggplot2.
+* Reproducible document preparation with RStudio, knitr, and markdown
+* Version control with git and GitHub
+* R programming
+* Data wrangling with dplyr and data.table
+* Data visualization with ggplot2
 We also demonstrate how the following concepts are applied in data analysis:
-* Monte Carlo simulations.
-* Statistical modeling.
-* High-dimensional data techniques, and
-* Machine learning.
+* Monte Carlo simulations
+* Statistical modeling
+* High-dimensional data techniques
+* Machine learning
 We do not cover the theory and details of these methods as they are covered in other courses.
@@ -103,30 +103,19 @@ You can use ChatGPT however you want. Do remember **you won't be able to use it
 | Date | Event |
 |------|-------|
-| Sep 10 | Problem Set 1 due|
-| Sep 13 | Problem Set 2 due|
-| Sep 19 | Problem Set 3 due|
-| Sep 26 | Problem Set 4 due|
-| Sep 26 | Problem Set 5 due|
-| Oct 11 | Problem Set 6 due|
 | Oct 14 | No class: Indigenous Peoples Day |
 | Oct 16 | Midterm 1|
-| Oct 23 | Start final project, obtain approval for personal project.|
-| Nov 01 | Problem Set 7 due |
+| Oct 23 | Start final project. Obtain approval if you want to do personal project instead.|
 | Nov 11 | No class: Veterans' Day|
-| Nov 01 | Problem Set 7 due |
-| Nov 22 | Problem Set 8 due|
 | Nov 25 | Midterm 2|
 | Nov 27 | No class: Thanksgiving Recess Begins |
-| Dec 06 | Problem Set 9 due|
-| Dec 13 | Problem Set 10 due|
 | Dec 20 | Final Project due|
 ## Preliminary Schedule
 | Dates | Topic | Links to readings and notes |
 |:-------------------|:---------|:----------|
-| Sep 04 | Productivity Tools | [Getting Started](http://rafalab.dfci.harvard.edu/dsbook-part-1/R/getting-started.html), [Installing R and RStudio](http://rafalab.dfci.harvard.edu/dsbook-part-1/R/installing-r-and-rstudio.html), [Unix](http://rafalab.dfci.harvard.edu/dsbook-part-1/productivity/unix.html) |
+| Sep 04 | Productivity Tools | Installing R and RStudio on [Windows](https://teacherscollege.screenstepslive.com/a/1108074-install-r-and-rstudio-for-windows) or [Mac](https://teacherscollege.screenstepslive.com/a/1135059-install-r-and-rstudio-for-mac), [Getting Started](http://rafalab.dfci.harvard.edu/dsbook-part-1/R/getting-started.html) [Unix](http://rafalab.dfci.harvard.edu/dsbook-part-1/productivity/unix.html) |
 | Sep 09 | Productivity Tools | [RStudio Projects, Quarto](https://rafalab.dfci.harvard.edu/dsbook-part-1/productivity/reproducible-projects.html) [Git and GitHub](http://rafalab.dfci.harvard.edu/dsbook-part-1/productivity/git.html) |
 | Sep 10 | **Problem Set 1 due**| Difficulty: easy|
 | Sep 11 | R | [R Basics](http://rafalab.dfci.harvard.edu/dsbook-part-1/R/R-basics.html), [Vectorization](http://rafalab.dfci.harvard.edu/dsbook-part-1/R/programming-basics.html#sec-vectorization) |
@@ -136,24 +125,23 @@ You can use ChatGPT however you want. Do remember **you won't be able to use it
 | Sep 23, Sep 25 | Wrangling | [Importing data](https://rafalab.dfci.harvard.edu/dsbook-part-1/R/importing-data.html) [Locales](https://rafalab.dfci.harvard.edu/dsbook-part-1/wrangling/locales.html) [Reshaping Data](http://rafalab.dfci.harvard.edu/dsbook-part-1/wrangling/reshaping-data.html), [Joining Tables](http://rafalab.dfci.harvard.edu/dsbook-part-1/wrangling/joining-tables.html), [Extracting data from the web](https://rafalab.dfci.harvard.edu/dsbook-part-1/wrangling/web-scraping.html)|
 | Sep 26 | **Problem Set 4 due**| Difficulty: medium|
 | Sep 30, Oct 02 | Data visualization | [Distributions](http://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/distributions.html), [Dataviz Principles](http://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/dataviz-principles.html) |
-| Sep 26 | **Problem Set 5 due**| Difficulty: medium|
+| Oct 04 | **Problem Set 5 due**| Difficulty: medium|
 | Oct 07, Oct 09 | Probability | [Monte Carlo](http://rafalab.dfci.harvard.edu/dsbook-part-2/prob/continuous-probability.html#monte-carlo), [Random Variables & CLT](http://rafalab.dfci.harvard.edu/dsbook-part-2/prob/random-variables-sampling-models-clt.html)|
 | Oct 11 | **Problem Set 6 due**| Difficulty: easy|
 | Oct 14 | No class | Indigenous Peoples Day |
-| Oct 16 | **Midterm 1**| Cover material from Sep 04-Oct 09|
+| Oct 16 | **Midterm 1**| Covers material from Sep 04-Oct 11|
 | Oct 21, Oct 23 | Inference | [Parameters & Estimates](http://rafalab.dfci.harvard.edu/dsbook-part-2/inference/parameters-estimates.html), [Confidence Intervals](http://rafalab.dfci.harvard.edu/dsbook-part-2/inference/confidence-intervals.html)|
 | Oct 28, Oct 30 | Statistical Models | [Data-driven Models](http://rafalab.dfci.harvard.edu/dsbook-part-2/inference/models.html), [Bayesian Statistics](http://rafalab.dfci.harvard.edu/dsbook-part-2/inference/bayes.html), [Hierarchical Models](http://rafalab.dfci.harvard.edu/dsbook-part-2/inference/hierarchical-models.html) |
 | Nov 01 | **Problem Set 7 due** | Difficulty: hard |
 | Nov 04, Nov 06 | Linear models | [Regression](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/regression.html), [Multivariate Regression](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/multivariate-regression.html)|
 | Nov 11 | No class| Veterans' Day|
 | Nov 13 | Linear models | [Measurement Error Models](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/measurement-error-models.html), [Treatment Effect Models](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/treatment-effect-models.html), [Association Tests](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/association-tests.html), [Association Not Causation](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/association-not-causation.html) |
-| Nov 01 | **Problem Set 7 due** |Difficulty: hard|
+| Nov 15 | **Problem Set 8 due** |Difficulty: hard|
 | Nov 18, Nov 20| High dimensional data | [Matrices in R](https://rafalab.dfci.harvard.edu/dsbook-part-2/highdim/matrices-in-R.html), [Applied Linear Algebra](https://rafalab.dfci.harvard.edu/dsbook-part-2/highdim/linear-algebra.html), [Dimension Reduction](https://rafalab.dfci.harvard.edu/dsbook-part-2/highdim/dimension-reduction.html) |
-| Nov 22 | **Problem Set 8 due**| Difficulty: easy|
-| Nov 25 | **Midterm 2**| Cover material from Sep 04-Nov 20 |
+| Nov 22 | **Problem Set 9 due**| Difficulty: easy|
+| Nov 25 | **Midterm 2**| Covers material from Sep 04-Nov 22 |
 | Nov 27 | No class |Thanksgiving Recess Begins |
-| Dec 02, Dec 04 | Machine Learning | [Notation and terminology](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/notation-and-terminology.html), [Evaluation Metrics|(https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/evaluation-metrics.html), [conditional probabilities](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/conditionals.html), [smoothing](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/smoothing.html)
-| Dec 06 | **Problem Set 9 due**| Difficulty: easy|
+| Dec 02, Dec 04 | Machine Learning | [Notation and terminology](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/notation-and-terminology.html), [Evaluation Metrics](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/evaluation-metrics.html), [conditional probabilities](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/conditionals.html), [smoothing](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/smoothing.html) |
 | Dec 09, Dec 11 | Machine Learning | [Resampling methods](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/resampling-methods.html), [ML algorithms](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/algorithms.html), [ML in practice](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/ml-in-practice.html) |
 | Dec 13 | **Problem Set 10 due**| Difficulty: hard|
 | Dec 16, Dec 18 | Other topics | | |