diff --git a/docs/index.html b/docs/index.html
index 50cdc06..97fb38b 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -139,7 +139,6 @@

Table of contents

-
  • @@ -198,12 +197,9 @@

    Instructors

  • Nikhil Vytla
  • Yuan Wang
  • - - -
    -

    +
    diff --git a/docs/search.json b/docs/search.json index 964567d..3dfe9c0 100644 --- a/docs/search.json +++ b/docs/search.json @@ -53,7 +53,7 @@ "href": "syllabus.html#course-description", "title": "Syllabus", "section": "Course Description", - "text": "Course Description\nThis course introduces the following:\n\nUNIX/Linux shell.\nReproducible document preparation with RStudio, knitr, and markdown.\n\nVersion control with git and GitHub.\nR programming,\nData wrangling with dplyr and data.table.\nData visualization with ggplot2.\n\nWe also demonstrate how the following concepts are applied in data analysis:\n\nMonte Carlo simulations.\nStatistical modeling.\nHigh-dimensional data techniques, and\nMachine learning.\n\nWe do not cover the theory and details of these methods as they are covered in other courses.\nThroughout the course, we use motivating case studies and data analysis problem sets based on challenges similar to those you encounter in scientific research." + "text": "Course Description\nThis course introduces the following:\n\nUNIX/Linux shell.\nReproducible document preparation with RStudio, knitr, and markdown\nVersion control with git and GitHub\nR programming\nData wrangling with dplyr and data.table\nData visualization with ggplot2\n\nWe also demonstrate how the following concepts are applied in data analysis:\n\nMonte Carlo simulations\nStatistical modeling\nHigh-dimensional data techniques\nMachine learning\n\nWe do not cover the theory and details of these methods as they are covered in other courses.\nThroughout the course, we use motivating case studies and data analysis problem sets based on challenges similar to those you encounter in scientific research." 
}, { "objectID": "syllabus.html#weekly-course-structure", @@ -102,13 +102,13 @@ "href": "syllabus.html#key-dates---subject-to-change-after-first-week-of-class", "title": "Syllabus", "section": "Key Dates - Subject to Change After First Week of Class", - "text": "Key Dates - Subject to Change After First Week of Class\n\n\n\nDate\nEvent\n\n\n\n\nSep 10\nProblem Set 1 due\n\n\nSep 13\nProblem Set 2 due\n\n\nSep 19\nProblem Set 3 due\n\n\nSep 26\nProblem Set 4 due\n\n\nSep 26\nProblem Set 5 due\n\n\nOct 11\nProblem Set 6 due\n\n\nOct 14\nNo class: Indigenous Peoples Day\n\n\nOct 16\nMidterm 1\n\n\nOct 23\nStart final project, obtain approval for personal project.\n\n\nNov 01\nProblem Set 7 due\n\n\nNov 11\nNo class: Veterans’ Day\n\n\nNov 01\nProblem Set 7 due\n\n\nNov 22\nProblem Set 8 due\n\n\nNov 25\nMidterm 2\n\n\nNov 27\nNo class: Thanksgiving Recess Begins\n\n\nDec 06\nProblem Set 9 due\n\n\nDec 13\nProblem Set 10 due\n\n\nDec 20\nFinal Project due" + "text": "Key Dates - Subject to Change After First Week of Class\n\n\n\n\n\n\n\nDate\nEvent\n\n\n\n\nOct 14\nNo class: Indigenous Peoples Day\n\n\nOct 16\nMidterm 1\n\n\nOct 23\nStart final project. 
Obtain approval if you want to do personal project instead.\n\n\nNov 11\nNo class: Veterans’ Day\n\n\nNov 25\nMidterm 2\n\n\nNov 27\nNo class: Thanksgiving Recess Begins\n\n\nDec 20\nFinal Project due" }, { "objectID": "syllabus.html#preliminary-schedule", "href": "syllabus.html#preliminary-schedule", "title": "Syllabus", "section": "Preliminary Schedule", - "text": "Preliminary Schedule\n\n\n\nDates\nTopic\nLinks to readings and notes\n\n\n\n\nSep 04\nProductivity Tools\nGetting Started, Installing R and RStudio, Unix\n\n\nSep 09\nProductivity Tools\nRStudio Projects, Quarto Git and GitHub\n\n\nSep 10\nProblem Set 1 due\nDifficulty: easy\n\n\nSep 11\nR\nR Basics, Vectorization\n\n\nSep 13\nProblem Set 2 due\nDifficulty: easy\n\n\nSep 16, Sep 18\nR\ndplyr, dates and times, ggplot2\n\n\nSep 19\nProblem Set 3 due\nDifficulty: easy\n\n\nSep 23, Sep 25\nWrangling\nImporting data Locales Reshaping Data, Joining Tables, Extracting data from the web\n\n\nSep 26\nProblem Set 4 due\nDifficulty: medium\n\n\nSep 30, Oct 02\nData visualization\nDistributions, Dataviz Principles\n\n\nSep 26\nProblem Set 5 due\nDifficulty: medium\n\n\nOct 07, Oct 09\nProbability\nMonte Carlo, Random Variables & CLT\n\n\nOct 11\nProblem Set 6 due\nDifficulty: easy\n\n\nOct 14\nNo class\nIndigenous Peoples Day\n\n\nOct 16\nMidterm 1\nCover material from Sep 04-Oct 09\n\n\nOct 21, Oct 23\nInference\nParameters & Estimates, Confidence Intervals\n\n\nOct 28, Oct 30\nStatistical Models\nData-driven Models, Bayesian Statistics, Hierarchical Models\n\n\nNov 01\nProblem Set 7 due\nDifficulty: hard\n\n\nNov 04, Nov 06\nLinear models\nRegression, Multivariate Regression\n\n\nNov 11\nNo class\nVeterans’ Day\n\n\nNov 13\nLinear models\nMeasurement Error Models, Treatment Effect Models, Association Tests, Association Not Causation\n\n\nNov 01\nProblem Set 7 due\nDifficulty: hard\n\n\nNov 18, Nov 20\nHigh dimensional data\nMatrices in R, Applied Linear Algebra, Dimension Reduction\n\n\nNov 22\nProblem Set 8 
due\nDifficulty: easy\n\n\nNov 25\nMidterm 2\nCover material from Sep 04-Nov 20\n\n\nNov 27\nNo class\nThanksgiving Recess Begins\n\n\nDec 02, Dec 04\nMachine Learning\nNotation and terminology, [Evaluation Metrics\n\n\nDec 06\nProblem Set 9 due\nDifficulty: easy\n\n\nDec 09, Dec 11\nMachine Learning\nResampling methods, ML algorithms, ML in practice\n\n\nDec 13\nProblem Set 10 due\nDifficulty: hard\n\n\nDec 16, Dec 18\nOther topics\n\n\n\nDec 20\nFinal Project due" + "text": "Preliminary Schedule\n\n\n\nDates\nTopic\nLinks to readings and notes\n\n\n\n\nSep 04\nProductivity Tools\nInstalling R and RStudio on Windows or Mac, Getting Started Unix\n\n\nSep 09\nProductivity Tools\nRStudio Projects, Quarto Git and GitHub\n\n\nSep 10\nProblem Set 1 due\nDifficulty: easy\n\n\nSep 11\nR\nR Basics, Vectorization\n\n\nSep 13\nProblem Set 2 due\nDifficulty: easy\n\n\nSep 16, Sep 18\nR\ndplyr, dates and times, ggplot2\n\n\nSep 19\nProblem Set 3 due\nDifficulty: easy\n\n\nSep 23, Sep 25\nWrangling\nImporting data Locales Reshaping Data, Joining Tables, Extracting data from the web\n\n\nSep 26\nProblem Set 4 due\nDifficulty: medium\n\n\nSep 30, Oct 02\nData visualization\nDistributions, Dataviz Principles\n\n\nOct 04\nProblem Set 5 due\nDifficulty: medium\n\n\nOct 07, Oct 09\nProbability\nMonte Carlo, Random Variables & CLT\n\n\nOct 11\nProblem Set 6 due\nDifficulty: easy\n\n\nOct 14\nNo class\nIndigenous Peoples Day\n\n\nOct 16\nMidterm 1\nCovers material from Sep 04-Oct 11\n\n\nOct 21, Oct 23\nInference\nParameters & Estimates, Confidence Intervals\n\n\nOct 28, Oct 30\nStatistical Models\nData-driven Models, Bayesian Statistics, Hierarchical Models\n\n\nNov 01\nProblem Set 7 due\nDifficulty: hard\n\n\nNov 04, Nov 06\nLinear models\nRegression, Multivariate Regression\n\n\nNov 11\nNo class\nVeterans’ Day\n\n\nNov 13\nLinear models\nMeasurement Error Models, Treatment Effect Models, Association Tests, Association Not Causation\n\n\nNov 15\nProblem Set 8 due\nDifficulty: 
hard\n\n\nNov 18, Nov 20\nHigh dimensional data\nMatrices in R, Applied Linear Algebra, Dimension Reduction\n\n\nNov 22\nProblem Set 9 due\nDifficulty: easy\n\n\nNov 25\nMidterm 2\nCovers material from Sep 04-Nov 22\n\n\nNov 27\nNo class\nThanksgiving Recess Begins\n\n\nDec 02, Dec 04\nMachine Learning\nNotation and terminology, Evaluation Metrics, conditional probabilities, smoothing\n\n\nDec 09, Dec 11\nMachine Learning\nResampling methods, ML algorithms, ML in practice\n\n\nDec 13\nProblem Set 10 due\nDifficulty: hard\n\n\nDec 16, Dec 18\nOther topics\n\n\n\nDec 20\nFinal Project due" } ] \ No newline at end of file diff --git a/docs/sitemap.xml b/docs/sitemap.xml index a7c4e1a..a3c1b9a 100644 --- a/docs/sitemap.xml +++ b/docs/sitemap.xml @@ -2,14 +2,14 @@ http://datasciencelabs.github.io/2024/index.html - 2024-09-02T20:28:45.364Z + 2024-09-03T02:25:22.113Z http://datasciencelabs.github.io/2024/downloading-course-materials.html - 2024-09-02T20:28:45.367Z + 2024-09-03T02:25:22.116Z http://datasciencelabs.github.io/2024/syllabus.html - 2024-09-02T20:28:45.373Z + 2024-09-03T02:31:49.342Z diff --git a/docs/syllabus.html b/docs/syllabus.html index de0dbfd..c058918 100644 --- a/docs/syllabus.html +++ b/docs/syllabus.html @@ -171,10 +171,10 @@

    Syllabus

    Course Information

    @@ -194,19 +194,18 @@

    Course Description

    This course introduces the following:

    We also demonstrate how the following concepts are applied in data analysis:

    We do not cover the theory and details of these methods as they are covered in other courses.

    Throughout the course, we use motivating case studies and data analysis problem sets based on challenges similar to those you encounter in scientific research.

    @@ -282,6 +281,10 @@

    ChatGPT Policy

    Key Dates - Subject to Change After First Week of Class

    ++++ @@ -290,30 +293,6 @@

    -

    - - - - - - - - - - - - - - - - - - - - - - - @@ -323,41 +302,21 @@

    - + - - - - - - - - - - - - - + - - - - - - - - @@ -378,7 +337,7 @@

    Preliminary Schedule<

    - + @@ -426,7 +385,7 @@

    Preliminary Schedule<

    - + @@ -448,7 +407,7 @@

    Preliminary Schedule<

    - + @@ -481,8 +440,8 @@

    Preliminary Schedule<

    - - + + @@ -492,13 +451,13 @@

    Preliminary Schedule<

    - + - + @@ -508,29 +467,24 @@

    Preliminary Schedule<

    - + - - - - - - + - + - + diff --git a/index.qmd b/index.qmd index 2e92318..66e70db 100644 --- a/index.qmd +++ b/index.qmd @@ -20,5 +20,3 @@ -# - diff --git a/syllabus.qmd b/syllabus.qmd index df7f013..6cdc0ac 100644 --- a/syllabus.qmd +++ b/syllabus.qmd @@ -2,10 +2,10 @@ ## Course Information -- **BST 260 Introduction to Data Science** -- **Kresge 202A and 202B (HSPH)** -- **Monday 09:45 AM - 11:15 AM; Wednesday 09:45 AM - 11:15 AM** -- **Lecture notes: [https://datasciencelabs.github.io/2024/](https://datasciencelabs.github.io/2024/)** +- BST 260 Introduction to Data Science +- Kresge 202A and 202B (HSPH) +- Monday 09:45 AM - 11:15 AM; Wednesday 09:45 AM - 11:15 AM +- Lecture notes: [https://datasciencelabs.github.io/2024/](https://datasciencelabs.github.io/2024/) ## Prerequisites @@ -23,18 +23,18 @@ Students not matriculated in an HSPH Biostatistics graduate program (HDS SM60, B This course introduces the following: * UNIX/Linux shell. -* Reproducible document preparation with RStudio, knitr, and markdown. -* Version control with git and GitHub. -* R programming, -* Data wrangling with dplyr and data.table. -* Data visualization with ggplot2. +* Reproducible document preparation with RStudio, knitr, and markdown +* Version control with git and GitHub +* R programming +* Data wrangling with dplyr and data.table +* Data visualization with ggplot2 We also demonstrate how the following concepts are applied in data analysis: -* Monte Carlo simulations. -* Statistical modeling. -* High-dimensional data techniques, and -* Machine learning. +* Monte Carlo simulations +* Statistical modeling +* High-dimensional data techniques +* Machine learning We do not cover the theory and details of these methods as they are covered in other courses. @@ -103,30 +103,19 @@ You can use ChatGPT however you want. 
Do remember **you won't be able to use it | Date | Event | |------|-------| -| Sep 10 | Problem Set 1 due| -| Sep 13 | Problem Set 2 due| -| Sep 19 | Problem Set 3 due| -| Sep 26 | Problem Set 4 due| -| Sep 26 | Problem Set 5 due| -| Oct 11 | Problem Set 6 due| | Oct 14 | No class: Indigenous Peoples Day | | Oct 16 | Midterm 1| -| Oct 23 | Start final project, obtain approval for personal project.| -| Nov 01 | Problem Set 7 due | +| Oct 23 | Start final project. Obtain approval if you want to do personal project instead.| | Nov 11 | No class: Veterans' Day| -| Nov 01 | Problem Set 7 due | -| Nov 22 | Problem Set 8 due| | Nov 25 | Midterm 2| | Nov 27 | No class: Thanksgiving Recess Begins | -| Dec 06 | Problem Set 9 due| -| Dec 13 | Problem Set 10 due| | Dec 20 | Final Project due| ## Preliminary Schedule | Dates | Topic | Links to readings and notes | |:-------------------|:---------|:----------| -| Sep 04 | Productivity Tools | [Getting Started](http://rafalab.dfci.harvard.edu/dsbook-part-1/R/getting-started.html), [Installing R and RStudio](http://rafalab.dfci.harvard.edu/dsbook-part-1/R/installing-r-and-rstudio.html), [Unix](http://rafalab.dfci.harvard.edu/dsbook-part-1/productivity/unix.html) | +| Sep 04 | Productivity Tools | Installing R and RStudio on [Windows](https://teacherscollege.screenstepslive.com/a/1108074-install-r-and-rstudio-for-windows) or [Mac](https://teacherscollege.screenstepslive.com/a/1135059-install-r-and-rstudio-for-mac), [Getting Started](http://rafalab.dfci.harvard.edu/dsbook-part-1/R/getting-started.html) [Unix](http://rafalab.dfci.harvard.edu/dsbook-part-1/productivity/unix.html) | | Sep 09 | Productivity Tools | [RStudio Projects, Quarto](https://rafalab.dfci.harvard.edu/dsbook-part-1/productivity/reproducible-projects.html) [Git and GitHub](http://rafalab.dfci.harvard.edu/dsbook-part-1/productivity/git.html) | | Sep 10 | **Problem Set 1 due**| Difficulty: easy| | Sep 11 | R | [R 
Basics](http://rafalab.dfci.harvard.edu/dsbook-part-1/R/R-basics.html), [Vectorization](http://rafalab.dfci.harvard.edu/dsbook-part-1/R/programming-basics.html#sec-vectorization) | @@ -136,24 +125,23 @@ You can use ChatGPT however you want. Do remember **you won't be able to use it | Sep 23, Sep 25 | Wrangling | [Importing data](https://rafalab.dfci.harvard.edu/dsbook-part-1/R/importing-data.html) [Locales](https://rafalab.dfci.harvard.edu/dsbook-part-1/wrangling/locales.html) [Reshaping Data](http://rafalab.dfci.harvard.edu/dsbook-part-1/wrangling/reshaping-data.html), [Joining Tables](http://rafalab.dfci.harvard.edu/dsbook-part-1/wrangling/joining-tables.html), [Extracting data from the web](https://rafalab.dfci.harvard.edu/dsbook-part-1/wrangling/web-scraping.html)| | Sep 26 | **Problem Set 4 due**| Difficulty: medium| | Sep 30, Oct 02 | Data visualization | [Distributions](http://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/distributions.html), [Dataviz Principles](http://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/dataviz-principles.html) | -| Sep 26 | **Problem Set 5 due**| Difficulty: medium| +| Oct 04 | **Problem Set 5 due**| Difficulty: medium| | Oct 07, Oct 09 | Probability | [Monte Carlo](http://rafalab.dfci.harvard.edu/dsbook-part-2/prob/continuous-probability.html#monte-carlo), [Random Variables & CLT](http://rafalab.dfci.harvard.edu/dsbook-part-2/prob/random-variables-sampling-models-clt.html)| | Oct 11 | **Problem Set 6 due**| Difficulty: easy| | Oct 14 | No class | Indigenous Peoples Day | -| Oct 16 | **Midterm 1**| Cover material from Sep 04-Oct 09| +| Oct 16 | **Midterm 1**| Covers material from Sep 04-Oct 11| | Oct 21, Oct 23 | Inference | [Parameters & Estimates](http://rafalab.dfci.harvard.edu/dsbook-part-2/inference/parameters-estimates.html), [Confidence Intervals](http://rafalab.dfci.harvard.edu/dsbook-part-2/inference/confidence-intervals.html)| | Oct 28, Oct 30 | Statistical Models | [Data-driven 
Models](http://rafalab.dfci.harvard.edu/dsbook-part-2/inference/models.html), [Bayesian Statistics](http://rafalab.dfci.harvard.edu/dsbook-part-2/inference/bayes.html), [Hierarchical Models](http://rafalab.dfci.harvard.edu/dsbook-part-2/inference/hierarchical-models.html) | | Nov 01 | **Problem Set 7 due** | Difficulty: hard | | Nov 04, Nov 06 | Linear models | [Regression](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/regression.html), [Multivariate Regression](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/multivariate-regression.html)| | Nov 11 | No class| Veterans' Day| | Nov 13 | Linear models | [Measurement Error Models](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/measurement-error-models.html), [Treatment Effect Models](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/treatment-effect-models.html), [Association Tests](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/association-tests.html), [Association Not Causation](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/association-not-causation.html) | -| Nov 01 | **Problem Set 7 due** |Difficulty: hard| +| Nov 15 | **Problem Set 8 due** |Difficulty: hard| | Nov 18, Nov 20| High dimensional data | [Matrices in R](https://rafalab.dfci.harvard.edu/dsbook-part-2/highdim/matrices-in-R.html), [Applied Linear Algebra](https://rafalab.dfci.harvard.edu/dsbook-part-2/highdim/linear-algebra.html), [Dimension Reduction](https://rafalab.dfci.harvard.edu/dsbook-part-2/highdim/dimension-reduction.html) | -| Nov 22 | **Problem Set 8 due**| Difficulty: easy| -| Nov 25 | **Midterm 2**| Cover material from Sep 04-Nov 20 | +| Nov 22 | **Problem Set 9 due**| Difficulty: easy| +| Nov 25 | **Midterm 2**| Covers material from Sep 04-Nov 22 | | Nov 27 | No class |Thanksgiving Recess Begins | -| Dec 02, Dec 04 | Machine Learning | [Notation and terminology](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/notation-and-terminology.html), [Evaluation 
Metrics|(https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/evaluation-metrics.html), [conditional probabilities](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/conditionals.html), [smoothing](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/smoothing.html) -| Dec 06 | **Problem Set 9 due**| Difficulty: easy| +| Dec 02, Dec 04 | Machine Learning | [Notation and terminology](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/notation-and-terminology.html), [Evaluation Metrics](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/evaluation-metrics.html), [conditional probabilities](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/conditionals.html), [smoothing](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/smoothing.html) | | Dec 09, Dec 11 | Machine Learning | [Resampling methods](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/resampling-methods.html), [ML algorithms](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/algorithms.html), [ML in practice](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/ml-in-practice.html) | | Dec 13 | **Problem Set 10 due**| Difficulty: hard| | Dec 16, Dec 18 | Other topics | |
    Date Sep 10 Problem Set 1 due
    Sep 13 Problem Set 2 due
    Sep 19 Problem Set 3 due
    Sep 26 Problem Set 4 due
    Sep 26 Problem Set 5 due
    Oct 11 Problem Set 6 due
    Oct 14 No class: Indigenous Peoples Day
    Oct 23 Start final project, obtain approval for personal project. Start final project. Obtain approval if you want to do personal project instead.
    Nov 01 Problem Set 7 due
    Nov 11 No class: Veterans’ Day
    Nov 01 Problem Set 7 due
    Nov 22 Problem Set 8 due
    Nov 25 Midterm 2
    Nov 27 No class: Thanksgiving Recess Begins
    Dec 06 Problem Set 9 due
    Dec 13 Problem Set 10 due
    Dec 20 Final Project due
    Sep 04 Productivity Tools Getting Started, Installing R and RStudio, Unix Installing R and RStudio on Windows or Mac, Getting Started Unix
    Sep 09 Distributions, Dataviz Principles
    Sep 26 Oct 04 Problem Set 5 due Difficulty: medium
    Oct 16 Midterm 1 Cover material from Sep 04-Oct 09 Covers material from Sep 04-Oct 11
    Oct 21, Oct 23 Measurement Error Models, Treatment Effect Models, Association Tests, Association Not Causation
    Nov 01 Problem Set 7 due Nov 15 Problem Set 8 due Difficulty: hard
    Nov 22 Problem Set 8 due Problem Set 9 due Difficulty: easy
    Nov 25 Midterm 2 Cover material from Sep 04-Nov 20 Covers material from Sep 04-Nov 22
    Nov 27
    Dec 02, Dec 04 Machine Learning Notation and terminology, [Evaluation Metrics Notation and terminology, Evaluation Metrics, conditional probabilities, smoothing
    Dec 06 Problem Set 9 due Difficulty: easy
    Dec 09, Dec 11 Machine Learning Resampling methods, ML algorithms, ML in practice
    Dec 13 Problem Set 10 due Difficulty: hard
    Dec 16, Dec 18 Other topics
    Dec 20 Final Project due