27_basic_regression
B7M committed Jun 25, 2024
1 parent 5e48790 commit 5da6de5
Showing 11 changed files with 221 additions and 45 deletions.
108 changes: 98 additions & 10 deletions slides/summer_institute/27_basic_regression_pytorch.ipynb
Expand Up @@ -23,6 +23,17 @@
"Let's perform basic regression using PyTorch, alongside other useful libraries like pandas, statsmodels, seaborn, and matplotlib for data handling and visualization."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Code"
]
},
{
"cell_type": "code",
"execution_count": 17,
Expand All @@ -35,7 +46,7 @@
"id": "6nKZWgTDvTTK",
"outputId": "b31d9b45-3526-4922-f687-6ead41d98990",
"slideshow": {
"slide_type": "subslide"
"slide_type": "fragment"
},
"tags": []
},
Expand Down Expand Up @@ -189,6 +200,17 @@
"This code snippet shows the initial steps: importing the necessary libraries, loading the dataset, and displaying the first four rows. This helps us get a quick overview of the data we’ll be working with for regression analysis."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Plot the data"
]
},
{
"cell_type": "code",
"execution_count": 18,
Expand All @@ -201,7 +223,7 @@
"id": "LtzlgJCu4zbX",
"outputId": "8d58679e-d010-4145-c01e-272b05542f61",
"slideshow": {
"slide_type": "subslide"
"slide_type": "fragment"
},
"tags": []
},
Expand Down Expand Up @@ -234,6 +256,17 @@
"To begin our analysis, let's visualize the relationship between two variables in our dataset using a scatter plot. We'll use Seaborn, a powerful library for creating informative and attractive statistical graphics in Python. This code creates a scatter plot of the T2 variable on the x-axis and the PD variable on the y-axis. This visual representation helps us observe any potential correlation or pattern between these two variables. Scatter plots are a great way to explore the relationships in our data before diving into regression analysis."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Basic regression results"
]
},
{
"cell_type": "code",
"execution_count": 19,
Expand All @@ -246,7 +279,7 @@
"id": "ehiVfmYHJ4EL",
"outputId": "b671b7a2-a869-46ab-88da-aa1fade74ae1",
"slideshow": {
"slide_type": "subslide"
"slide_type": "fragment"
},
"tags": []
},
Expand Down Expand Up @@ -282,6 +315,17 @@
"The summary provides important details like the coefficients, standard errors, t-values, and p-values for the predictors. This information helps us understand the relationship between T2 and PD, and assess the significance of the predictor."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Visualization of the line"
]
},
{
"cell_type": "code",
"execution_count": 20,
Expand All @@ -294,7 +338,7 @@
"id": "F6gUNlf6LbJc",
"outputId": "dc8f04ff-b6a7-4460-abc9-bedcea4077b0",
"slideshow": {
"slide_type": "slide"
"slide_type": "fragment"
},
"tags": []
},
Expand Down Expand Up @@ -342,6 +386,17 @@
"Next, we’ll visualize the in-sample predictions from our linear regression model and compare them to the actual outcomes. This plot helps us assess how well our model's predictions align with the actual data. Ideally, the points should lie close to the reference line, indicating good predictive performance."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Create tensors "
]
},
{
"cell_type": "code",
"execution_count": 21,
Expand All @@ -354,7 +409,7 @@
"id": "9LZ4MgGEPxEN",
"outputId": "48c41eae-7b0b-4479-e613-7af18724ddce",
"slideshow": {
"slide_type": "slide"
"slide_type": "fragment"
},
"tags": []
},
Expand Down Expand Up @@ -406,6 +461,17 @@
"This preparation is crucial for ensuring the data is in the correct format for training a neural network in PyTorch."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Linear regression in PyTorch"
]
},
{
"cell_type": "code",
"execution_count": 22,
Expand All @@ -414,7 +480,7 @@
"colab_type": "code",
"id": "OZKrXwTjPdrB",
"slideshow": {
"slide_type": "subslide"
"slide_type": "fragment"
},
"tags": []
},
Expand Down Expand Up @@ -451,6 +517,17 @@
"This process illustrates how linear regression can be implemented and trained using PyTorch, leveraging its powerful automatic differentiation and optimization capabilities."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Visualization "
]
},
{
"cell_type": "code",
"execution_count": 23,
Expand All @@ -463,7 +540,7 @@
"id": "fd1wjXgukgqs",
"outputId": "c46bf0da-1478-4fff-8866-992019038021",
"slideshow": {
"slide_type": "subslide"
"slide_type": "fragment"
},
"tags": []
},
Expand Down Expand Up @@ -505,12 +582,23 @@
}
},
"source": [
- "We then compare the predictions from our PyTorch model with those from our statsmodels linear regression model. To do this, we generate predictions from the PyTorch model using model(xtraining).detach().numpy().reshape(-1) to convert the tensor to a NumPy array.\n",
- "We create a scatter plot comparing the PyTorch predictions (ytest) against the statsmodels predictions (yhat).\n",
+ "We then compare the predictions from our PyTorch model with those from our statsmodels linear regression model. To do this, we generate predictions from the PyTorch model and convert the resulting tensor to a NumPy array.\n",
+ "We create a scatter plot comparing the PyTorch predictions against the statsmodels predictions.\n",
"\n",
"This comparison helps us validate the consistency of predictions between the two models. Ideally, the points should lie close to the reference line, indicating that both models are producing similar predictions."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## PyTorch parameters for linear regression "
]
},
{
"cell_type": "code",
"execution_count": 24,
Expand All @@ -523,7 +611,7 @@
"id": "hZpPdBPYy53z",
"outputId": "f2b4a179-809a-41c8-c2e2-2c205d5175c2",
"slideshow": {
"slide_type": "subslide"
"slide_type": "fragment"
},
"tags": []
},
Expand Down
9 binary files not shown.
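
The code cells themselves are collapsed in this diff, so the workflow the markdown cells describe (fit a least-squares line, fit the same line with PyTorch, compare the two sets of in-sample predictions) is not visible here. A minimal sketch of that workflow follows. It is not the notebook's code and rests on several assumptions: the data are synthetic stand-ins for the T2 and PD columns, `np.polyfit` is swapped in for the statsmodels fit, and the optimizer, learning rate, and number of steps are illustrative choices rather than the notebook's settings.

```python
import numpy as np
import torch

# Synthetic stand-in data (assumption): the real T2 and PD columns
# are not shown in this diff.
rng = np.random.default_rng(0)
t2 = rng.normal(size=200).astype(np.float32)
pd_vals = (0.8 * t2 + 0.1 * rng.normal(size=200)).astype(np.float32)

# Closed-form least-squares line, standing in for the statsmodels fit.
slope_ols, intercept_ols = np.polyfit(t2, pd_vals, deg=1)
yhat_ols = slope_ols * t2 + intercept_ols

# torch.nn.Linear expects 2-D input, so reshape to column tensors (n, 1).
x = torch.from_numpy(t2).reshape(-1, 1)
y = torch.from_numpy(pd_vals).reshape(-1, 1)

# One input feature, one output: y = weight * x + bias.
torch.manual_seed(0)
model = torch.nn.Linear(1, 1)
loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # illustrative settings

# Full-batch gradient descent on the mean squared error.
for _ in range(500):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# Detach from the autograd graph and flatten to a 1-D NumPy array.
yhat_torch = model(x).detach().numpy().reshape(-1)

# Largest disagreement between the two sets of in-sample predictions.
max_gap = float(np.max(np.abs(yhat_torch - yhat_ols)))
slope_torch = model.weight.item()
intercept_torch = model.bias.item()
print(slope_ols, slope_torch, max_gap)
```

Because both fits minimize the same squared-error objective, the fitted slope and intercept should agree closely once gradient descent has converged, which is what a scatter plot of one set of predictions against the other checks visually.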