remove jargon and address reviewer critiques #198

Merged
merged 26 commits into from
Mar 19, 2022

Conversation

jaybee84
Owner

@jaybee84 jaybee84 commented Dec 15, 2021

Purpose

closes #175
For review please refer to the comments linked in #199

Directions for reviewers

Which areas should receive a particularly close look?

  • Was I able to remove all jargon?
  • Are the concepts clear as defined?

Is there anything that you want to discuss further?

  • do we need to add anything else?

Is the pull request ready for review?

Yes

Manuscript checklist

Unless otherwise noted above, this PR will be considered ready for review when all four items have been checked.

@AppVeyorBot

AppVeyor build 1.0.504 for commit 99fc8eb is now complete.

Found 16 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:9:responder
content/02.intro.md:9:responder
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:8:collinearity
content/03.combining-datasets.md:8:collinearity
content/03.combining-datasets.md:9:TODO
content/04. box_1_experimental_design.md:28:subsampling
content/06.model-complexity.md:4:TODO
content/06.model-complexity.md:13:ac...
The rendered manuscript from this build is temporarily available for download at:

@jaybee84 jaybee84 marked this pull request as ready for review February 1, 2022 18:54
@AppVeyorBot

AppVeyor build 1.0.513 for commit 61b27fa is now complete.

Found 9 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/06.model-complexity.md:4:TODO
content/06.model-complexity.md:19:everytime
content/06.model-complexity.md:33:accomodate
content/07.prior-knowledge.md:19:HPO
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

@AppVeyorBot

AppVeyor build 1.0.514 for commit c8c9b46 is now complete.

Found 8 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/06.model-complexity.md:17:everytime
content/06.model-complexity.md:31:accomodate
content/07.prior-knowledge.md:19:HPO
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

@AppVeyorBot

AppVeyor build 1.0.515 for commit 3592321 is now complete.

Found 8 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/06.model-complexity.md:17:everytime
content/06.model-complexity.md:29:accomodate
content/07.prior-knowledge.md:19:HPO
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

Collaborator

@allaway allaway left a comment

Mostly made suggested changes. There are a couple of comments for your consideration too. Looking good, though!

content/06.model-complexity.md (outdated, resolved)
content/06.model-complexity.md (outdated, resolved)
content/06.model-complexity.md (outdated, resolved)
content/06.model-complexity.md (outdated, resolved)
content/06.model-complexity.md (outdated, resolved)
content/06.model-complexity.md (outdated, resolved)
content/06.model-complexity.md (outdated, resolved)
content/06.model-complexity.md (outdated, resolved)
content/06.model-complexity.md (resolved)
content/06.model-complexity.md (outdated, resolved)
jaybee84 and others added 2 commits February 3, 2022 10:28
@AppVeyorBot

AppVeyor build 1.0.518 for commit af1867b is now complete.

Found 7 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/06.model-complexity.md:29:accomodate
content/07.prior-knowledge.md:19:HPO
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

@AppVeyorBot

AppVeyor build 1.0.519 for commit af1867b is now complete.

Found 6 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/07.prior-knowledge.md:19:HPO
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

@AppVeyorBot

AppVeyor build 1.0.520 for commit 65e5b5b is now complete.

Found 6 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/07.prior-knowledge.md:19:HPO
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

Collaborator

@jaclyn-taroni jaclyn-taroni left a comment

Outside of the silver standard/label noise comments, I largely agree with @allaway's review. I am returning a few more suggestions I had beyond that.

content/06.model-complexity.md (outdated, resolved)
content/06.model-complexity.md (outdated, resolved)
content/06.model-complexity.md (outdated, resolved)
content/06.model-complexity.md (outdated, resolved)
@AppVeyorBot

AppVeyor build 1.0.526 for commit 526d2e8 failed.

@AppVeyorBot

AppVeyor build 1.0.527 for commit 250e422 failed.

@AppVeyorBot

AppVeyor build 1.0.528 for commit 3598611 failed.

@AppVeyorBot

AppVeyor build 1.0.534 for commit 1a07fe2 is now complete.

Found 6 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/06.model-complexity.md:29:recalibrated
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

@AppVeyorBot

AppVeyor build 1.0.535 for commit 9a345cd is now complete.

Found 6 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/06.model-complexity.md:29:recalibrated
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

@AppVeyorBot

AppVeyor build 1.0.536 for commit 0eed1fc is now complete.

Found 6 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/06.model-complexity.md:29:recalibrated
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

@AppVeyorBot

AppVeyor build 1.0.537 for commit af9b2da is now complete.

Found 6 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/06.model-complexity.md:29:recalibrated
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

@jaybee84
Owner Author

ok... I think this PR is now ready for review

@AppVeyorBot

AppVeyor build 1.0.538 for commit 2930e79 is now complete.

Found 6 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/06.model-complexity.md:28:recalibrated
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

Collaborator

@jaclyn-taroni jaclyn-taroni left a comment

Returning some thoughts about the regularization section

content/06.model-complexity.md (outdated, resolved)
The presence of label-noise and sparsity in the data can also lead to overfitting of models to the training data, meaning that the models show high prediction accuracy on the training data but low prediction accuracy (and large prediction errors) on new evaluation data.
Overfit models tend to rely on variables that are unique to the training data (for example, the calibration of the instrument that was used to generate the training data), and not generalizable to new data (e.g., data generated on the same instrument that has been recalibrated). [@doi:10.1073/pnas.1900654116]
In such cases, regularization can not only protect ML models from poor generalizability caused by overfitting, but also reduce model complexity by reducing the feature space available for training. (Figure[@fig:2]C)
Regularization is a collection of methods (generally regression methods) that can help reduce prediction errors by various approaches.
Collaborator

I don't know that I agree that "regularization is a collection of methods (generally regression methods)" vs. it's the approach/concept of adding a penalty to increase generalizability or some other desirable property and control overfitting. So if we take a VAE as an example, use of KL divergence in the loss function is often considered/talked about as regularization.
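
To make the VAE example concrete, here is a minimal sketch of a loss in which the KL divergence term acts as the penalty. It assumes a diagonal Gaussian approximate posterior and a standard normal prior, for which the KL term has a closed form. The function names (`gaussian_kl`, `vae_loss`), the squared-error reconstruction term, and the `beta` weight are illustrative assumptions, not anything taken from the manuscript.

```python
import math


def gaussian_kl(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, logvar))


def vae_loss(x, x_recon, mu, logvar, beta=1.0):
    """Reconstruction error plus a KL penalty weighted by beta (hypothetical knob)."""
    # Squared-error reconstruction term; other likelihoods are equally valid.
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    # The KL term pulls the latent distribution toward the prior,
    # which is the sense in which it is described as regularization.
    return recon + beta * gaussian_kl(mu, logvar)
```

The structure is the same as in penalized regression: a data-fit term plus a weighted penalty, with only the form of the penalty changing.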

Collaborator

I think this comment could be addressed by updating this sentence to be more general. I think using the regression examples is fine. But then maybe 1-2 sentences that talk about other scenarios at the end of the paragraph...

Owner Author

Fair point..

Owner Author

@jaybee84 jaybee84 Mar 16, 2022

Suggested change
- Regularization is a collection of methods (generally regression methods) that can help reduce prediction errors by various approaches.
+ Regularization is an approach by which a penalty is added to the model to avoid making large prediction errors.

Owner Author

@jaclyn-taroni do you have an example where KL divergence or other types of regularization have been used in the context of rare disease or rare datasets? I would like to add a sentence towards the end of this section to prompt readers to explore regularization methods other than L1, L2, and elastic net, but most of the rare disease/dataset papers I came across used a variant of the above-mentioned regressions.
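
For reference, the L1, L2, and elastic net penalties mentioned here can be sketched as a single penalized loss for a linear model. This is an illustrative sketch only: the function names and the `alpha` / `l1_ratio` parameterization are assumptions (one common convention among several), and nothing here is taken from the manuscript or the papers under discussion.

```python
def mse(weights, X, y):
    """Mean squared error of a linear model without an intercept."""
    errors = (
        sum(w * xi for w, xi in zip(weights, row)) - target
        for row, target in zip(X, y)
    )
    return sum(e * e for e in errors) / len(y)


def elastic_net_penalty(weights, alpha=1.0, l1_ratio=0.5):
    """Weighted mix of L1 and L2 penalties on the coefficients.

    l1_ratio=1.0 gives a pure L1 (lasso-style) penalty;
    l1_ratio=0.0 gives a pure L2 (ridge-style) penalty.
    """
    l1 = sum(abs(w) for w in weights)
    l2 = sum(w * w for w in weights)
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)


def penalized_loss(weights, X, y, alpha=1.0, l1_ratio=0.5):
    """Data-fit term plus the penalty; larger alpha favors smaller coefficients."""
    return mse(weights, X, y) + elastic_net_penalty(weights, alpha, l1_ratio)
```

With `alpha=0.0` this reduces to ordinary least squares. The L1 part tends to drive some coefficients to exactly zero, which is how these methods shrink the feature space.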

content/06.model-complexity.md (outdated, resolved)
@jaybee84 jaybee84 requested a review from jaclyn-taroni March 16, 2022 19:22
@AppVeyorBot

AppVeyor build 1.0.539 for commit 73d69b7 is now complete.

Found 6 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/06.model-complexity.md:28:recalibrated
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

@AppVeyorBot

AppVeyor build 1.0.540 for commit 2906dac is now complete.

Found 6 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/06.model-complexity.md:28:recalibrated
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

@AppVeyorBot

AppVeyor build 1.0.541 for commit e176d47 is now complete.

Found 6 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/06.model-complexity.md:28:recalibrated
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

@AppVeyorBot

AppVeyor build 1.0.542 for commit ef7dc62 is now complete.

Found 6 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/06.model-complexity.md:28:recalibrated
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

@AppVeyorBot

AppVeyor build 1.0.543 for commit 41879c0 is now complete.

Found 6 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/06.model-complexity.md:28:recalibrated
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

@AppVeyorBot

AppVeyor build 1.0.544 for commit 98cd4e2 is now complete.

Found 9 potential spelling error(s). Preview:content/02.intro.md:6:TODO
content/02.intro.md:16:TODO
content/03.combining-datasets.md:5:TODO
content/03.combining-datasets.md:9:TODO
content/06.model-complexity.md:28:recalibrated
content/06.model-complexity.md:37:Kullback
content/06.model-complexity.md:37:Leibler
content/06.model-complexity.md:37:luekemia
content/07.prior-knowledge.md:19:TODO...
The rendered manuscript from this build is temporarily available for download at:

Development

Successfully merging this pull request may close these issues.

Revisions: Manage model complexity while preserving value of ML
4 participants