remove jargon and address reviewer critiques #198
Conversation
AppVeyor build 1.0.504 for commit 99fc8eb is now complete. Found 16 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:9:responder content/02.intro.md:9:responder content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:8:collinearity content/03.combining-datasets.md:8:collinearity content/03.combining-datasets.md:9:TODO content/04. box_1_experimental_design.md:28:subsampling content/06.model-complexity.md:4:TODO content/06.model-complexity.md:13:ac...
AppVeyor build 1.0.513 for commit 61b27fa is now complete. Found 9 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/06.model-complexity.md:4:TODO content/06.model-complexity.md:19:everytime content/06.model-complexity.md:33:accomodate content/07.prior-knowledge.md:19:HPO content/07.prior-knowledge.md:19:TODO...
AppVeyor build 1.0.514 for commit c8c9b46 is now complete. Found 8 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/06.model-complexity.md:17:everytime content/06.model-complexity.md:31:accomodate content/07.prior-knowledge.md:19:HPO content/07.prior-knowledge.md:19:TODO...
AppVeyor build 1.0.515 for commit 3592321 is now complete. Found 8 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/06.model-complexity.md:17:everytime content/06.model-complexity.md:29:accomodate content/07.prior-knowledge.md:19:HPO content/07.prior-knowledge.md:19:TODO...
Mostly made suggested changes. There are a couple of comments for your consideration too. Looking good, though!
Co-authored-by: Robert Allaway <[email protected]>
Co-authored-by: Robert Allaway <[email protected]>
AppVeyor build 1.0.518 for commit af1867b is now complete. Found 7 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/06.model-complexity.md:29:accomodate content/07.prior-knowledge.md:19:HPO content/07.prior-knowledge.md:19:TODO...
AppVeyor build 1.0.519 for commit af1867b is now complete. Found 6 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/07.prior-knowledge.md:19:HPO content/07.prior-knowledge.md:19:TODO...
AppVeyor build 1.0.520 for commit 65e5b5b is now complete. Found 6 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/07.prior-knowledge.md:19:HPO content/07.prior-knowledge.md:19:TODO...
Outside of the silver standard/label noise comments, I largely agree with @allaway's review. I am returning a few more suggestions I had beyond that.
Co-authored-by: Robert Allaway <[email protected]>
AppVeyor build 1.0.526 for commit 526d2e8 failed. |
Co-authored-by: Robert Allaway <[email protected]>
Co-authored-by: Jaclyn Taroni <[email protected]>
AppVeyor build 1.0.527 for commit 250e422 failed. |
Co-authored-by: Jaclyn Taroni <[email protected]>
AppVeyor build 1.0.528 for commit 3598611 failed. |
AppVeyor build 1.0.534 for commit 1a07fe2 is now complete. Found 6 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/06.model-complexity.md:29:recalibrated content/07.prior-knowledge.md:19:TODO...
AppVeyor build 1.0.535 for commit 9a345cd is now complete. Found 6 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/06.model-complexity.md:29:recalibrated content/07.prior-knowledge.md:19:TODO...
AppVeyor build 1.0.536 for commit 0eed1fc is now complete. Found 6 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/06.model-complexity.md:29:recalibrated content/07.prior-knowledge.md:19:TODO...
AppVeyor build 1.0.537 for commit af9b2da is now complete. Found 6 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/06.model-complexity.md:29:recalibrated content/07.prior-knowledge.md:19:TODO...
OK, I think this PR is now ready for review.
AppVeyor build 1.0.538 Found 6 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/06.model-complexity.md:28:recalibrated content/07.prior-knowledge.md:19:TODO...
Returning some thoughts about the regularization section
content/06.model-complexity.md
Outdated
The presence of label-noise and sparsity in the data can also lead to overfitting of models to the training data, meaning that the models show high prediction accuracy on the training data but low prediction accuracy (and large prediction errors) on new evaluation data.
Overfit models tend to rely on variables that are unique to the training data (for example, the calibration of the instrument that was used to generate the training data), and not generalizable to new data (e.g., data generated on the same instrument that has been recalibrated). [@doi:10.1073/pnas.1900654116]
In such cases, regularization can not only protect ML models from poor generalizability caused by overfitting, but also reduce model complexity by reducing the feature space available for training. (Figure[@fig:2]C)
Regularization is a collection of methods (generally regression methods) that can help reduce prediction errors by various approaches.
I don't know that I agree that "regularization is a collection of methods (generally regression methods)" vs. it's the approach/concept of adding a penalty to increase generalizability or some other desirable property and control overfitting. So if we take a VAE as an example, use of KL divergence in the loss function is often considered/talked about as regularization.
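To make the VAE example concrete, here is a minimal NumPy sketch of how a KL divergence term can act as a penalty added to a reconstruction loss. It assumes a diagonal Gaussian encoder, a standard normal prior, and a squared-error reconstruction term; the function names `kl_to_standard_normal` and `vae_loss` and the weight `beta` are hypothetical and chosen only for illustration, not taken from the manuscript or any particular library.

```python
import numpy as np


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ) per sample.

    mu and log_var have shape (n_samples, n_latent); the result has
    shape (n_samples,).
    """
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)


def vae_loss(x, x_recon, mu, log_var, beta=1.0):
    """Reconstruction error plus a beta-weighted KL penalty (illustrative).

    The KL term plays the role of the regularizer: it pulls the encoded
    distribution toward the prior instead of letting the model fit the
    training data arbitrarily closely.
    """
    recon = np.sum((x - x_recon) ** 2, axis=-1)  # squared error per sample
    penalty = kl_to_standard_normal(mu, log_var)
    return np.mean(recon + beta * penalty)
```

With `beta=0` the penalty vanishes and only the reconstruction error remains, which is one way to see the KL term as a regularizer rather than part of the fit itself.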
I think this comment could be addressed by updating this sentence to be more general. I think using the regression examples is fine. But then maybe 1-2 sentences that talk about other scenarios at the end of the paragraph...
Fair point..
Regularization is a collection of methods (generally regression methods) that can help reduce prediction errors by various approaches.
Regularization is an approach by which a penalty is added to the model to avoid making large prediction errors.
@jaclyn-taroni do you have an example where KL divergence or other types of regularization have been used in the context of rare diseases or rare datasets? I would like to add a sentence towards the end of this section to prompt readers to explore regularization methods other than L1, L2 and elastic net, but most of the rare disease/dataset papers I came across used a variant of the above-mentioned regressions.
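For reference, the three regression penalties named in this thread can be written as one penalized loss. The sketch below uses one common parameterization (mean squared error plus a weighted mix of L1 and L2 terms); the function name `penalized_loss` and the parameters `alpha` and `l1_ratio` are assumptions made for illustration, and specific libraries may scale the terms differently.

```python
import numpy as np


def penalized_loss(X, y, w, alpha=1.0, l1_ratio=0.5):
    """Mean squared error plus an elastic-net style penalty on the weights.

    l1_ratio=1.0 gives a pure L1 (lasso-style) penalty, l1_ratio=0.0 a pure
    L2 (ridge-style) penalty, and values in between mix the two.
    alpha controls the overall strength of the penalty.
    """
    residual = y - X @ w
    mse = np.mean(residual**2)
    l1 = np.sum(np.abs(w))       # encourages sparse weights
    l2 = 0.5 * np.sum(w**2)      # shrinks weights toward zero
    return mse + alpha * (l1_ratio * l1 + (1.0 - l1_ratio) * l2)


# Toy usage with made-up numbers (not data from the manuscript).
X = np.eye(2)
y = np.array([1.0, 2.0])
w = np.array([1.0, -2.0])
lasso_like = penalized_loss(X, y, w, alpha=1.0, l1_ratio=1.0)
ridge_like = penalized_loss(X, y, w, alpha=1.0, l1_ratio=0.0)
```

Setting `alpha=0` recovers the unpenalized loss, so the penalty term is the only thing distinguishing the three variants from ordinary least squares in this sketch.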
Co-authored-by: Jaclyn Taroni <[email protected]>
Co-authored-by: Jaclyn Taroni <[email protected]>
Co-authored-by: Jaclyn Taroni <[email protected]>
AppVeyor build 1.0.539 for commit 73d69b7 is now complete. Found 6 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/06.model-complexity.md:28:recalibrated content/07.prior-knowledge.md:19:TODO...
AppVeyor build 1.0.540 for commit 2906dac is now complete. Found 6 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/06.model-complexity.md:28:recalibrated content/07.prior-knowledge.md:19:TODO...
AppVeyor build 1.0.541 for commit e176d47 is now complete. Found 6 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/06.model-complexity.md:28:recalibrated content/07.prior-knowledge.md:19:TODO...
AppVeyor build 1.0.542 for commit ef7dc62 is now complete. Found 6 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/06.model-complexity.md:28:recalibrated content/07.prior-knowledge.md:19:TODO...
AppVeyor build 1.0.543 for commit 41879c0 is now complete. Found 6 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/06.model-complexity.md:28:recalibrated content/07.prior-knowledge.md:19:TODO...
AppVeyor build 1.0.544 for commit 98cd4e2 is now complete. Found 9 potential spelling error(s). Preview: content/02.intro.md:6:TODO content/02.intro.md:16:TODO content/03.combining-datasets.md:5:TODO content/03.combining-datasets.md:9:TODO content/06.model-complexity.md:28:recalibrated content/06.model-complexity.md:37:Kullback content/06.model-complexity.md:37:Leibler content/06.model-complexity.md:37:luekemia content/07.prior-knowledge.md:19:TODO...
[ci skip] This build is based on fea055c. This commit was created by the following CI build and job: https://github.com/jaybee84/ml-in-rd/commit/fea055c4da4e3df79f0efee5279d845c96929616/checks https://github.com/jaybee84/ml-in-rd/runs/2007041646
Purpose
closes #175
For review please refer to the comments linked in #199
Directions for reviewers
Which areas should receive a particularly close look?
Is there anything that you want to discuss further?
Is the pull request ready for review?
Yes
Manuscript checklist
Unless otherwise noted above, this PR will be considered ready for review when all four items have been checked.