Ss test mapie #41
base: dev
Changes are made in fs_algo_train_eval.py
…ng calculation. Convert rf Bagging ci as a separate function
…ing calculation. Convert mlp Bagging ci as a separate function
…trapping runs from the yaml file
…strapping runs from the yaml file
…prediction algorithms pipeline to remove ci
Hi Soroush,
These comments relate to some overall changes to design that won't be quick fixes. Given that, I'd suggest working on these revisions first in this same branch, then once pushed back into this PR I'll take a closer look in reviewing.
pkg/proc.attr.hydfab/.RData
Outdated
I think this snuck in before the .gitignore changes. Delete the file, then run git add ., git commit, and git push.
Done!
mean_pred = predictions.mean(axis=0)
std_pred = predictions.std(axis=0)

ci_factors = {90: 1.645, 95: 1.96, 99: 2.576}
Rather than manually specify these factors, let's use a function that can do it automatically, and for any confidence interval of interest.
https://stackoverflow.com/questions/55857722/how-to-calculate-a-confidence-interval-using-numpy-percentile-in-python
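A minimal sketch of that suggestion, assuming the intervals stay based on a normal approximation around `mean_pred` and `std_pred`. The function names `z_factor` and `normal_interval` are hypothetical, not part of the PR; the only library call is `statistics.NormalDist.inv_cdf` from the Python standard library.

```python
from statistics import NormalDist


def z_factor(confidence_pct):
    """Two-sided standard-normal multiplier for a confidence level in percent.

    Hypothetical helper: replaces the hard-coded ci_factors lookup so that
    any level (e.g. 80, 95, 99.5) can be requested.
    """
    if not 0 < confidence_pct < 100:
        raise ValueError("confidence_pct must be strictly between 0 and 100")
    # For a two-sided interval, the upper tail probability is (1 - c) / 2,
    # so the quantile of interest is 0.5 + c / 2.
    return NormalDist().inv_cdf(0.5 + confidence_pct / 200.0)


def normal_interval(mean, std, confidence_pct):
    """Return (lower, upper) bounds as mean +/- z * std."""
    half_width = z_factor(confidence_pct) * std
    return mean - half_width, mean + half_width
```

With NumPy arrays the same arithmetic applies element-wise, e.g. `mean_pred - z_factor(95) * std_pred`. If the normality assumption is a concern, taking empirical percentiles across the bootstrap predictions (for instance `numpy.percentile(predictions, [2.5, 97.5], axis=0)` for a 95% interval) is an alternative that needs no multiplier at all.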
std_pred = predictions.std(axis=0)

ci_factors = {90: 1.645, 95: 1.96, 99: 2.576}
confidence_intervals = {
Some design requirements commentary after reflecting on what supporting multiple ci_factor values will mean:
We'll want a way to communicate the confidence intervals that the user cares about. The user should specify the confidence interval in the algo configuration file. This could be a single value or multiple values that ultimately get passed into the algorithms via Retr_Params; perhaps we can add a section to this dict named uncertainty.
We'll want to keep track of the confidence intervals of interest for communicating results. This could mean tabular data and plots. This will be a little more challenging when accommodating multiple confidence intervals, but we could probably standardize how ci data look in a table, e.g. column names of ci_90, ci_95, etc. We'll also need to make sure that the different plots we generate have distinct filenames and titles specifying the confidence intervals.
In summary, for this PR (prior to worrying about tables/plots) we first need a way to handle confidence intervals via the config file and Retr_Params, and to decide how we track them in the data objects we generate.
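The requirements above could be sketched roughly as follows. Everything named here is an assumption for illustration: the `uncertainty` section is the one proposed in the comment, but the `confidence_levels` key, the default of 95, the `_lower`/`_upper` column suffixes, and the filename pattern are hypothetical choices, and the config is shown as an already-loaded dict rather than the project's actual YAML schema.

```python
def parse_confidence_levels(config):
    """Read confidence levels (percent) from a hypothetical 'uncertainty' section.

    Accepts a single number or a list of numbers and returns a sorted list of
    unique floats. Falls back to [95.0] when the section is absent (assumed default).
    """
    raw = config.get("uncertainty", {}).get("confidence_levels", [95])
    if isinstance(raw, (int, float)):
        raw = [raw]  # allow a single value in the config file
    levels = set()
    for value in raw:
        level = float(value)
        if not 0 < level < 100:
            raise ValueError(f"confidence level out of range: {value}")
        levels.add(level)
    return sorted(levels)


def ci_column_names(level):
    """Standardized table column names for one level, e.g. ci_90_lower / ci_90_upper."""
    tag = f"ci_{level:g}"
    return f"{tag}_lower", f"{tag}_upper"


def plot_filename(base, level):
    """Distinct plot filename per confidence level, e.g. rf_pred_ci_95.png."""
    return f"{base}_ci_{level:g}.png"
```

For example, a config of `{"uncertainty": {"confidence_levels": [90, 95]}}` would yield the levels `[90.0, 95.0]`, which can then be stored alongside the other entries in Retr_Params and reused when naming table columns and plot files.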
# --- Calculate prediction intervals using MAPIE ---
# mapie = MapieRegressor(rf, cv="prefit", agg_function="median")
# mapie.fit(self.X_train, self.y_train)
mapie = self.calculate_mapie(rf)
Let's try to generalize mapie further. Take it out of the individual algorithms and create a generalized function for prediction uncertainty that is called after the algorithm training. Here:
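One possible shape for such a generalized function is sketched below. It does not call MAPIE; it is a hand-rolled split-conformal calculation in plain Python, swapped in only to show an algorithm-agnostic interface that runs after training. The name `prediction_intervals`, the per-sample `predict_fn` callable, and the separate calibration set are all assumptions, not code from this PR.

```python
import math


def prediction_intervals(predict_fn, X_calib, y_calib, X_new, confidence_levels):
    """Model-agnostic prediction intervals from calibration residuals.

    predict_fn: hypothetical callable mapping one sample to one prediction,
        wrapping whichever trained model (rf, mlp, ...) is in use.
    confidence_levels: iterable of levels in percent, e.g. [90, 95].
    Returns (point_predictions, {level: [(lower, upper), ...]}).
    """
    # Absolute residuals on held-out calibration data, sorted ascending.
    residuals = sorted(abs(y - predict_fn(x)) for x, y in zip(X_calib, y_calib))
    n = len(residuals)
    preds = [predict_fn(x) for x in X_new]

    intervals = {}
    for level in confidence_levels:
        # Split-conformal rank: the ceil((n + 1) * level)-th smallest residual.
        k = math.ceil((n + 1) * level / 100)
        # With too few calibration points the interval is unbounded.
        half_width = residuals[k - 1] if k <= n else math.inf
        intervals[level] = [(p - half_width, p + half_width) for p in preds]
    return preds, intervals
```

Because the function only needs a prediction callable, the same call can follow the training of any algorithm, and the confidence levels can come straight from the user's configuration instead of being fixed inside each model's training method.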
…r-specified confidence level
Implementation of MAPIE for rf and mlp models.