Steps for features aggregation #988
Comments
As far as I can tell, this is something you could put into a recipe step, and there would be some need for it because there are domains that use these methods a lot. I don't think such methods would be appropriate in {recipes}. They have some {embed} flavor, but right now I feel (not strongly) they would fit best in a feature aggregation extension package. I would need to do more reading to make sure, but an essential part of a recipe step is being able to reapply the transformation that was applied to the training data set to other data sets, and I'm not sure these methods can be re-applied that way.
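To make the re-application requirement concrete, here is a minimal sketch in plain Python (not the {recipes} API). The function names `prep` and `bake` only mirror the recipes vocabulary; they, the module names, and the toy data are all assumptions made for illustration. The point is that everything learned from the training data (here, per-feature mean and standard deviation plus module membership) is stored once and then reused unchanged on new data.

```python
from statistics import mean, pstdev

def prep(train_rows, modules):
    """Learn per-feature mean and sd from the training rows only (hypothetical helper)."""
    params = {}
    for feats in modules.values():
        for f in feats:
            values = [row[f] for row in train_rows]
            sd = pstdev(values)
            # Guard against constant features to avoid dividing by zero.
            params[f] = (mean(values), sd if sd > 0 else 1.0)
    return {"modules": modules, "params": params}

def bake(state, rows):
    """Apply the stored transformation to any data set with the same columns."""
    out = []
    for row in rows:
        scores = {}
        for name, feats in state["modules"].items():
            z = [(row[f] - state["params"][f][0]) / state["params"][f][1]
                 for f in feats]
            scores[name] = mean(z)  # module score = mean of standardized features
        out.append(scores)
    return out

# Toy data: two training samples, one new sample, two a priori modules.
train = [{"g1": 1.0, "g2": 2.0, "g3": 10.0},
         {"g1": 3.0, "g2": 4.0, "g3": 30.0}]
new = [{"g1": 2.0, "g2": 3.0, "g3": 20.0}]
modules = {"module_a": ["g1", "g2"], "module_b": ["g3"]}

state = prep(train, modules)          # estimated once, on training data
train_scores = bake(state, train)     # same transformation on training data
new_scores = bake(state, new)         # ... and on unseen data
```

A score whose parameters can be frozen like this is re-applicable; a method that must see all samples at once to define the modules or the scores would need extra care.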
Two years later, I've finally created a package that incorporates these types of steps (though not only these; it's dedicated to omics data).
@abichat that looks very interesting! Thank you for cross-posting. I'll do my best to take a look at the package next week!
This is very exciting! I'm going to close this issue, as I think this is a good solution to the problem it raises. Feel free to open issues here for changes in {recipes} that would make your life easier. I'm going to add a couple of issues with some comments I have.
Thank you so much for your time and feedback!
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex https://reprex.tidyverse.org) and link to this issue.
Feature
In situations where there are a lot of correlated features, as in transcriptomics data, it is usual to create an aggregated score for each group of correlated variables (sometimes called modules, sets, metagenes...). These summary scores could be computed as a mean, a z-score, an eigenvalue... of the features in the module. There are some examples of scores here.
In addition, modules can be obtained in other ways (other steps), for example from a priori knowledge or from the WGCNA algorithm.
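As a rough illustration of what such a step would compute, the sketch below takes modules given a priori and derives one score per module and per sample, either as the plain mean of the module's features or as the mean of their z-scores. It is plain Python, not an existing {recipes} step; `module_scores`, its `method` argument, and the toy gene names are hypothetical.

```python
from statistics import mean, pstdev

def module_scores(rows, modules, method="mean"):
    """Return {module: [score per sample]} for a priori modules (hypothetical helper)."""
    # Gather each feature as a column across samples.
    cols = {f: [row[f] for row in rows]
            for feats in modules.values() for f in feats}
    if method == "zscore":
        standardized = {}
        for f, values in cols.items():
            m, sd = mean(values), pstdev(values)
            # Constant features carry no signal; map them to 0.
            standardized[f] = [(v - m) / sd if sd > 0 else 0.0 for v in values]
        cols = standardized
    elif method != "mean":
        raise ValueError(f"unknown method: {method}")
    # Average the (possibly standardized) features of each module, sample by sample.
    return {name: [mean([cols[f][i] for f in feats]) for i in range(len(rows))]
            for name, feats in modules.items()}

# Toy data: two samples, three features, two modules known in advance.
samples = [{"g1": 1.0, "g2": 3.0, "g3": 10.0},
           {"g1": 3.0, "g2": 5.0, "g3": 30.0}]
modules = {"m1": ["g1", "g2"], "m2": ["g3"]}

mean_scores = module_scores(samples, modules, method="mean")
z_scores = module_scores(samples, modules, method="zscore")
```

The z-score variant puts features on a common scale before averaging, which matters when features within a module have very different ranges.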
What do you think? These new steps could be included in recipes, embed, or a new package dedicated to feature aggregation.
Thank you for all your work!