
Support feature importance / variable selection #26

Open
JakeColtman opened this issue Oct 13, 2018 · 1 comment

Comments

@JakeColtman (Owner)

In many real-world use cases, it's important to be able to identify truly important features.

Implementing some of the approaches from https://repository.upenn.edu/cgi/viewcontent.cgi?article=1555&context=statistics_papers seems like a good starting point.

A side constraint is that the solution should scale to large datasets, which might pose a problem for the permutation approach. It could be useful to offer two modes: a fully principled one, and a rough-and-ready one for large datasets.
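As a rough sketch of what the two modes could look like, here is a minimal model-agnostic permutation-importance routine. All names (`permutation_importance`, `max_rows`, the `predict`/`score` callables) are illustrative, not an existing API; the `max_rows` subsampling stands in for the "rough and ready" mode.

```python
# Hypothetical sketch: permutation importance for any fitted model that
# exposes a `predict` callable. Not a reference implementation.
import numpy as np

def permutation_importance(predict, X, y, score, n_repeats=5,
                           max_rows=None, rng=None):
    """Mean drop in `score` when each column is shuffled.

    `score(y, preds)` should return a higher-is-better value (e.g. R^2 or
    negative MSE). `max_rows`, if set, scores on a random subsample so the
    cost stays manageable on large datasets (the rough-and-ready mode).
    """
    rng = np.random.default_rng(rng)
    if max_rows is not None and max_rows < len(X):
        idx = rng.choice(len(X), size=max_rows, replace=False)
        X, y = X[idx], y[idx]
    baseline = score(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle column j to break its association with y.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - score(y, predict(X_perm)))
        importances[j] = np.mean(drops)
    return importances
```

Because it only touches `predict`, the same routine would apply unchanged to RF implementations from other libraries, e.g. `permutation_importance(model.predict, X, y, score)` for a fitted scikit-learn forest.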

@JakeColtman (Owner, Author)

Given the claims in the paper, it would be interesting for the solution to be general enough that it could be applied to implementations of models like RF in other libraries.
