Installation of version 2.1 and errors in spvim() #6

Open
Tim-Re opened this issue May 8, 2022 · 9 comments

Comments

@Tim-Re

Tim-Re commented May 8, 2022

Hi, I'm currently trying to use spvim() from vimpy because it can accommodate arbitrary prediction functions, as opposed to sp_vim() in R, where as far as I can see only learners from the SL library can be used. When trying to install version 2.1, however, I encountered the following error:
ERROR: Could not find a version that satisfies the requirement scipy.stats (from vimpy) (from versions: none)
ERROR: No matching distribution found for scipy.stats

This seems to be due to the 'scipy.stats' in line 20 in the file "setup.py". Maybe it is there for a reason but after removing it the installation worked fine.

Additionally, when using spvim() I encountered a few errors. It could be that I'm using it incorrectly; however, a few potential errors in vimpy/vimpy/spvim.py (fixing them unfortunately did not entirely resolve the problems) are:

for method get_influence_function():

  • [line 109] in self.v.shape[0], the underscore after the v is missing => self.v_.shape[0]
    After including the underscore:
  • [line 109] in self.v_[self.v_.shape[0]], the index is out of bounds; maybe this should be self.v_[self.v_.shape[0]-1]?
  • [line 110] self.z_counts_ does not exist; it probably needs to be instantiated under init and defined during get_point_est()?
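
The out-of-bounds index in the second bullet can be reproduced with a small array. This is only a sketch: `v` is a hypothetical stand-in for self.v_, and whether the last element is what line 109 actually intends is an assumption.

```python
import numpy as np

# Hypothetical stand-in for self.v_; the real array is computed inside spvim.
v = np.array([0.1, 0.2, 0.3])
n = v.shape[0]  # 3 entries, so the valid indices are 0, 1, 2

try:
    v[n]  # one past the end of the array
    out_of_bounds = False
except IndexError:
    out_of_bounds = True

last = v[n - 1]  # last valid entry, equivalent to v[-1]
print(out_of_bounds, last == v[-1])
```

Using v[-1] avoids the explicit length arithmetic, if the last entry is indeed what is wanted.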

After incorporating these changes, get_influence_function() worked; however, the methods get_ses() and get_cis() seem to have further issues, e.g., problems with the indices in the shapley_se() function.

While I'm not sure about all of the above suggestions, they might still be of some use.

Kind regards.

@bdwilliamson
Owner

Thanks for using the package!

I'm unclear what you mean by "arbitrary prediction functions" -- the Super Learner does allow you to use a wide variety of candidate prediction functions (you can see all of them using SuperLearner::listWrappers()), and you can use single algorithms in vimp::sp_vim by specifying a single character value (e.g., SL.library = "SL.glm"). You can also specify tuning parameters, etc. using the Super Learner; see SuperLearner::createLearner for more details.

Thanks also for looking at the python package. I'm not sure why you're having issues with setup.py (which works fine on my machine; this is really for sending to PyPI, not for installing the package locally), but I'm glad that you found a workaround.

Can you provide a minimum working example so that I can see what's going on in spvim()? I fixed the indexing and z_counts_ issue (thanks!).

@Tim-Re
Author

Tim-Re commented May 11, 2022

Hi, thanks for the answer.

By arbitrary prediction functions, I really just meant that in Python spvim() only seems to require a learner with a fit and a predict method, while in R one is bound to the SuperLearner framework. As it turns out, however, I had underestimated the capabilities of SuperLearner.
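
As a rough illustration of what "a learner with a fit and predict method" can look like, here is a minimal sketch. MeanRegressor is a hypothetical class, not part of vimpy or scikit-learn, and it is an assumption that spvim() would accept such an object unchanged; it may rely on more of the scikit-learn estimator interface or on numpy inputs.

```python
class MeanRegressor:
    """Hypothetical learner: predicts the training mean of y for every row."""

    def fit(self, x, y):
        # Store the mean of the training outcomes.
        self.mean_ = sum(y) / len(y)
        return self  # returning self mirrors the scikit-learn convention

    def predict(self, x):
        # One prediction per input row, all equal to the stored mean.
        return [self.mean_ for _ in x]


model = MeanRegressor().fit([[0.0], [1.0], [2.0]], [1.0, 2.0, 3.0])
preds = model.predict([[5.0], [6.0]])
print(preds)
```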

Regarding the setup.py issue, it might be worth noting that the error did not just occur on my local computer but also on Google Colab. However, when installing version 2.0.2.2, which does not include 'scipy.stats' in the install_requires list, the installation worked fine. Pip checks the dependencies listed in install_requires and installs missing packages if necessary.
I would therefore imagine that pip tries to install scipy.stats as a separate package, which throws the same error as installing vimpy 2.1, even though scipy.stats is already available through scipy.
Right now, when no version is specified, pip automatically installs vimpy 2.0.2.2 instead of 2.1.

In colab:
!pip install vimpy==2.1 #error occurs
!pip install git+https://github.com/bdwilliamson/vimpy #error occurs
!pip install vimpy==2.0.2.2 #error does not occur
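
The distinction between an importable submodule (scipy.stats) and an installable distribution (scipy) can be sketched as follows. The dependency list and the mapping are hypothetical; the actual install_requires in vimpy's setup.py may contain different entries.

```python
# Hypothetical dependency list resembling the one described above.
install_requires = ["numpy", "scipy", "scipy.stats", "pandas"]

# Import paths that live inside a distribution and are not distributions themselves.
# This mapping is an illustrative assumption, not something pip provides.
SUBMODULE_TO_DISTRIBUTION = {"scipy.stats": "scipy"}


def clean_requirements(requirements):
    """Replace known submodule names with their distribution and drop duplicates."""
    cleaned = []
    for req in requirements:
        dist = SUBMODULE_TO_DISTRIBUTION.get(req, req)
        if dist not in cleaned:
            cleaned.append(dist)
    return cleaned


print(clean_requirements(install_requires))
```

Listing only the distribution name is enough here, since installing scipy also provides scipy.stats.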

A small example for the issue with spvim() is given below. Attached is also a screenshot of the error message triggered by vimpy_obj.get_ses().

import numpy as np
import vimpy
import pkg_resources
from sklearn.linear_model import LinearRegression
pckg = pkg_resources.get_distribution("vimpy")
print(pckg.version) #2.1

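def lm(n):
    # Simulate n observations from a linear model with three independent features.
    mean = np.zeros(3)
    cov = np.eye(3)
    X = np.random.default_rng().multivariate_normal(mean, cov, n)
    x1 = X[:,0]
    x2 = X[:,1]
    x3 = X[:,2]
    f = x1 + 2*x2 - x3
    # Add standard normal noise to the linear signal.
    y = f + np.random.normal(0,1,n)
    return y, X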

y,x = lm(1000)
model = LinearRegression()
vimpy_obj = vimpy.spvim(y = y, x = x, V = 5, pred_func = model, measure_type = "r_squared")

vimpy_obj.get_point_est()
vimpy_obj.get_influence_functions()
vimpy_obj.get_ses()
vimpy_obj.get_cis()

image

@bdwilliamson
Owner

Thanks for the MWE. I've just completed a patch that should solve your issue (on GitHub, not PyPI yet). Please let me know if you're still seeing problems!

@Tim-Re
Author

Tim-Re commented May 12, 2022

Thanks for the amendments. Unfortunately, using the MWE from above (or the example on the vimpy git page), there seems to be another error in get_ses(), at var_s = np.nanvar(shapley_ics['contrib_s'][idx, :]) in vimpy/spvim_ic.py, line 50.

image

@bdwilliamson
Owner

I'm not getting that error when I use the latest version of the package on GitHub. Can you try updating vimpy using python -m pip install git+https://github.com/bdwilliamson/vimpy.git@aef6b90dbaa77d9a9dce9a45b4786b37a294c36c and re-running the MWE?
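
After reinstalling, one way to check which version is actually present in the active environment is the standard-library importlib.metadata module (Python 3.8+). The helper name installed_version is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the installed vimpy version, or None if vimpy is not installed.
print(installed_version("vimpy"))
```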

@Tim-Re
Author

Tim-Re commented May 12, 2022

I've reinstalled the version at the most recent commit hash on both my local machine and Colab. The error unfortunately persists. What seems odd to me is that the .dtype attribute differs between contrib_s and contrib_v (see image below). I've also tried different versions of numpy, which did not help.

image

@bdwilliamson
Owner

Ok I've made the dtypes of the two the same (should both be float64). I've also bumped the version number to 2.1.1, so you should be able to confirm that this version is installed. Other than that, I'm not sure how to help, since I'm not seeing any errors when running the MWE on my machine (Python 3.8).
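
A dtype mismatch of this kind can be inspected and aligned as in the sketch below. The arrays are hypothetical stand-ins; in particular, the object dtype used for contrib_s is an assumption for illustration, since the thread does not state which dtype it actually had.

```python
import numpy as np

contrib_v = np.zeros((2, 4))                # float64 by default
contrib_s = np.zeros((2, 4), dtype=object)  # assumed mismatch, for illustration only

dtypes_before = (contrib_v.dtype, contrib_s.dtype)

# Cast explicitly so both arrays share the same floating-point dtype.
contrib_s = contrib_s.astype(np.float64)

# The variance call from the thread, applied to one row of the aligned array.
var_s = np.nanvar(contrib_s[0, :])
print(dtypes_before, contrib_s.dtype, var_s)
```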

@Tim-Re
Author

Tim-Re commented May 12, 2022

Thanks a lot for the efforts and the quick responses! After the dtype change it now works.

(As a small final and hopefully not annoying side note: in get_cis() the interval is assigned to self.ci_; however, under init it is self.cis_, so the confidence intervals are not returned in the end.)
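
The effect of this attribute-name mismatch can be shown with a toy class. IntervalHolder and its methods are hypothetical; only the attribute names ci_ and cis_ come from the description above, and the interval values are made up for illustration.

```python
class IntervalHolder:
    """Hypothetical class mimicking the ci_ versus cis_ naming mismatch."""

    def __init__(self):
        self.cis_ = []  # attribute that callers are expected to read

    def get_cis_buggy(self):
        # Writes to a new attribute by mistake, so cis_ stays empty.
        self.ci_ = [(0.1, 0.3)]
        return self

    def get_cis_fixed(self):
        # Writes to the attribute that was initialized in __init__.
        self.cis_ = [(0.1, 0.3)]
        return self


buggy = IntervalHolder().get_cis_buggy()
fixed = IntervalHolder().get_cis_fixed()
print(buggy.cis_, fixed.cis_)
```

Python raises no error in the buggy case because assigning to an unknown attribute simply creates it.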

@bdwilliamson
Owner

Thank you for your help finding these bugs! I really appreciate your patience. I just fixed that last bug as well; I'll try to get a release to PyPI soon.
