Large dataset error #53
Comments
This is because of a wrong condition I used in a previous version of pydivsufsort. I used to check … That being said, your loss still looks very large. Did you actually normalize your inputs? |
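For readers hitting the same assertion, here is a minimal normalization sketch (illustrative only, with placeholder data; the commenter's actual preprocessing is not shown in this thread):

import numpy as np
from sklearn.preprocessing import StandardScaler
from lassonet import LassoNetRegressorCV

X = np.random.randn(200, 50).astype(np.float32)   # stand-in for the real X
y = np.random.randn(200).astype(np.float32)       # stand-in for the real y

X = StandardScaler().fit_transform(X)   # zero mean, unit variance per feature
y = (y - y.mean()) / y.std()            # scaling y also keeps the initial MSE small

model = LassoNetRegressorCV()
model.fit(X, y)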
Of course I have normalized the inputs, and I use this code: from lassonet import LassoNetRegressorCV … However, my input's shape is (20000, 30000). |
The number of samples is irrelevant, as the MSE has … Did you test with the latest version? |
Yes, I have tried the latest version. At the beginning the loss is normal; when the new fit begins, the loss explodes: |
I think you are using an older version, because the … and use … |
Could you try manually setting lambda_start to some larger value, like 100? |
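A sketch of that suggestion, assuming lambda_start is a constructor argument of LassoNetRegressorCV (it is used that way later in this thread):

model = LassoNetRegressorCV(lambda_start=100, verbose=2)  # larger starting penalty, as suggested above
model.fit(X, y)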
The same error happened. I think it may be something related to the huge shape of the dataset: I have tested that when the shape is (2000, 3000), everything works normally. |
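A hypothetical way to check this size dependence with synthetic data (shapes taken from this thread; not an official test case):

import numpy as np
from lassonet import LassoNetRegressorCV

def try_shape(n_samples, n_features):
    # random, already-normalized data of the given shape
    rng = np.random.default_rng(0)
    X = rng.standard_normal((n_samples, n_features), dtype=np.float32)
    y = rng.standard_normal(n_samples, dtype=np.float32)
    LassoNetRegressorCV(verbose=2).fit(X, y)

try_shape(2000, 3000)     # reported above to work
try_shape(20000, 30000)   # reported above to fail (X alone needs ~2.4 GB of RAM)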
Can you post the logging output? |
Hey @louisabraham, what else was changed in 0.0.15? After 0.0.15, LassoNetRegressor keeps returning None for the regressor model's state_dict, even though with the exact same settings 0.0.14 returns the model fine. What were the updates between 0.0.14 and 0.0.15, in addition to the auto logging, that could have caused this? |
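A possible workaround sketch, inferred only from the tracebacks later in this thread (where fit() calls path() with return_state_dicts=False); whether this restores the old behaviour is an assumption:

model = LassoNetRegressorCV()
path = model.path(X, y, return_state_dicts=True)  # keyword visible in the 0.0.15+ traceback
# each item on the returned path should then carry its state_dict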
Loss is 15310032732160.0 |
I am having the same problem. My dataset shape is (74, 201376). I also tried the suggestions you gave above (installed the latest version, set lambda_start larger, verbose=2), but I still get the error.
model = LassoNetRegressorCV(lambda_start=500, verbose=2)
File D:\Anaconda\envs\lassonet8262\Lib\site-packages\lassonet\interfaces.py:935, in BaseLassoNetCV.fit(self, X, y)
File D:\Anaconda\envs\lassonet8262\Lib\site-packages\lassonet\interfaces.py:870, in BaseLassoNetCV.path(self, X, y, return_state_dicts)
File D:\Anaconda\envs\lassonet8262\Lib\site-packages\lassonet\interfaces.py:471, in BaseLassoNet.path(self, X, y, X_val, y_val, lambda_seq, lambda_max, return_state_dicts, callback, disable_lambda_warning)
File D:\Anaconda\envs\lassonet8262\Lib\site-packages\lassonet\interfaces.py:317, in BaseLassoNet.train(self, X_train, y_train, X_val, y_val, batch_size, epochs, lambda, optimizer, return_state_dict, patience)
File D:\Anaconda\envs\lassonet8262\Lib\site-packages\torch\optim\optimizer.py:484, in Optimizer.profile_hook_step..wrapper(*args, **kwargs)
File D:\Anaconda\envs\lassonet8262\Lib\site-packages\torch\optim\optimizer.py:89, in _use_grad_for_differentiable.._use_grad(self, *args, **kwargs)
File D:\Anaconda\envs\lassonet8262\Lib\site-packages\torch\optim\sgd.py:112, in SGD.step(self, closure)
File D:\Anaconda\envs\lassonet8262\Lib\site-packages\lassonet\interfaces.py:312, in BaseLassoNet._train..closure()
AssertionError: |
Are you able to share this dataset, or reproduce on a public dataset? Also, just a hunch, but is your data in float64? |
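A quick check for the float64 hunch (illustrative; LIBSVM-style loaders typically return float64):

import numpy as np
print(X.dtype)                        # float64 doubles memory use versus float32
X = np.asarray(X, dtype=np.float32)   # cast before fitting
y = np.asarray(y, dtype=np.float32)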
I am honored to receive your reply. This dataset is pyrim from the LIBSVM website; referring to some of the literature, I used a polynomial expansion (degree=5) to expand the data dimensions. I am studying feature selection on ultra-high-dimensional datasets, so I recently tried the LassoNet method to select features.
pyrim dataset's website: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
|
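An assumed reproduction of the (74, 201376) matrix described above (not the commenter's exact script): pyrim has 74 samples and 27 features, and a degree-5 polynomial expansion of 27 features yields C(27 + 5, 5) = 201376 columns, including the bias term.

import numpy as np
from sklearn.datasets import load_svmlight_file
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# assumes the pyrim file was downloaded from the LIBSVM page linked above
X, y = load_svmlight_file("pyrim")
X = PolynomialFeatures(degree=5).fit_transform(X.toarray())
print(X.shape)                                            # expected: (74, 201376)
X = StandardScaler().fit_transform(X).astype(np.float32)  # normalize the expanded features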
Thank you very much. Do you maybe have an MRE (minimal reproducible example)? How do you get 201376 features? |
Hi @louisabraham, I am facing the same problem. My dataset shape is (149, 9000). I also tried the suggestions you gave above (installed the latest version), but I am getting the same error, given below:
Loss is inf
|
Would you be able to share a (possibly mock) dataset where I can reproduce the error? |
Just give me code I can run with the data and I should be able to fix the issue. |
Sorry, I can't share the dataset or the code due to security reasons. Also, for a smaller dataset (149, 5370), after changing the lambda_start_ and path_multiplier values it is working, but it is taking very long to train the model, like 12+ hours and still running. Could you please tell me whether this model normally takes that much time, or whether it's abnormal in my case? And what values of hidden_dims, lambda_start_ and path_multiplier would you suggest as ideal to train a dataset of shape (149, 9238)? Thank you! |
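Purely illustrative settings for the runtime question (assumptions, not maintainer recommendations): the number of lambda values on the path grows roughly like log(lambda_max / lambda_start) / log(path_multiplier), so a larger path_multiplier and a smaller hidden_dims shorten training at the cost of a coarser path and a smaller network.

model = LassoNetRegressorCV(
    hidden_dims=(32,),     # small hidden layer
    lambda_start=100,      # start the penalty path higher
    path_multiplier=1.05,  # coarser path = fewer lambda values to fit
    verbose=2,
)
model.fit(X, y)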
My number of features is 30000, and I get this error:
Loss is 511581280.0
Did you normalize input?
Choosing lambda with cross-validation: 0%| | 0/5 [01:12<?, ?it/s]
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3553, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 3, in
path = model.fit( x, y)
File "/opt/conda/lib/python3.10/site-packages/lassonet/interfaces.py", line 744, in fit
self.path(X, y, return_state_dicts=False)
File "/opt/conda/lib/python3.10/site-packages/lassonet/interfaces.py", line 679, in path
path = super().path(
File "/opt/conda/lib/python3.10/site-packages/lassonet/interfaces.py", line 472, in path
last = self._train(
File "/opt/conda/lib/python3.10/site-packages/lassonet/interfaces.py", line 331, in _train
optimizer.step(closure)
File "/opt/conda/lib/python3.10/site-packages/torch/optim/optimizer.py", line 373, in wrapper
out = func(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/optim/optimizer.py", line 76, in _use_grad
ret = func(self, *args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/optim/sgd.py", line 66, in step
loss = closure()
File "/opt/conda/lib/python3.10/site-packages/lassonet/interfaces.py", line 326, in closure
assert False
AssertionError
However, when the number of features is 1000, this error does not occur.