CI failures significantly increased with automl_timeseries_forecasting_with_pycaret.ipynb
#812
Comments
Re: Problem 1
It started working again, so this isn't exactly a blocker. Still, would it make sense to copy the relevant dataset from https://data.4tu.nl/ into cratedb-datasets and load it from there, when possible?
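To make the idea of loading from a mirror more concrete, here is a minimal sketch of a loader that checks a local cache first and then tries a list of sources in order. Everything named here is hypothetical: `load_dataset`, `http_get`, the cache path, and whatever URLs get passed in. It does not reflect how cratedb-datasets is actually organized.

```python
from pathlib import Path
from typing import Callable, Iterable
import urllib.request


def http_get(url: str, timeout: float = 30.0) -> bytes:
    """Fetch a URL with an explicit timeout, using only the standard library."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def load_dataset(
    urls: Iterable[str],
    cache_file: Path,
    fetch: Callable[[str], bytes] = http_get,
) -> bytes:
    """Return the dataset bytes: local cache first, then each URL in order."""
    if cache_file.exists():
        return cache_file.read_bytes()
    errors = []
    for url in urls:
        try:
            data = fetch(url)
        except OSError as exc:  # URLError and timeouts are OSError subclasses
            errors.append(f"{url}: {exc}")
            continue
        # Cache the first successful download so later runs skip the network.
        cache_file.parent.mkdir(parents=True, exist_ok=True)
        cache_file.write_bytes(data)
        return data
    raise RuntimeError("All sources failed: " + "; ".join(errors))
```

Passing the mirror first and the upstream source second would keep the notebook working when either one is slow or down, and the cache means CI would hit the network at most once per fresh runner.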
Problem 2
Just enumerating/sampling them, not investigating them at all.
Problem 3
Once in a while, we also receive these errors on CI validation runs:
> if not isinstance(obj, dask.dataframe.core.DataFrame):
E AttributeError: module 'dask.dataframe.core' has no attribute 'DataFrame'
/opt/hostedtoolcache/Python/3.11.11/x64/lib/python3.11/site-packages/sktime/datatypes/_adapter/dask_to_pd.py:137: AttributeError
The same log also contains the following, in this case also flagged as an ERROR:
ERROR traitlets:__init__.py:98 Notebook JSON is invalid: Additional properties are not allowed ('metadata' was unexpected)
Failed validating 'additionalProperties' in stream:
On instance['cells'][22]['outputs'][3]:
{'metadata': {'nbreg': {'diff_ignore': ['/outputs']}},
'name': 'stdout',
'output_type': 'stream',
'text': 'Fitting 3 folds for each of 10 candidates, totalling 30 fits\n'}
WARNING traitlets:client.py:1234 No handler found for comm target 'dash'
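The validation error above says that a `stream` output carries a `metadata` key the notebook schema does not allow. As an illustration of what the validator objects to, the following sketch removes that key from stream outputs in a notebook that has already been loaded as a plain dict. The helper name `strip_stream_metadata` is made up for this example. Since the `nbreg` entry looks intentional, dropping it may change how outputs are compared, so treat this as a diagnostic aid and not as the agreed fix.

```python
from typing import Any


def strip_stream_metadata(notebook: dict[str, Any]) -> int:
    """Remove `metadata` from stream outputs in place; return the number removed."""
    removed = 0
    for cell in notebook.get("cells", []):
        # Markdown cells have no outputs, hence the default.
        for output in cell.get("outputs", []):
            if output.get("output_type") == "stream" and "metadata" in output:
                del output["metadata"]
                removed += 1
    return removed
```

A notebook file could be passed through `json.load`, this function, and `json.dump` to see whether the ERROR line disappears from the CI log.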
test.py::test_notebook[automl_timeseries_forecasting_with_pycaret.ipynb]
automl_timeseries_forecasting_with_pycaret.ipynb
Thoughts 1
[...] I am elevating this tracking ticket to bug.
The best opportunity to improve the situation here might be the modernizations we need to do anyway:
automl_timeseries_forecasting_with_pycaret.ipynb
automl_timeseries_forecasting_with_pycaret.ipynb
Problem 4
/path/to/python3.11/site-packages/pycaret/utils/generic.py:585: UserWarning: Traceback (most recent call last):
File "/path/to/python3.11/site-packages/pycaret/utils/generic.py", line 580, in _calculate_metric
calculated_metric = score_func(y_test, target, sample_weight=weights, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[...]
File "/path/to/python3.11/site-packages/sklearn/utils/_array_api.py", line 521, in _asarray_with_order
array = numpy.asarray(array, order=order, dtype=dtype)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: could not convert string to float: 'No'
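The traceback above ends in a string label reaching a numeric conversion. A minimal sketch of encoding such labels before they get to a metric function follows. It assumes the column is binary with the values 'Yes' and 'No'; only 'No' appears in the log, so 'Yes' is a guess, and `encode_binary_labels` is a hypothetical helper, not part of PyCaret or scikit-learn.

```python
def encode_binary_labels(values, positive="Yes", negative="No"):
    """Map string labels to 1.0 / 0.0 and fail loudly on anything else."""
    mapping = {positive: 1.0, negative: 0.0}
    try:
        return [mapping[value] for value in values]
    except KeyError as exc:
        raise ValueError(f"Unexpected label: {exc.args[0]!r}") from exc


# Plain float() fails the same way the metric computation does.
try:
    float("No")
except ValueError as error:
    print(error)

print(encode_binary_labels(["No", "Yes", "No"]))
```

Whether the offending value sits in the target or in a feature column is not clear from the log, so the right place to apply such an encoding would still need to be determined.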
Thoughts 2
I can only hope that migrating to AutoGluon, paired with relevant spring cleaning, leads to a better micro-product, even if it is only scoped to this little notebook right now. I firmly believe anomaly detection and forecasting are an equally important topic for CrateDB in the ML domain, if not more so, compared with the LLM topics listed on the same documentation page.
About
We are observing elevated frequencies of test failures on this notebook, both on CI runs triggered by PRs and on nightly scheduled ones, so the notebook in its current form should be considered super flaky, effective immediately.
Problem 0
ValueError: Input contains NaN.
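For the NaN error, one option is to detect and fill gaps before the series reaches the model. The sketch below counts missing values and forward-fills them using only the standard library. Both helper names are hypothetical, and whether forward filling suits this dataset is an assumption, since the log does not show where the NaN values come from.

```python
import math
from typing import Optional, Sequence


def count_missing(values: Sequence[Optional[float]]) -> int:
    """Count entries that are None or NaN."""
    return sum(1 for value in values if value is None or math.isnan(value))


def forward_fill(values: Sequence[Optional[float]]) -> list[float]:
    """Replace None/NaN with the most recent valid value."""
    filled = []
    last = None
    for position, value in enumerate(values):
        if value is None or math.isnan(value):
            if last is None:
                # A gap at the very start has nothing to carry forward.
                raise ValueError(f"No earlier value to fill the gap at position {position}")
            value = last
        filled.append(float(value))
        last = value
    return filled
```

Logging `count_missing` in the notebook before fitting would also show whether the NaN values appear on every run or only when the upstream data changes.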
#298
Problem 1
-- https://github.com/crate/cratedb-examples/actions/runs/12819074099/job/35745960097?pr=811#step:6:1626
Evaluation
It looks like loading times for those resources are currently very high, so executing the notebook that acquires them runs into a timeout error. Sure enough, actually using them does not provide a good experience either.
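If the slow resource has to stay in the loop for now, retrying with a growing delay might reduce how often a single slow response fails the whole run. This is a minimal sketch; `retry` and its parameters are hypothetical, and the attempt count and delays are placeholders, not tuned values.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying with exponential backoff on the given exceptions."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error unchanged.
            # Delay doubles each time: base_delay, 2 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

`TimeoutError` is a subclass of `OSError`, so the default `retry_on` covers socket timeouts as well as connection errors. Retrying only masks the symptom, though; it does nothing about the poor experience when the resource is slow for real users.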
References
crate-2.0.0, which uses orjson for JSON marshalling #811
/cc @wierdvanderhaar, @simonprickett, @ckurze, @kneth