
[QUESTION] XGBModel fit method in Darts raises "sample_weight_eval_set's length does not equal eval_set's length" error when using val_sample_weight #2579

Open
YutakaJX opened this issue Nov 3, 2024 · 8 comments · May be fixed by #2626

YutakaJX commented Nov 3, 2024

Describe the issue linked to the documentation

I'm encountering an issue with Darts' XGBModel when trying to fit it with val_sample_weight provided.

Specifically, when using val_sample_weight, I receive a ValueError indicating that "sample_weight_eval_set's length does not equal eval_set's length", even though the shapes and lengths of the training and validation datasets appear to match.

Based on the Darts documentation for XGBModel, the fit method accepts series, past_covariates, sample_weight, val_series, val_past_covariates, and val_sample_weight, among others. While sample_weight works fine, adding val_sample_weight consistently raises an error.

Here’s a minimal code snippet that reproduces the issue (X is a pd.DataFrame containing covariates, targets, and weights, so it is first split into X_cov, y, and w):

# Assumes `X` is a pd.DataFrame defined elsewhere.
from sklearn.model_selection import train_test_split
from darts import TimeSeries
from darts.models import XGBModel

y = X.filter(regex='^Targets_')
w = X.filter(regex='^Weights_')
X_cov = X.filter(regex='^Covariates_')
y_train, y_val = train_test_split(y, test_size=0.4, shuffle=False)
w_train, w_val = train_test_split(w, test_size=0.4, shuffle=False)
X_cov_train, X_cov_val = train_test_split(X_cov, test_size=0.4, shuffle=False)

y_train_ts = TimeSeries.from_times_and_values(y_train.index, y_train.values, columns=y_train.columns)
w_train_ts = TimeSeries.from_times_and_values(w_train.index, w_train.values, columns=w_train.columns)
X_cov_train_ts = TimeSeries.from_times_and_values(X_cov_train.index, X_cov_train.values, columns=X_cov_train.columns)
y_val_ts = TimeSeries.from_times_and_values(y_val.index, y_val.values, columns=y_val.columns)
w_val_ts = TimeSeries.from_times_and_values(w_val.index, w_val.values, columns=w_val.columns)
X_cov_val_ts = TimeSeries.from_times_and_values(X_cov_val.index, X_cov_val.values, columns=X_cov_val.columns)

xgb_model = XGBModel(
    lags=16,
    lags_past_covariates=16,
    output_chunk_length=1,
    objective="binary:logistic",
    booster="gbtree",
)

xgb_model.fit(
    y_train_ts,
    past_covariates=X_cov_train_ts,
    val_series=y_val_ts, 
    val_past_covariates=X_cov_val_ts,
    sample_weight=w_train_ts,
    val_sample_weight=w_val_ts,
)

The above code produces the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[10], line 15
     12 w_val_ts = TimeSeries.from_times_and_values(w_val.index, w_val.values, columns=w_val.columns)
     13 X_cov_val_ts = TimeSeries.from_times_and_values(X_cov_val.index, X_cov_val.values, columns=X_cov_val.columns)
---> 15 xgb_model.fit(
     16     y_train_ts,
     17     past_covariates=X_cov_train_ts,
     18     val_series=y_val_ts, 
     19     val_past_covariates=X_cov_val_ts,
     20     sample_weight=w_train_ts,
     21     val_sample_weight=w_val_ts,
     22 )

(... lines omitted for brevity ...)

File ~/miniforge3/envs/testdev/lib/python3.9/site-packages/xgboost/sklearn.py:548, in _wrap_evaluation_matrices(missing, X, y, group, qid, sample_weight, base_margin, feature_weights, eval_set, sample_weight_eval_set, base_margin_eval_set, eval_group, eval_qid, create_dmatrix, enable_categorical, feature_types)
    545     return meta
    547 if eval_set is not None:
--> 548     sample_weight_eval_set = validate_or_none(
    549         sample_weight_eval_set, "sample_weight_eval_set"
    550     )
    551     base_margin_eval_set = validate_or_none(
    552         base_margin_eval_set, "base_margin_eval_set"
    553     )
    554     eval_group = validate_or_none(eval_group, "eval_group")

File ~/miniforge3/envs/testdev/lib/python3.9/site-packages/xgboost/sklearn.py:541, in _wrap_evaluation_matrices.<locals>.validate_or_none(meta, name)
    539     return [None] * n_validation
    540 if len(meta) != n_validation:
--> 541     raise ValueError(
    542         f"{name}'s length does not equal `eval_set`'s length, "
    543         + f"expecting {n_validation}, got {len(meta)}"
    544     )
    545 return meta

ValueError: sample_weight_eval_set's length does not equal `eval_set`'s length, expecting 1, got 175310

Additional context
If I pass None for the validation datasets, fit runs without any problem.

xgb_model.fit(
    y_train_ts,
    past_covariates=X_cov_train_ts,
    val_series=None, 
    val_past_covariates=None,
    sample_weight=w_train_ts,
    val_sample_weight=None
)

If I set only val_sample_weight to None, fit also runs without any problem. So the error seems to occur only when val_sample_weight is passed as a TimeSeries.

xgb_model.fit(
    y_train_ts,
    past_covariates=X_cov_train_ts,
    val_series=y_val_ts, 
    val_past_covariates=X_cov_val_ts,
    sample_weight=w_train_ts,
    val_sample_weight=None,
)

In addition, the same issue does not occur when using LightGBMModel in a similar setup, suggesting a difference in how val_sample_weight is handled for XGBModel versus LightGBMModel.

Am I missing something in the code? Any help to resolve this issue would be greatly appreciated.
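
For convenience, here is a fully self-contained variant of the reproduction using synthetic data, so no external DataFrame is needed (the values are arbitrary random numbers and the default regression objective is used for simplicity; only the shapes and the fit call matter):

import numpy as np
from darts import TimeSeries
from darts.models import XGBModel

rng = np.random.default_rng(42)
n = 200
target = TimeSeries.from_values(rng.normal(size=n))      # univariate target
covs = TimeSeries.from_values(rng.normal(size=(n, 2)))   # two past covariates
weights = TimeSeries.from_values(np.ones(n))             # uniform sample weights

train, val = target[:120], target[120:]
cov_train, cov_val = covs[:120], covs[120:]
w_train, w_val = weights[:120], weights[120:]

model = XGBModel(lags=8, lags_past_covariates=8, output_chunk_length=1)
model.fit(
    train,
    past_covariates=cov_train,
    sample_weight=w_train,
    val_series=val,
    val_past_covariates=cov_val,
    val_sample_weight=w_val,  # passing this TimeSeries triggers the ValueError
)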

@YutakaJX YutakaJX added question Further information is requested triage Issue waiting for triaging labels Nov 3, 2024
yunakkano commented Nov 4, 2024

I think XGBModel should override _add_val_set_to_kwargs.

XGBModel currently uses the _add_val_set_to_kwargs inherited from RegressionModel. However, in this method, unlike the one overridden for LightGBMModel, val_weights is assigned val_weight directly in the else block, without wrapping it in a list:

def _add_val_set_to_kwargs(
        self,
        kwargs: Dict,
        val_series: Sequence[TimeSeries],
        val_past_covariates: Optional[Sequence[TimeSeries]],
        val_future_covariates: Optional[Sequence[TimeSeries]],
        val_sample_weight: Optional[Union[Sequence[TimeSeries], str]],
        max_samples_per_ts: int,
    ) -> dict:
        
.....(omit).....

        else:
            val_sets = [(val_samples, val_labels)]
            val_weights = val_weight    <---------------------------------- Here


        val_set_name, val_weight_name = self.val_set_params
        return dict(kwargs, **{val_set_name: val_sets, val_weight_name: val_weights})

This means that val_weights is passed to XGBoost as the bare per-sample weight array rather than as a list containing it, so XGBoost compares the array's length (175310 samples in the traceback above) against the number of eval sets (1) and raises the error.
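
For reference, here is a minimal standalone sketch of what xgboost's sklearn API expects (toy random data, independent of Darts): sample_weight_eval_set must be a list with one weight array per (X, y) pair in eval_set, not a bare array of per-sample weights.

import numpy as np
import xgboost as xgb

X_tr, y_tr = np.random.rand(100, 4), np.random.randint(0, 2, 100)
X_va, y_va = np.random.rand(20, 4), np.random.randint(0, 2, 20)
w_va = np.ones(20)

model = xgb.XGBClassifier()
model.fit(
    X_tr, y_tr,
    eval_set=[(X_va, y_va)],
    sample_weight_eval_set=[w_va],  # list of length 1 == len(eval_set): OK
    # sample_weight_eval_set=w_va,  # bare array: raises the ValueError above
)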

So, adding an _add_val_set_to_kwargs override like the one below to the XGBModel class might resolve this issue. (At least on my side, this worked fine, but I need someone to confirm the workaround does not affect other parts of the class.)

def _add_val_set_to_kwargs(
        self,
        kwargs: Dict,
        val_series: Sequence[TimeSeries],
        val_past_covariates: Optional[Sequence[TimeSeries]],
        val_future_covariates: Optional[Sequence[TimeSeries]],
        val_sample_weight: Optional[Union[Sequence[TimeSeries], str]],
        max_samples_per_ts: int,
    ) -> dict:
        """Creates a validation set and returns a new set of kwargs passed to `self.model.fit()` including the
        validation set. This method can be overridden if the model requires a different logic to add the eval set."""
        val_samples, val_labels, val_weight = self._create_lagged_data(
            series=val_series,
            past_covariates=val_past_covariates,
            future_covariates=val_future_covariates,
            max_samples_per_ts=max_samples_per_ts,
            sample_weight=val_sample_weight,
            last_static_covariates_shape=self._static_covariates_shape,
        )
        # create validation sets for MultiOutputRegressor
        if val_labels.ndim == 2 and isinstance(self.model, MultiOutputRegressor):
            val_sets, val_weights = [], []
            for i in range(val_labels.shape[1]):
                val_sets.append((val_samples, val_labels[:, i]))
                if val_weight is not None:
                    val_weights.append(val_weight[:, i])
            val_weights = val_weights or None
        else:
            val_sets = [(val_samples, val_labels)]
            val_weights = [val_weight]    <---------------------------------- Here


        val_set_name, val_weight_name = self.val_set_params
        return dict(kwargs, **{val_set_name: val_sets, val_weight_name: val_weights})

The above code is actually the same one used for LightGBMModel.

@dennisbader dennisbader added bug Something isn't working good first issue Good for newcomers and removed question Further information is requested triage Issue waiting for triaging labels Nov 7, 2024
@dennisbader dennisbader added this to darts Nov 7, 2024
@github-project-automation github-project-automation bot moved this to To do in darts Nov 7, 2024
@dennisbader
Thanks for reporting the issue @YutakaJX, and also for the investigation @yunakkano. It indeed looks like a bug, and I added it to our backlog. A PR would of course always be welcome :)

@KylinSchmidt
Hi, I am new to this project and find this issue interesting. Could I take it on?

@dennisbader
Hi @KylinSchmidt. Of course, we'd be very happy about the contribution! :) I can assign this issue to you.

This guide explains all the steps for contributing.

If you need any help, let me know.

@dennisbader dennisbader moved this from To do to In progress in darts Nov 24, 2024
@dennisbader
@KylinSchmidt, just wanted to check in on this issue. Are you still open to contributing?

@KylinSchmidt
@dennisbader Yes, I’m open to contributing! I’m still interested in this issue and plan to address it soon.

@yunakkano
@dennisbader @KylinSchmidt Hi, both. Apologies for not responding for a while. I'm happy to hear that @KylinSchmidt can contribute to this issue, as I am a bit too busy with my own projects.

After some more investigation, I think the fundamental solution to this issue is NOT to override the _add_val_set_to_kwargs method in XGBModel, but to correct the method in RegressionModel, which is the base class for all regression models.

In my opinion, the correction is simply to wrap the val_weight variable in a list in the _add_val_set_to_kwargs method of RegressionModel, as below:

def _add_val_set_to_kwargs(
        self,
        kwargs: Dict,
        val_series: Sequence[TimeSeries],
        val_past_covariates: Optional[Sequence[TimeSeries]],
        val_future_covariates: Optional[Sequence[TimeSeries]],
        val_sample_weight: Optional[Union[Sequence[TimeSeries], str]],
        max_samples_per_ts: int,
    ) -> dict:
... (omit) ...
        if val_labels.ndim == 2 and isinstance(self.model, MultiOutputRegressor):
            val_sets, val_weights = [], []
            for i in range(val_labels.shape[1]):
                val_sets.append((val_samples, val_labels[:, i]))
                if val_weight is not None:
                    val_weights.append(val_weight[:, i])
            val_weights = val_weights or None
        else:
            val_sets = [(val_samples, val_labels)]
            val_weights = [val_weight]    <---------------------------------- Here

This way, the _add_val_set_to_kwargs method in LightGBMModel could be removed.

@dennisbader
I'd appreciate your comments on this suggestion if you have any.

@KylinSchmidt
@dennisbader Hi, I just wanted to follow up on the pull request I submitted last week (#2626) and check if there’s anything further I need to do.
