FIX roc_auc_curve: Return np.nan instead of 0.0 for single class #30103

Merged
5 commits merged into scikit-learn:main on Oct 29, 2024

Conversation

janezd
Contributor

@janezd janezd commented Oct 18, 2024

Reference Issues/PRs

Fixes #30079.

What does this implement/fix? Explain your changes.

As discussed in #30079, it may be more appropriate to return np.nan when all data comes from a single class.
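For reference, a minimal sketch of the behaviour proposed here (illustrative only, not part of the PR diff), assuming a scikit-learn build that includes this change:

    import math
    import warnings

    from sklearn.exceptions import UndefinedMetricWarning
    from sklearn.metrics import roc_auc_score

    y_true = [1, 1, 1]              # only one class present
    y_score = [0.3, 0.4, 0.5]

    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        auc = roc_auc_score(y_true, y_score)

    assert math.isnan(auc)          # previously this was a hard-coded 0.0
    assert any(issubclass(w.category, UndefinedMetricWarning) for w in caught)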


github-actions bot commented Oct 18, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 9ea9001.

Member

@adrinjalali adrinjalali left a comment


Otherwise LGTM.

 with pytest.warns(UndefinedMetricWarning):
-    assert metric(y1_row, y2_row) == pytest.approx(0.0)
+    assert np.isnan(metric(y1_row, y2_row))
Contributor


math.isnan is preferable when working with numbers
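For illustration (not part of the diff), the difference the suggestion is about, on a plain Python float:

    import math
    import numpy as np

    value = float("nan")
    assert math.isnan(value)   # works on plain Python numbers directly
    assert np.isnan(value)     # also works, but routes through NumPy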

@@ -1,3 +1,3 @@
-:func:`metrics.roc_auc_score` will now correctly return 0.0 and
+:func:`metrics.roc_auc_score` will now correctly return np.nan and
 warn user if only one class is present in the labels.
 By :user:`Gleb Levitski <glevv>`
Contributor


I'm almost sure you should add yourself to the changelog.

Contributor Author


My "contribution" is rather trivial, but OK, I've added myself.

@@ -388,7 +389,8 @@ def test_roc_curve_toydata():
         "ROC AUC score is not defined in that case."
     )
     with pytest.warns(UndefinedMetricWarning, match=expected_message):
-        roc_auc_score(y_true, y_score)
+        auc = roc_auc_score(y_true, y_score)
+        assert_almost_equal(auc, np.nan)
Contributor


Same as above: math.isnan is preferable.

Contributor


You missed one on line 375.

Contributor Author


Thanks. I amended the last commit.

Member

@adrinjalali adrinjalali left a comment


Otherwise LGTM.

@@ -370,7 +371,8 @@ def test_roc_curve_toydata():
         "ROC AUC score is not defined in that case."
     )
     with pytest.warns(UndefinedMetricWarning, match=expected_message):
-        roc_auc_score(y_true, y_score)
+        auc = roc_auc_score(y_true, y_score)
+        assert_almost_equal(auc, np.nan)
Member


Same here: math.isnan is preferable.

@adrinjalali adrinjalali added this to the 1.6 milestone Oct 23, 2024
 warn user if only one class is present in the labels.
-By :user:`Gleb Levitski <glevv>`
+By :user:`Gleb Levitski <glevv>` and :user:`Janez Demšar <janezd>`
Member


@lesteve this PR number is not gonna show up here. What do we do?

Member


I think we can duplicate the changelog entry into two files with different PR numbers in their name but the same contents.

Then both entries will be merged by towncrier when we aggregate the changelog of a given release.

Member

@lesteve lesteve Oct 23, 2024


Yeah, that's towncrier's way of doing it: creating two files with the same content that only differ by the PR number in the filename.

I originally thought this was a hidden towncrier feature, but it is mentioned in the tutorial:

$ towncrier create --content 'Can also be ``rst`` as well!' 3456.doc.rst
# You can associate multiple issue numbers with a news fragment by giving them the same contents.
$ towncrier create --content 'Can also be ``rst`` as well!' 7890.doc.rst

I discovered this by looking at the towncrier source code. Note that it only works if both fragments have the same type (fix in this case) IIRC.

There is twisted/towncrier#599 to try to make it a bit more convenient.

Member


@glemaitre you've enabled automerge before resolving this though.

Member


I duplicated the changelog.

Member

@ogrisel ogrisel left a comment


LGTM as well once the review comments have all been taken care of.

@@ -375,12 +375,11 @@ def _binary_roc_auc_score(y_true, y_score, sample_weight=None, max_fpr=None):
         warnings.warn(
             (
                 "Only one class is present in y_true. ROC AUC score "
-                "is not defined in that case. The score is set to "
-                "0.0."
+                "is not defined in that case."
Member


It would be helpful to store np.unique(y_true) in a local variable and then display the value of that class label in the warning message.
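A rough, self-contained sketch of what that suggestion could look like (illustrative only, not the code that was merged; the reply below explains why the displayed label can be misleading):

    import warnings

    import numpy as np
    from sklearn.exceptions import UndefinedMetricWarning

    y_true = np.asarray([0, 0, 0])   # toy single-class input
    classes = np.unique(y_true)      # store the unique labels once

    if classes.size != 2:
        warnings.warn(
            f"Only one class ({classes[0]!r}) is present in y_true. "
            "ROC AUC score is not defined in that case.",
            UndefinedMetricWarning,
        )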

Contributor Author


To my understanding, from exploring the code, y_true can already be re-encoded at this point, so the warning would be misleading (see #30079 (comment)).

I would prefer to keep this as it is, but if anybody wants to change it - go ahead. :)

Contributor


I think the idea here is different: we are not going to say whether it is the positive or the negative class, we will just show the value that we found, i.e.:

f"Only one class is present in y_true: {y_true[0]!r}. "
f"ROC AUC score is not defined in that case."

Or something like that.

Contributor Author

@janezd janezd Oct 23, 2024


Please let us stop this.

As I wrote, classes are re-encoded. I know next to nothing about sklearn's code, but if the caller gives y_true=[5, 2, 5] to roc_auc_score, the y_true that is passed to this internal function will be [1, 0, 1]. If the caller gives [1, 1, 1], this function gets [0, 0, 0]. Any y_true composed of equal values is re-encoded to a y_true composed of 0s.

If I change the function to

def _binary_roc_auc_score(y_true, y_score, sample_weight=None, max_fpr=None):
    if len(np.unique(y_true)) != 2:
        print("all values equal", y_true[0])
    return

I get the following

>>> import sklearn.metrics
>>> sklearn.metrics.roc_auc_score([0, 0, 0], [0.3, 0.4, 0.5])
all values equal 0
>>> sklearn.metrics.roc_auc_score([1, 1, 1], [0.3, 0.4, 0.5])
all values equal 0
>>> sklearn.metrics.roc_auc_score([5, 5, 5], [0.3, 0.4, 0.5])
all values equal 0

To my possibly incomplete understanding of the code, len(np.unique(y_true)) != 2 may be equivalent to not np.any(y_true).

This function cannot give a more descriptive warning, because it doesn't have sufficient information for it. A better warning could be given by the function that calls it, but it would require a big refactoring because a (curried) _binary_roc_auc_score is given as an argument to _average_binary_score.

Changing a single 0.0 into np.nan and adding two lines of tests has taken me several hours spread across five days, mostly because of these warnings that could, imho, stay as they were. Not a great experience that I would want to repeat. :)
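The re-encoding described above can be reproduced with label_binarize, which, as far as one can tell from the code, is what the binary path of roc_auc_score applies to y_true before it reaches _binary_roc_auc_score (this snippet is only an illustration of that behaviour):

    import numpy as np
    from sklearn.preprocessing import label_binarize

    for y in ([5, 2, 5], [1, 1, 1], [5, 5, 5]):
        encoded = label_binarize(y, classes=np.unique(y)).ravel()
        print(y, "->", encoded)

    # [5, 2, 5] -> [1 0 1]
    # [1, 1, 1] -> [0 0 0]
    # [5, 5, 5] -> [0 0 0]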


Member

@glemaitre glemaitre left a comment


LGTM.

@glemaitre glemaitre enabled auto-merge (squash) October 29, 2024 10:29
@glemaitre glemaitre disabled auto-merge October 29, 2024 10:57
@glemaitre glemaitre self-requested a review October 29, 2024 16:14
@glemaitre glemaitre enabled auto-merge (squash) October 29, 2024 16:26
@glemaitre glemaitre merged commit fff920e into scikit-learn:main Oct 29, 2024
28 checks passed
Successfully merging this pull request may close these issues.

roc_auc_score: incorrect result after merging #27412
6 participants