BUG: Pandas resets counter when using filterwarning "once" #60664

Open
2 of 3 tasks
StrawberryOwl opened this issue Jan 6, 2025 · 4 comments
Labels
Bug Closing Candidate May be closeable, needs more eyeballs Upstream issue Issue related to pandas dependency

Comments

@StrawberryOwl

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import warnings
import pandas as pd

warnings.filterwarnings("once", category=UserWarning)

warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
pd.DataFrame()
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)

Issue Description

Using filterwarnings with action 'once' should print a warning of a given category and text only once. But calling pd.DataFrame() or other pandas functions (like pd.read_csv) causes both warnings to be shown a second time. Removing the pd.DataFrame() call yields the expected behaviour.

I read issue #31978, which was closed as a PyCharm issue, but I am using VS Code and verified my example in a terminal on both Windows and Ubuntu.

Expected Behavior

Both warnings ("This is a warning" and "This is a second warning") should be shown only once each.

Installed Versions

INSTALLED VERSIONS

commit : 0691c5c
python : 3.10.12
python-bits : 64
OS : Linux
OS-release : 5.15.153.1-microsoft-standard-WSL2
Version : #1 SMP Fri Mar 29 23:14:13 UTC 2024
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.2.3
numpy : 2.1.2
pytz : 2024.2
dateutil : 2.9.0.post0
pip : 24.3.1
Cython : None
sphinx : None
IPython : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : None
lxml.etree : None
matplotlib : 3.9.2
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : 17.0.0
pyreadstat : None
pytest : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2024.2
qtpy : None
pyqt5 : None

@StrawberryOwl StrawberryOwl added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Jan 6, 2025
@Tuckersteward

Tuckersteward commented Jan 7, 2025

I am unable to reproduce this. I am using VS Code as well, and I also checked in a terminal.

My outputs are identical and appear correct for both

import warnings

warnings.filterwarnings("once", category=UserWarning)

warnings.warn("This is a warning")
warnings.warn("This is a warning")
warnings.warn("This is a second warning")
warnings.warn("This is a second warning")

and

import warnings
import pandas as pd

warnings.filterwarnings("once", category=UserWarning)

pd.DataFrame()
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)

Do you have any other information? It could be tied to your version of Python, but I tested this on 3.10.11, 3.13.11 and 3.12.4.

@StrawberryOwl
Author

Did you run the code as you wrote it? Or did you run my example completely unchanged? If you just issue the warnings as you did, my output is the same as well. The problem appears if you raise the warnings before pd.DataFrame() AND after it.

Running my code WITH pd.DataFrame()

import warnings
import pandas as pd

warnings.filterwarnings("once", category=UserWarning)

warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
pd.DataFrame()
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)

results in

warningtest.py:6: UserWarning: This is a warning
  warnings.warn("This is a warning", UserWarning)
warningtest.py:8: UserWarning: This is a second warning
  warnings.warn("This is a second warning", UserWarning)
warningtest.py:11: UserWarning: This is a warning
  warnings.warn("This is a warning", UserWarning)
warningtest.py:13: UserWarning: This is a second warning
  warnings.warn("This is a second warning", UserWarning)

Running the example WITHOUT pd.DataFrame

import warnings
import pandas as pd

warnings.filterwarnings("once", category=UserWarning)

warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)

Results in:

warningtest.py:6: UserWarning: This is a warning
  warnings.warn("This is a warning", UserWarning)
warningtest.py:8: UserWarning: This is a second warning
  warnings.warn("This is a second warning", UserWarning)

The second output is how I expected filterwarnings to work. The docs say:

"once" print only the first occurrence of matching warnings, regardless of location

When I call a pandas function, the filter seems to be reset: both warnings are shown again, even though they were already issued earlier.

@Tuckersteward

Ah I see it now.

Upon further investigation, it appears the warnings registry is getting reset. pd.DataFrame() only does this when it creates an empty DataFrame; if we create it with data, we do not see the issue. While testing further, I found that merge has a similar issue, but there it always occurs.

The code below shows the issue with merge.

import warnings
import pandas as pd

warnings.filterwarnings("once", category=UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
pd.merge(pd.DataFrame({"A": [1, 2], "B": [3, 4]}), pd.DataFrame({"A": [5, 6], "C": [7, 8]}), on="A", how="inner")
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)
warnings.warn("This is a second warning", UserWarning)

returns

test3.py:6: UserWarning: This is a warning
  warnings.warn("This is a warning", UserWarning)
test3.py:8: UserWarning: This is a second warning
  warnings.warn("This is a second warning", UserWarning)
test3.py:11: UserWarning: This is a warning
  warnings.warn("This is a warning", UserWarning)
test3.py:13: UserWarning: This is a second warning
  warnings.warn("This is a second warning", UserWarning)

The code below compares warnings.filters before and after calling pd.DataFrame():

import warnings
import pandas as pd
import copy

warnings.filterwarnings("once", category=UserWarning)
before = copy.deepcopy(warnings.filters)
pd.DataFrame()
after = copy.deepcopy(warnings.filters)
print(before == after)

returns

True

Since the filters themselves are unchanged, the cause is most likely the warnings registry being reset rather than the filter list being modified. If anyone has any insight it would be greatly appreciated, as I've hit a bit of a wall.
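One way to look at the registry directly is to snapshot the calling module's `__warningregistry__`, which CPython fills in lazily when `warnings.warn` is called. The sketch below is only an illustration: `per_line_keys` and `probe` are hypothetical helper names, the `(text, category, lineno)` key layout is a CPython implementation detail, and `enter_catch_warnings` is an assumed stand-in for whatever the pandas call does internally. Note that a stale registry is only cleared on the next `warnings.warn` call, so the snapshot has to be taken after warning again.

```python
import warnings


def per_line_keys(text):
    """Return the (text, category, lineno) entries recorded for this module."""
    registry = globals().get("__warningregistry__", {})
    return {
        key for key in registry
        if isinstance(key, tuple) and len(key) == 3 and key[0] == text
    }


def probe(between, text):
    """Warn, call `between()`, warn again; return registry keys before/after."""
    # The outer catch_warnings only keeps the "once" filter from leaking out.
    with warnings.catch_warnings():
        warnings.filterwarnings("once", category=UserWarning)
        warnings.warn(text, UserWarning)
        before = per_line_keys(text)
        between()
        warnings.warn(text, UserWarning)
        after = per_line_keys(text)
    return before, after


def enter_catch_warnings():
    # Assumption: stands in for library code that suppresses warnings internally.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")


kept_before, kept_after = probe(lambda: None, "probe without reset")
reset_before, reset_after = probe(enter_catch_warnings, "probe with reset")
print("no reset:   earlier entry survives:", kept_before <= kept_after)
print("with reset: earlier entry survives:", reset_before <= reset_after)
```

Passing `pd.DataFrame` as `between` would run the same check against the call reported in this issue.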

@rhshadrach
Member

I believe this is python/cpython#73858
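If that upstream issue is the cause, the behaviour should be reproducible without pandas. The sketch below assumes, which this thread does not confirm, that the affected pandas code paths enter `warnings.catch_warnings()` internally and that doing so invalidates the "once" bookkeeping. `count_shown` and `enter_and_exit_catch_warnings` are hypothetical helper names, and the result may differ between Python implementations and versions.

```python
import warnings


def count_shown(between, text):
    """Warn twice, call `between()`, warn twice more.

    Returns how many warnings were actually shown before and after
    `between()` while a "once" filter for UserWarning is active.
    """
    shown = []

    def record(message, category, filename, lineno, file=None, line=None):
        # Same signature as warnings.showwarning; collect instead of printing.
        shown.append(str(message))

    original = warnings.showwarning
    # The outer catch_warnings only keeps the "once" filter from leaking out.
    with warnings.catch_warnings():
        warnings.filterwarnings("once", category=UserWarning)
        warnings.showwarning = record
        try:
            warnings.warn(text, UserWarning)
            warnings.warn(text, UserWarning)
            before = len(shown)
            between()
            warnings.warn(text, UserWarning)
            warnings.warn(text, UserWarning)
            after = len(shown) - before
        finally:
            warnings.showwarning = original
    return before, after


def enter_and_exit_catch_warnings():
    # Assumption: stands in for library code that suppresses warnings internally.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")


baseline = count_shown(lambda: None, "baseline warning")
with_reset = count_shown(enter_and_exit_catch_warnings, "warning around catch_warnings")
print("nothing in between:        ", baseline)
print("catch_warnings in between: ", with_reset)
```

If the second pair reports a warning shown after the `catch_warnings` block, the reset comes from the standard library rather than from anything pandas-specific.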

@rhshadrach rhshadrach added Upstream issue Issue related to pandas dependency Closing Candidate May be closeable, needs more eyeballs and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Jan 8, 2025