Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-109408: Move the C file whitespace check from patchcheck to pre-commit #109890

Merged
merged 11 commits into from
Oct 10, 2023
15 changes: 15 additions & 0 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,21 @@ repos:
types: [python]
exclude: '^(Lib/test/tokenizedata/|Tools/c-analyzer/cpython/_parser).*$'

- repo: local
hooks:
- id: c-file-whitespace
name: "Check C file whitespace"
entry: "python Tools/patchcheck/untabify.py"
language: "system"
types_or: ['c', 'c++']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were running on .c and .h files before:

c_files = [fn for fn in file_paths if fn.endswith(('.c', '.h'))]

Both of those are matched by the c type, so this would be closer to parity:

Suggested change
types_or: ['c', 'c++']
types_or: [c]

We do have half a dozen .cpp files in the codebase, do we want to expand to include them? Should we also add c++ for trailing-whitespace above?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test passed with 'c++' included, I imagined that it was a previous oversight that they weren't included. I'll check if the trailing whitespace check also passes.

Copy link
Member Author

@AA-Turner AA-Turner Sep 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tools/msi/bundle/bootstrap/PythonBootstrapperApplication.cpp fails with 16 lines changed. Other than that all good (@zooba would you be alright with us enabling the trailing whitespace check here? No real views either way, if you'd prefer to keep the whitespace then that's the status quo anyway!)

A

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(We can always remove c++ later if needed.)

# Don't check the style of vendored libraries
exclude: |
(?x)^(
Modules/_decimal/.*
| Modules/libmpdec/.*
| Modules/expat/.*
)$

- repo: https://github.com/sphinx-contrib/sphinx-lint
rev: v0.6.8
hooks:
Expand Down
88 changes: 3 additions & 85 deletions Tools/patchcheck/patchcheck.py
Original file line number Diff line number Diff line change
@@ -1,29 +1,15 @@
#!/usr/bin/env python3
"""Check proposed changes for common issues."""
import re
import sys
import shutil
import os.path
import subprocess
import sysconfig

import untabify


def get_python_source_dir():
src_dir = sysconfig.get_config_var('abs_srcdir')
if not src_dir:
src_dir = sysconfig.get_config_var('srcdir')
return os.path.abspath(src_dir)


# Excluded directories which are copies of external libraries:
# don't check their coding style
EXCLUDE_DIRS = [
os.path.join('Modules', '_decimal', 'libmpdec'),
os.path.join('Modules', 'expat'),
os.path.join('Modules', 'zlib'),
]
SRCDIR = get_python_source_dir()


Expand Down Expand Up @@ -154,47 +140,8 @@ def changed_files(base_branch=None):
else:
sys.exit('need a git checkout to get modified files')

filenames2 = []
for filename in filenames:
# Normalize the path to be able to match using .startswith()
filename = os.path.normpath(filename)
if any(filename.startswith(path) for path in EXCLUDE_DIRS):
# Exclude the file
continue
filenames2.append(filename)

return filenames2


def report_modified_files(file_paths):
count = len(file_paths)
if count == 0:
return n_files_str(count)
else:
lines = [f"{n_files_str(count)}:"]
for path in file_paths:
lines.append(f" {path}")
return "\n".join(lines)


#: Python files that have tabs by design:
_PYTHON_FILES_WITH_TABS = frozenset({
'Tools/c-analyzer/cpython/_parser.py',
})


@status("Fixing C file whitespace", info=report_modified_files)
def normalize_c_whitespace(file_paths):
"""Report if any C files """
fixed = []
for path in file_paths:
abspath = os.path.join(SRCDIR, path)
with open(abspath, 'r') as f:
if '\t' not in f.read():
continue
untabify.process(abspath, 8, verbose=False)
fixed.append(path)
return fixed
# Normalize the path to be able to match using str.startswith()
return list(map(os.path.normpath, filenames))


@status("Docs modified", modal=True)
Expand Down Expand Up @@ -234,33 +181,12 @@ def regenerated_pyconfig_h_in(file_paths):
return "not needed"


def ci(pull_request):
if pull_request == 'false':
print('Not a pull request; skipping')
return
base_branch = get_base_branch()
file_paths = changed_files(base_branch)
c_files = [fn for fn in file_paths if fn.endswith(('.c', '.h'))]
fixed = []
fixed.extend(normalize_c_whitespace(c_files))
if not fixed:
print('No whitespace issues found')
else:
count = len(fixed)
print(f'Please fix the {n_files_str(count)} with whitespace issues')
print('(on Unix you can run `make patchcheck` to make the fixes)')
sys.exit(1)


def main():
base_branch = get_base_branch()
file_paths = changed_files(base_branch)
c_files = [fn for fn in file_paths if fn.endswith(('.c', '.h'))]
doc_files = [fn for fn in file_paths if fn.startswith('Doc') and
fn.endswith(('.rst', '.inc'))]
misc_files = {p for p in file_paths if p.startswith('Misc')}
# C rules enforcement.
normalize_c_whitespace(c_files)
# Docs updated.
docs_modified(doc_files)
# Misc/ACKS changed.
Expand All @@ -283,12 +209,4 @@ def main():


if __name__ == '__main__':
import argparse
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument('--ci',
help='Perform pass/fail checks')
args = parser.parse_args()
if args.ci:
ci(args.ci)
else:
main()
main()
10 changes: 5 additions & 5 deletions Tools/patchcheck/untabify.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,8 +21,7 @@ def main():
if optname == '-t':
tabsize = int(optvalue)

for filename in args:
process(filename, tabsize)
return max(process(filename, tabsize) for filename in args)


def process(filename, tabsize, verbose=True):
Expand All @@ -32,10 +31,10 @@ def process(filename, tabsize, verbose=True):
encoding = f.encoding
except IOError as msg:
print("%r: I/O error: %s" % (filename, msg))
return
return 2
newtext = text.expandtabs(tabsize)
if newtext == text:
return
return 0
backup = filename + "~"
try:
os.unlink(backup)
Expand All @@ -49,7 +48,8 @@ def process(filename, tabsize, verbose=True):
f.write(newtext)
if verbose:
print(filename)
return 1


if __name__ == '__main__':
main()
raise SystemExit(main())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

process() now has return values, which go to main(), but main() always returns None so we never get any errors here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't realise! pre-commit was failing in my local testing when I introduced tabs, perhaps it just fails when a file is changed after the hook has run.