feat: add helper functions to parse input files #2918
📝 Walkthrough
The pull request introduces two new utility functions, parse_input() and extract_checksum().
Sequence Diagram
sequenceDiagram
participant User as Snakemake Workflow
participant ParseInput as parse_input()
participant ExtractChecksum as extract_checksum()
participant InputFile as Input File
User->>ParseInput: Call with input file and parser
ParseInput->>InputFile: Open and read file
alt No custom parser
ParseInput-->>User: Return file content
else Custom parser provided
ParseInput->>ExtractChecksum: Apply parser function
ExtractChecksum->>InputFile: Read specific data
ExtractChecksum-->>ParseInput: Return parsed value
ParseInput-->>User: Return parsed result
end
The sequence diagram illustrates the workflow of the new parse_input() and extract_checksum() functions.
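The two branches of the diagram can be sketched as a small closure factory. This is a minimal, standalone sketch modeled on the code quoted in this review, not the merged Snakemake implementation; the `first_field` parser and the temporary file are hypothetical and exist only for illustration.

```python
import os
import tempfile


def parse_input(infile=None, parser=None, **kwargs):
    """Return a callable that is evaluated later with (wildcards, input, output)."""

    def inner(wildcards, input, output):
        with open(infile, "r") as fh:
            if parser is None:
                # No custom parser: return the stripped file content.
                return fh.read().strip()
            # Custom parser: pass the open file handle and any extra keyword arguments.
            return parser(fh, **kwargs)

    return inner


def first_field(fh, sep=" "):
    """Hypothetical parser: return the first field of the first line."""
    return fh.readline().split(sep)[0]


# Exercise both branches against a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "samples.md5")
    with open(path, "w") as fh:
        fh.write("abc123  ./a.txt\n")
    raw = parse_input(path)(None, None, None)
    parsed = parse_input(path, parser=first_field)(None, None, None)
```

The file is only opened when the returned callable runs, which is what lets the workflow defer reading until the input actually exists.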
Actionable comments posted: 1
🧹 Nitpick comments (1)
snakemake/ioutils/input.py (1)
1-9: Add error handling for file operations. The function should handle potential IOErrors when opening or reading the file and provide meaningful error messages.
 def parse_input(infile=None, parser=None, **kwargs):
     def inner(wildcards, input, output):
-        with open(infile, "r") as fh:
-            if parser is None:
-                return fh.read().strip()
-            else:
-                return parser(fh, **kwargs)
+        try:
+            with open(infile, "r") as fh:
+                if parser is None:
+                    return fh.read().strip()
+                else:
+                    return parser(fh, **kwargs)
+        except IOError as e:
+            raise WorkflowError(f"Error reading input file {infile}: {str(e)}")
     return inner
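To see what the suggested error wrapping does in practice, here is a hedged, standalone sketch. `WorkflowError` is a local stand-in so the example runs without Snakemake installed, and the missing path and the `read_or_message` helper are hypothetical. In Python 3, `IOError` is an alias of `OSError`, so a single handler also covers `FileNotFoundError` and `PermissionError`.

```python
class WorkflowError(Exception):
    """Stand-in for Snakemake's WorkflowError (assumption, for a standalone demo)."""


def parse_input(infile=None, parser=None, **kwargs):
    def inner(wildcards, input, output):
        try:
            with open(infile, "r") as fh:
                if parser is None:
                    return fh.read().strip()
                return parser(fh, **kwargs)
        except OSError as e:  # IOError is the same class in Python 3
            # Chain the original exception so the traceback keeps the root cause.
            raise WorkflowError(f"Error reading input file {infile}: {e}") from e

    return inner


def read_or_message(path):
    """Hypothetical helper: return the file content or the wrapped error text."""
    try:
        return parse_input(path)(None, None, None)
    except WorkflowError as e:
        return str(e)


message = read_or_message("no-such-directory/missing.txt")
```

Using `raise ... from e` is a small addition over the suggestion as posted; it preserves the underlying OS error for debugging.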
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
docs/snakefiles/rules.rst (2 hunks)
snakemake/ioutils/__init__.py (2 hunks)
snakemake/ioutils/input.py (1 hunk)
tests/test_ioutils/Snakefile (4 hunks)
tests/test_ioutils/expected-results/c/1.txt (1 hunk)
tests/test_ioutils/expected-results/results/switch~someswitch.column~sample.txt (1 hunk)
tests/test_ioutils/samples.md5 (1 hunk)
✅ Files skipped from review due to trivial changes (3)
- tests/test_ioutils/expected-results/c/1.txt
- tests/test_ioutils/expected-results/results/switch~someswitch.column~sample.txt
- tests/test_ioutils/samples.md5
🧰 Additional context used
📓 Path-based instructions (2)
snakemake/ioutils/__init__.py (1)
Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.
snakemake/ioutils/input.py (1)
Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.
🪛 Ruff (0.8.2)
snakemake/ioutils/input.py
15-15: Do not assign a lambda expression, use a def. Rewrite fix_file_name as a def. (E731)
🔇 Additional comments (3)
snakemake/ioutils/__init__.py (1)
8-8: LGTM! The registration of new functions follows the established pattern correctly.
Also applies to: 25-26
tests/test_ioutils/Snakefile (1)
11-13: LGTM! The test coverage is comprehensive and demonstrates proper usage of the new functionality.
Also applies to: 22-29
docs/snakefiles/rules.rst (1)
550-592: LGTM! The documentation is clear, well-structured, and includes helpful examples.
def extract_checksum(infile, **kwargs):
    import pandas as pd

    fix_file_name = lambda x: x.removeprefix("./")
    return (
        pd.read_csv(
            infile,
            sep=" ",
            header=None,
            engine="python",
            converters={1: fix_file_name},
        )
        .set_index(1)
        .loc[fix_file_name(kwargs.get("file"))]
        .item()
    )
🛠️ Refactor suggestion
Multiple improvements needed for extract_checksum function.
- Replace lambda with a proper function per static analysis
- Add docstring and type hints
- Add error handling for pandas operations
+def fix_file_name(x: str) -> str:
+    """Remove ./ prefix from file names."""
+    return x.removeprefix("./")
+
+def extract_checksum(infile: str, **kwargs) -> str:
+    """Extract checksum from a CSV file.
+
+    Args:
+        infile: Path to CSV file containing checksums
+        **kwargs: Additional arguments, must include 'file' key
+
+    Returns:
+        str: The extracted checksum
+
+    Raises:
+        WorkflowError: If file not found or checksum extraction fails
+    """
     import pandas as pd
-
-    fix_file_name = lambda x: x.removeprefix("./")
-    return (
-        pd.read_csv(
-            infile,
-            sep=" ",
-            header=None,
-            engine="python",
-            converters={1: fix_file_name},
-        )
-        .set_index(1)
-        .loc[fix_file_name(kwargs.get("file"))]
-        .item()
-    )
+    try:
+        df = pd.read_csv(
+            infile,
+            sep=" ",
+            header=None,
+            engine="python",
+            converters={1: fix_file_name}
+        )
+        return df.set_index(1).loc[fix_file_name(kwargs["file"])].item()
+    except FileNotFoundError:
+        raise WorkflowError(f"Checksum file not found: {infile}")
+    except KeyError:
+        raise WorkflowError(f"File {kwargs['file']} not found in checksum file")
+    except Exception as e:
+        raise WorkflowError(f"Error extracting checksum: {str(e)}")
Committable suggestion skipped: line range outside the PR's diff.
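One detail in the suggestion above deserves care: if the 'file' keyword is missing, the KeyError handler evaluates kwargs['file'] again and raises a second KeyError instead of a WorkflowError. The sketch below separates the two failure cases. It is a stdlib-only illustration rather than the PR's pandas-based implementation; `WorkflowError` is a local stand-in, and the input is assumed to be an open handle over md5sum-style lines ("checksum  filename").

```python
import io


class WorkflowError(Exception):
    """Stand-in for Snakemake's WorkflowError (assumption, for a standalone demo)."""


def fix_file_name(name):
    """Remove a leading ./ from a file name (str.removeprefix needs Python 3.9+)."""
    return name.removeprefix("./")


def extract_checksum(infile, **kwargs):
    """Return the checksum recorded for kwargs['file'] in md5sum-style lines."""
    if "file" not in kwargs:
        # Handle the missing argument separately from a missing entry.
        raise WorkflowError("extract_checksum requires a 'file' keyword argument")
    target = fix_file_name(kwargs["file"])
    for line in infile:
        # Split once so file names containing spaces stay intact.
        fields = line.strip().split(maxsplit=1)
        if len(fields) == 2 and fix_file_name(fields[1]) == target:
            return fields[0]
    raise WorkflowError(f"File {target} not found in checksum file")


checksums = io.StringIO("d41d8cd9  ./a.txt\n9e107d9d  b.txt\n")
found = extract_checksum(checksums, file="a.txt")
```

Because the function accepts any iterable of lines, it can be passed as the parser argument of a parse_input-style wrapper that hands over an open file handle.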
Add helper functions to parse input files. See example in snakemake/snakemake-wrappers#2725.
QC
- The documentation (docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).