docs: 🏗️ move .puml into pseudocode #1021

Open
wants to merge 7 commits into
base: main
from

Conversation

lwjohnst86
Member

Description

This PR moves the PlantUML diagram over into pseudocode and also adds a basic Mermaid diagram of the input and output flow.

This PR needs an in-depth review.

Checklist

  • Updated documentation
  • Ran just run-all

@lwjohnst86 lwjohnst86 requested a review from a team as a code owner January 29, 2025 09:19

- Can it be at a minimal read without problems or warnings?
- Do the columns in the data file match those in the properties?
- Do the data types in the data file match those in the properties?
Member Author

These were the initial ones I was thinking about, but I guess as we use it in examples and real-world data, we could add more. Could even eventually move this function out into the checks package.
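A minimal sketch of what those three initial checks could look like, assuming a CSV data file and properties shaped like a Frictionless-style dictionary with `schema.fields`. The `_CASTERS` mapping, the error messages, and the function body are illustrative assumptions, not the implementation in this PR:

```python
import csv
from pathlib import Path

# Hypothetical casters for a few Frictionless-style field types.
_CASTERS = {"integer": int, "number": float, "string": str}


def check_data_basics(data_path: Path, resource_properties: dict) -> Path:
    """Raise ValueError if the CSV file disagrees with the properties."""
    fields = resource_properties["schema"]["fields"]
    expected = [field["name"] for field in fields]

    # Opening and iterating the file covers the "can it be read" check;
    # any read error propagates as an exception.
    with open(data_path, newline="", encoding="utf-8") as file:
        reader = csv.DictReader(file)

        # Do the columns in the data file match those in the properties?
        actual = list(reader.fieldnames or [])
        if actual != expected:
            raise ValueError(f"Columns {actual} do not match {expected}.")

        # Do the data types match? Try to cast every value to its declared type.
        for row_number, row in enumerate(reader, start=1):
            for field in fields:
                cast = _CASTERS.get(field["type"], str)
                try:
                    cast(row[field["name"]])
                except (TypeError, ValueError) as error:
                    raise ValueError(
                        f"Data row {row_number}, column '{field['name']}': "
                        f"value is not of type '{field['type']}'."
                    ) from error
    return data_path
```

Raising on the first problem keeps the sketch short; collecting all problems into a list before raising would be an equally reasonable design if the function later moves into a dedicated checks package.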

docs/design/interface/python-functions.qmd (outdated, resolved)
Contributor

@martonvago martonvago left a comment

Makes sense in general, just added some comments!

Comment on lines 5 to 11
Copy the file from `data_path` over into the resource location given by
`path`. This will compress the file and use a timestamped, unique file
name to store it as a backup. See the
[design](https://sprout.seedcase-project.org/docs/design/) docs for an
explanation of this file. Use `path_resource_raw()` to provide the
correct `path` location. Copies and compresses the file, and outputs the
path object of the created file.
Contributor

Maybe include that the data is checked against the metadata?
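For context, the copy-and-compress behaviour the docstring describes might look roughly like the sketch below. The function name `copy_compressed_to_raw`, the gzip format, and the timestamp-plus-random-suffix naming scheme are all assumptions made for illustration; the PR does not specify them. The metadata checks would run before this step.

```python
import gzip
import shutil
import uuid
from datetime import datetime, timezone
from pathlib import Path


def copy_compressed_to_raw(data_path: Path, raw_dir: Path) -> Path:
    """Gzip `data_path` into `raw_dir` under a timestamped, unique name."""
    raw_dir.mkdir(parents=True, exist_ok=True)

    # Assumed naming scheme: UTC timestamp plus a short random suffix,
    # so two backups made in the same second cannot collide.
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H%M%SZ")
    unique = uuid.uuid4().hex[:8]
    target = raw_dir / f"{timestamp}-{unique}{data_path.suffix}.gz"

    # Stream the bytes through gzip so large files are not loaded into memory.
    with open(data_path, "rb") as source, gzip.open(target, "wb") as compressed:
        shutil.copyfileobj(source, compressed)

    # Return the path object of the created file, as the docstring states.
    return target
```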

docs/design/interface/python-functions.qmd (outdated, resolved)
@@ -0,0 +1,116 @@
# ruff: noqa
def write_resource_data_to_raw(data_path, resource_properties) -> Path:
Member Author

I removed the path so we can use the path properties instead. One thing we need to consider is where this function will run. We either need to figure out a way to give an absolute path, or restrict this function to only running in a directory that has a datapackage.json (so it knows where the root is). Or, we have a function that seeks out the root of the package if this is run from a subfolder.

Contributor

That's a good point! Allowing it to run in the root folder or any subfolder of that seems okay to me.
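One way the "seek out the root" option could work is to walk upwards from the current folder until a `datapackage.json` is found. This is only a sketch: `find_package_root` is a hypothetical helper name, not an existing function in the project.

```python
from pathlib import Path
from typing import Optional


def find_package_root(start: Optional[Path] = None) -> Path:
    """Return the nearest folder at or above `start` with a datapackage.json."""
    start = (start or Path.cwd()).resolve()

    # Check the starting folder first, then each parent up to the filesystem root.
    for folder in (start, *start.parents):
        if (folder / "datapackage.json").is_file():
            return folder

    raise FileNotFoundError(
        f"No datapackage.json found in '{start}' or any of its parent folders."
    )
```

With a helper like this, the function could be run from the package root or from any subfolder, and relative resource paths could be resolved against the returned folder.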

@lwjohnst86 lwjohnst86 requested a review from martonvago January 31, 2025 08:37
check_is_supported_format(data_path)
check_data_basics(data_path, resource_properties)
check_data_constraints(data_path, resource_properties)
raw_dir = Path(resource_properties.path / "raw")
Contributor

I think path is resources/id/data.parquet, but this is a minor point.
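If `path` does point at the data file rather than the resource folder, the raw folder would be a sibling of that file. A small sketch, where `raw_dir_for` is a hypothetical helper and the `1` in the usage example stands in for an arbitrary resource id:

```python
from pathlib import PurePosixPath


def raw_dir_for(resource_path: str) -> PurePosixPath:
    """Return the raw folder that sits next to the resource's data file."""
    # Drop the file name (e.g. data.parquet), then append "raw".
    return PurePosixPath(resource_path).parent / "raw"


print(raw_dir_for("resources/1/data.parquet"))  # resources/1/raw
```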

Labels
None yet
Projects
Status: In Review
2 participants