What new functionality do you need?
When setting up and running GW on a new machine, one of the bigger tasks is transferring the "fix" data to that machine. "fix" is huge, and we do not need the whole data set at the beginning, so we want to create a utility tool that can fetch just the necessary subset, as suggested by Rahul in an email:

What would be nice to have as a utility in the workflow is the ability to fetch (and update) the data from S3 to a local machine. With such a utility:

- The user can query which datasets are available in the bucket.
- The user can fetch the entire dataset (all resolutions, the latest date-stamped set).
- The user can fetch the dataset for a subset of resolutions, e.g. C48mx500, for the latest date-stamped set.
- The script fetches only the updates if the destination already has a subset, so the same data is not transferred over and over (e.g. an --update option).

As Walter mentions, this is done just once by the g-w code managers on the HPC platforms, where the data is shared among many users. For the community running on their own HPCs or in containers, this process can be tedious. A proper solution for community support would be welcome.
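For the "query what is available" piece, a minimal sketch is shown below. It assumes the fix data lives under a public bucket laid out roughly as s3://<bucket>/fix/<component>/<YYYYMMDD>/...; the noaa-nws-global-pds bucket name and that layout are assumptions for illustration, not something this issue specifies. It lists the fix subsets and their date-stamped versions with unsigned boto3 requests:

```python
import boto3
from botocore import UNSIGNED
from botocore.config import Config

BUCKET = "noaa-nws-global-pds"   # assumed public bucket hosting the fix data
PREFIX = "fix/"                  # assumed top-level prefix

# Unsigned client so no AWS credentials are needed for a public bucket.
s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))

def list_prefixes(prefix: str) -> list[str]:
    """Return the 'subdirectories' (common prefixes) directly under a prefix."""
    paginator = s3.get_paginator("list_objects_v2")
    found = []
    for page in paginator.paginate(Bucket=BUCKET, Prefix=prefix, Delimiter="/"):
        for cp in page.get("CommonPrefixes", []):
            found.append(cp["Prefix"])
    return found

# Query which datasets are available and which date-stamped versions each one has.
for component in list_prefixes(PREFIX):       # e.g. "fix/orog/" (hypothetical)
    versions = list_prefixes(component)       # e.g. "fix/orog/20231027/" (hypothetical)
    print(component, [v.rstrip("/").rsplit("/", 1)[-1] for v in versions])
```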
What are the requirements for the new functionality?
In short, the utility should be able to:

- query which datasets (resolutions and date-stamped versions) are available in the bucket;
- fetch the entire dataset (all resolutions, the latest date-stamped set);
- fetch only a subset of resolutions, e.g. C48mx500, for the latest date-stamped set;
- fetch only the updates (e.g. via an --update option) when the destination already holds a subset, so the same data is not transferred repeatedly.
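For the fetch and --update requirements, one possible approach is to wrap `aws s3 sync`, which transfers only objects that are missing or changed at the destination. A hedged sketch follows; the bucket name, the example subset path, and the fetch_subset helper are assumptions for illustration:

```python
import subprocess

BUCKET = "noaa-nws-global-pds"   # assumed public bucket hosting the fix data

def fetch_subset(prefix: str, dest: str) -> None:
    """Mirror s3://BUCKET/<prefix> into <dest>; `aws s3 sync` copies only
    files that are missing or newer, which also covers the --update case."""
    subprocess.run(
        ["aws", "s3", "sync", f"s3://{BUCKET}/{prefix}", dest, "--no-sign-request"],
        check=True,
    )

# Fetch everything under an assumed date-stamped subset for one resolution;
# re-running the same call later transfers only what changed on the bucket.
fetch_subset("fix/orog/20231027/C48/", "./fix/orog/20231027/C48/")
```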
Acceptance Criteria
Suggest a solution (optional)
No response