Change from java-based EGA downloader to pyega3 regarding legacy data #583

Open
famosab opened this issue Jan 8, 2024 · 0 comments
Labels
new-feature Request is a new feature

famosab commented Jan 8, 2024

Manifest files generated in the DCC require a Java-based download client that is no longer maintained and is not even linked in the ICGC ARGO docs covering the legacy data from the ICGC 25k project.

Detailed Description

The DCC documentation states that one needs to use the EGA download client to access ICGC data stored with EGA. I found that this client is no longer really supported and that EGA now usually points to its Python-based pyega3 client. This is noted in the ARGO documentation; however, the auto-generated manifest files in the DCC still require the Java-based client. Only a few adaptations are necessary to switch to pyega3.

Possible Implementation

The following explanation could be added to the documentation. The manifest generation itself could be changed, but that might not be worth the time since the DCC will be retired soon.


You will need to:

  1. Install pyega3 v5.1.0 or higher.
  2. Adapt the auto-generated manifest file to look like the following file (most likely you will only need to update the file_ids and mapping variables by copying their values from the auto-generated manifest file):
#!/bin/bash
###############################################################################
# Manifest
###############################################################################

file_ids="EGAF00001074790"
mapping="{EGAF00001074790=FI41441}"

###############################################################################
# Checking
###############################################################################

# The script calls pyega3 directly, so check for that executable.
if ! command -v pyega3 &>/dev/null; then
   echo "pyega3 not found. Exiting..."
   exit 2
fi

###############################################################################
# Request
###############################################################################

echo "Requesting $mapping..."

for file_id in $file_ids
do
  echo "Requesting $file_id..."
  pyega3 -c 20 -ms 1073741824 -cf conf.pyega3.json fetch "$file_id"
done

echo "Finished!"
  3. Provide a second file called conf.pyega3.json, which looks like:
{
   "username":"<your email registered with EGA>",
   "password":"<your password registered with EGA>"
}
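
Since conf.pyega3.json holds a plain-text password, it may be worth creating it with restrictive permissions. The following is only a minimal sketch: the function name `write_pyega3_conf` and the environment variables `EGA_USERNAME` and `EGA_PASSWORD` are made up for this example and are not part of pyega3. It also assumes that neither value contains double quotes or backslashes, since no JSON escaping is done.

```shell
#!/bin/bash
# Sketch: write the credentials file so that only the current user can read it.
# write_pyega3_conf, EGA_USERNAME and EGA_PASSWORD are hypothetical names.
# Assumption: values contain no double quotes or backslashes (no JSON escaping).
write_pyega3_conf() {
  local username="$1" password="$2" target="${3:-conf.pyega3.json}"
  (
    umask 077   # newly created files get mode 600
    printf '{\n   "username":"%s",\n   "password":"%s"\n}\n' \
      "$username" "$password" > "$target"
  )
  chmod 600 "$target"   # also tighten a file that already existed
}

# Only write the file when both variables are set.
if [ -n "${EGA_USERNAME:-}" ] && [ -n "${EGA_PASSWORD:-}" ]; then
  write_pyega3_conf "$EGA_USERNAME" "$EGA_PASSWORD"
fi
```

The resulting file has the same two keys as the example above and can be passed to pyega3 with `-cf conf.pyega3.json`, as in the manifest script.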

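For step 2, it can help to double-check the copied mapping value before starting a large download. The sketch below prints each EGA file ID next to its ICGC file ID. The helper name `print_mapping` is hypothetical, and the sketch assumes that a mapping with several entries is comma-separated inside the braces (for example `{A=B, C=D}`); only the single-entry form is shown in the manifest above.

```shell
#!/bin/bash
# Sketch: list "EGA_ID ICGC_ID" pairs from a manifest mapping string.
# print_mapping is a hypothetical helper name.
# Assumption: multiple entries are comma-separated, e.g. "{A=B, C=D}".
print_mapping() {
  local raw="$1"
  raw="${raw#\{}"          # drop the leading brace
  raw="${raw%\}}"          # drop the trailing brace
  local IFS=','            # split entries on commas
  local entry
  for entry in $raw; do
    entry="${entry// /}"   # remove spaces around the entry
    [ -n "$entry" ] || continue
    # Text before the first "=" is the EGA ID, text after it is the ICGC ID.
    printf '%s %s\n' "${entry%%=*}" "${entry#*=}"
  done
}

mapping="{EGAF00001074790=FI41441}"
print_mapping "$mapping"
```

The first column should contain the same IDs as the file_ids variable.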
Note that this issue is mostly a duplicate of a post in the ICGC discuss forum. I was not sure how often people check the posts there.

@famosab famosab added the new-feature Request is a new feature label Jan 8, 2024