Update README.md
bedroesb authored Nov 8, 2024
1 parent 1cc9ad4 commit b3e71e4
Showing 1 changed file with 54 additions and 170 deletions.
224 changes: 54 additions & 170 deletions mars-cli/README.md
@@ -1,4 +1,8 @@
# Installing the mars-cli
# MARS-CLI

The MARS-CLI tool is a powerful interface for submitting metadata and associated files to various biological repository services like ENA, BioSamples, and MetaboLights. This command-line tool is useful for managing and validating metadata submissions in ISA-JSON format, as well as for automating aspects of repository submissions.

## Installation

This installation procedure describes a typical Linux installation. The application also works on Windows and macOS, but some of the steps may differ.

@@ -37,7 +41,7 @@ echo 'export MARS_SETTINGS_DIR=<path/to/parent_folder/containing/.mars>' >> $HOM

Once installed, the CLI application will be available from the terminal.

# Configuration
## Configuration

Installing this application will also generate a `settings.ini` file in `$HOME/.mars/`.

@@ -49,7 +53,34 @@ log_max_size = 1024
log_max_files = 5
```

## Logging
### Repository services

To configure MARS for submissions, modify the configuration file `settings.ini` located at `~/.mars/settings.ini`. Ensure the following content is set:

```ini
[webin]
development-url = https://wwwdev.ebi.ac.uk/ena/dev/submit/webin/auth
development-token-url = https://wwwdev.ebi.ac.uk/ena/dev/submit/webin/auth/token
production-url = https://www.ebi.ac.uk/ena/submit/webin/auth
production-token-url = https://www.ebi.ac.uk/ena/submit/webin/auth/token

[ena]
development-url = http://localhost:8042/isaena
development-submission-url = http://localhost:8042/isaena/submit
development-data-submission-url = webin2.ebi.ac.uk
production-url = https://www.ebi.ac.uk/ena/submit/webin-v2/
production-submission-url = https://www.ebi.ac.uk/ena/submit/drop-box/submit/?auth=ENA
production-data-submission-url = webin2.ebi.ac.uk

[biosamples]
development-url = http://localhost:8032/isabiosamples
development-submission-url = http://localhost:8032/isabiosamples/submit
production-url = https://www.ebi.ac.uk/biosamples/samples/
production-submission-url = https://www.ebi.ac.uk/biosamples/samples/
```
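As a minimal sketch of how these settings can be read, the snippet below parses an abbreviated copy of the configuration with Python's standard `configparser` and picks a URL by environment. The helper name `resolve_url` and its parameters are hypothetical and are not part of the MARS-CLI code base.

```python
import configparser

# Abbreviated copy of the settings shown above; the real file lives at
# ~/.mars/settings.ini.
SETTINGS = """
[ena]
development-url = http://localhost:8042/isaena
development-submission-url = http://localhost:8042/isaena/submit
production-url = https://www.ebi.ac.uk/ena/submit/webin-v2/
production-submission-url = https://www.ebi.ac.uk/ena/submit/drop-box/submit/?auth=ENA

[biosamples]
development-url = http://localhost:8032/isabiosamples
development-submission-url = http://localhost:8032/isabiosamples/submit
production-url = https://www.ebi.ac.uk/biosamples/samples/
production-submission-url = https://www.ebi.ac.uk/biosamples/samples/
"""


def resolve_url(config, service, purpose="url", development=False):
    """Hypothetical helper: pick the URL for a service and environment."""
    environment = "development" if development else "production"
    return config[service][f"{environment}-{purpose}"]


config = configparser.ConfigParser()
config.read_string(SETTINGS)

# Development submission endpoint for ENA, production health-check URL for BioSamples.
print(resolve_url(config, "ena", "submission-url", development=True))
print(resolve_url(config, "biosamples"))
```

Building the key from an environment prefix and a purpose suffix mirrors the naming pattern of the settings, so one lookup covers every section.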


### Logging

The MARS-CLI will automatically log events to a `.log` file.

@@ -67,7 +98,7 @@ __log_max_size__: The maximum size in kB for the log file. By default the maximu

__log_max_files__: The maximum number of old log files to keep. By default, this is set to 5
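To illustrate what these two settings mean in practice, here is a sketch using Python's standard `RotatingFileHandler`. Whether MARS-CLI uses this handler internally is an assumption; the function name `build_rotating_handler`, the logger name, and the file name `app.log` are made up for the example.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def build_rotating_handler(log_path, log_max_size=1024, log_max_files=5):
    """Create a rotating file handler from a size in kB and a file count."""
    return RotatingFileHandler(
        log_path,
        maxBytes=log_max_size * 1024,  # the setting is expressed in kB
        backupCount=log_max_files,     # number of old log files to keep
    )


# Demonstration in a temporary directory so no real log file is touched.
log_dir = tempfile.mkdtemp()
handler = build_rotating_handler(os.path.join(log_dir, "app.log"))

logger = logging.getLogger("mars_example")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.info("logging configured")

logger.removeHandler(handler)
handler.close()
```

With the defaults shown, the active log file is rolled over once it reaches 1024 kB, and at most five rotated files are kept next to it.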

## Target repository settings
### Target repository settings

Each target repository has its own set of settings:

@@ -76,7 +107,7 @@ Each of the target repositories have a set of settings:
- production-url: URL to the production server when performing a health-check
- production-submission-url: URL to the production server when performing a submission
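A quick way to catch an incomplete section before submitting is to check each repository for the expected keys. The sketch below assumes the four key names used in the `settings.ini` sections (`development-url`, `development-submission-url`, `production-url`, `production-submission-url`); the function `missing_settings` is hypothetical and not part of MARS-CLI.

```python
import configparser

# Key names taken from the settings.ini sections; treated here as required.
REQUIRED_KEYS = (
    "development-url",
    "development-submission-url",
    "production-url",
    "production-submission-url",
)


def missing_settings(config, repositories):
    """Return {repository: [missing keys]} for every incomplete section."""
    problems = {}
    for repo in repositories:
        section = config[repo] if config.has_section(repo) else {}
        missing = [key for key in REQUIRED_KEYS if key not in section]
        if missing:
            problems[repo] = missing
    return problems


config = configparser.ConfigParser()
config.read_string("""
[biosamples]
development-url = http://localhost:8032/isabiosamples
production-url = https://www.ebi.ac.uk/biosamples/samples/
""")

# The 'ena' section is absent on purpose, so all four keys are reported for it.
print(missing_settings(config, ["biosamples", "ena"]))
```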

# Using the MARS-CLI
## Usage

If you wish to use a different location for the `.mars` folder:

@@ -85,8 +116,6 @@ export MARS_SETTINGS_DIR=<path/to/parent_folder/containing/.mars>
mars-cli [options] <command> ARGUMENT
```

## Help

The help text for `mars-cli` can be displayed from the command line:

```sh
@@ -146,13 +175,13 @@ Options:
--help Show this message and exit.
```

## Development
### Development vs production

By default, MARS-CLI submits the ISA-JSON metadata to the repositories' production servers.
Passing the `--development` flag runs it in development mode, which substitutes the development servers for the
production servers.

## Health check repository services
### Health check repository services

You can check whether the supported repositories are healthy, prior to submission, by doing a health-check.

@@ -172,11 +201,11 @@ Service 'ena' healthy and ready to use!
Service 'biosamples' healthy and ready to use!
```
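The idea behind a health check can be sketched as a loop that probes each configured URL and reports the outcome. How MARS-CLI performs the check internally is not shown here, so the functions `http_ok` and `health_check`, the HTTP 200 criterion, and the failure message are all assumptions made for illustration.

```python
import urllib.error
import urllib.request


def http_ok(url, timeout=5):
    """Return True if a GET request to the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def health_check(services, probe=http_ok):
    """Print one status line per service and return the results as a dict."""
    results = {}
    for name, url in services.items():
        results[name] = probe(url)
        if results[name]:
            print(f"Service '{name}' healthy and ready to use!")
        else:
            # Hypothetical wording; the real tool may report failures differently.
            print(f"Service '{name}' did not respond as expected.")
    return results


# Offline demonstration: a stand-in probe replaces the real HTTP request.
demo = health_check(
    {
        "ena": "http://localhost:8042/isaena",
        "biosamples": "http://localhost:8032/isabiosamples",
    },
    probe=lambda url: url.endswith("isaena"),
)
```

Passing the probe as a parameter keeps the loop testable without network access; calling `health_check` without `probe` would issue real requests.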

## using the keychain
### Credential management

This CLI application comes with functionality to interact with your device's keychain backend.
This CLI application comes with functionality to interact with your device's keychain backend in order to fetch the necessary credentials.

### Store a password
#### Store a password

You can add a password to the keychain:

@@ -194,9 +223,6 @@ Options:
--help Show this message and exit.
```

## Submitting to repository services

The MARS-CLI tool is a powerful interface for submitting metadata and associated files to various biological repository services like ENA, BioSamples, and MetaboLights. This command-line tool is useful for managing and validating metadata submissions in ISA-JSON format, as well as for automating aspects of repository submissions. Below are command line options for submitting to each of these repository services.

### Options

@@ -282,170 +308,22 @@ the flag `--investigation-is-root` to `True` in order to validate the ISA-JSON.
mars-cli validate-isa-json my-isa-investigation.json
```

# Extending BioSamples' records
The Python script ``biosamples-externalReferences.py`` defines a class BiosamplesRecord for managing biosample records. This class is designed to interact with the BioSamples database, allowing operations like fetching, updating, and extending biosample records.
## Feature: Extending BioSamples' records
> **Status**: 🚧 To Be Developed
This part is designed to interact with the BioSamples database, allowing operations like fetching, updating, and extending biosample records.
The script takes in a dictionary of BioSamples' accessions and their associated external references, and expands the former with the latter.


To summarize, the steps of the code are:
1. Takes the BioSamples' submitter credentials and an input file containing a set of BioSamples accessions and their associated external references
1. Takes the BioSamples' submitter credentials and a set of BioSamples accessions and their associated external references
2. Validates the inputs
3. Downloads the JSON record of each BioSamples accession from BioSamples
4. Extends each BioSamples JSON record with the ``externalReferences`` from the input
5. Submits the extended JSON to BioSamples to replace the existing record
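The extension step (step 4 above) can be sketched as a pure function on the record dictionary. This is an illustration under assumptions, not the script's actual code: the function name `extend_external_references` is hypothetical, and skipping URLs that are already present and defaulting `duo` to an empty list are design choices made for this example.

```python
import copy


def extend_external_references(record, new_references):
    """Return a copy of a BioSamples record with extra externalReferences.

    URLs already present in the record are not added a second time.
    """
    extended = copy.deepcopy(record)
    references = extended.setdefault("externalReferences", [])
    known_urls = {ref["url"] for ref in references}
    for ref in new_references:
        if ref["url"] not in known_urls:
            references.append({"url": ref["url"], "duo": ref.get("duo", [])})
            known_urls.add(ref["url"])
    return extended


record = {"name": "AngH91", "accession": "SAMEA112654119"}
extended = extend_external_references(
    record,
    [{"url": "https://ega-archive.org/datasets/EGAD00010002458"}],
)
print(extended["externalReferences"])
```

Working on a deep copy leaves the downloaded record untouched, which makes it easy to compare the record before and after the extension.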

## Examples
### BioSamples JSON
Mock example ([``SAMEA112654119``](https://www.ebi.ac.uk/biosamples/samples/SAMEA112654119)):
- Record (JSON) **before** extending with ``externalReferences``:
````
{
  "name" : "AngH91",
  "accession" : "SAMEA112654119",
  ...
}
````
- Record (JSON) **after** extending with ``externalReferences``:
````
{
  "name" : "AngH91",
  "accession" : "SAMEA112654119",
  ...
  "externalReferences" : [ {
    "url" : "https://ega-archive.org/datasets/EGAD00010002458",
    "duo" : [ ]
  }, {
    "url" : "https://ega-archive.org/metadata/v2/samples/EGAN00004248937",
    "duo" : [ ]
  }, {
    "url" : "https://www.ebi.ac.uk/ena/browser/view/SAMEA112654119",
    "duo" : [ ]
  } ]
  ...
}
````
### Script input
In the following example, we would be adding 3 URLs to ``SAMEA112654119`` and one to ``SAMEA419425`` as ``externalReferences``.
````
{
  "biosampleExternalReferences": [
    {
      "biosampleAccession": "SAMEA112654119",
      "externalReferences": [
        {
          "url": "https://ega-archive.org/datasets/EGAD00010002458"
        },
        {
          "url": "https://ega-archive.org/metadata/v2/samples/EGAN00004248937"
        },
        {
          "url": "https://www.ebi.ac.uk/ena/browser/view/SAMEA112654119"
        }
      ]
    },
    {
      "biosampleAccession": "SAMEA419425",
      "externalReferences": [
        {
          "url": "https://ega-archive.org/datasets/EGAD00010002458"
        }
      ]
    }
  ]
}
````
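Before any record is downloaded, an input of this shape can be checked for obvious mistakes. The sketch below is a hypothetical validator, not the script's own validation: the function name `validate_input`, the accession pattern, and the rule that URLs must start with `http://` or `https://` are assumptions.

```python
import re

# Assumption: accessions look like "SAM", one of E/N/D, an optional letter, then digits.
ACCESSION_PATTERN = re.compile(r"^SAM[END][A-Z]?[0-9]+$")


def validate_input(payload):
    """Return a list of human-readable problems; an empty list means valid."""
    entries = payload.get("biosampleExternalReferences")
    if not isinstance(entries, list):
        return ["'biosampleExternalReferences' must be a list"]
    problems = []
    for index, entry in enumerate(entries):
        accession = entry.get("biosampleAccession", "")
        if not ACCESSION_PATTERN.match(accession):
            problems.append(f"entry {index}: invalid accession {accession!r}")
        for ref in entry.get("externalReferences", []):
            url = str(ref.get("url", ""))
            if not url.startswith(("http://", "https://")):
                problems.append(f"entry {index}: invalid URL {url!r}")
    return problems


payload = {
    "biosampleExternalReferences": [
        {
            "biosampleAccession": "SAMEA419425",
            "externalReferences": [
                {"url": "https://ega-archive.org/datasets/EGAD00010002458"}
            ],
        }
    ]
}
print(validate_input(payload))
```

Collecting all problems in a list, rather than stopping at the first one, lets a submitter fix the whole input in a single pass.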
## Usage
### Command line
````bash
$ python3 biosamples-externalReferences.py --help
usage: biosamples-externalReferences.py [-h] [--production] biosamples_credentials biosamples_externalReferences

This script extends a set of existing Biosamples records with a list of provided external references.

positional arguments:
biosamples_credentials
Either a dictionary or filepath to the BioSamples credentials.
biosamples_externalReferences
Either a dictionary or filepath to the BioSamples' accessions mapping with external references.
options:
-h, --help show this help message and exit
--production Boolean indicating the usage of the production environment of BioSamples. If not present, the development instance will be used.
````
### Interfacing with BiosamplesRecord Class in Java [_By ChatGPT_]
#### Prerequisites
- **Jython**: A Java implementation of the Python interpreter. It allows running Python code within a Java application.
- **Environment Setup**: Ensure Python and all necessary libraries (``requests``, ``json``, etc.) are installed and accessible to Jython.
#### Basic Steps for Integration
1. **Importing Jython in Java**: Add Jython as a dependency in your Java project.
1. **Executing Python Script**: Use Jython's ``PythonInterpreter`` class to execute the Python script.
1. **Creating BiosamplesRecord Instance**: Instantiate the BiosamplesRecord class through the interpreter.
1. **Interacting with BiosamplesRecord Methods**: Utilize methods like ``fetch_bs_json``, ``extend_externalReferences``, etc., via the interpreter.
1. **Integrating with the Main Function**:
- The ``main`` function in the script acts as an entry point for command-line usage.
- In Java, replicate the logic in ``main``.
1. **Data Handling**: Data passed between Java and Python must be in a compatible format (e.g., JSON).
1. **Error Handling**: Properly handle Python exceptions raised by the script in Java.

Sample Java Integration Code:
````java
import org.python.util.PythonInterpreter;
import org.python.core.*;

public class BiosamplesIntegration {
    public static void main(String[] args) {
        PythonInterpreter interpreter = new PythonInterpreter();

        // Load and execute the Python script so its classes become available
        interpreter.execfile("path/to/biosamples-externalReferences.py");

        // Create a BiosamplesRecord instance (placeholder accession)
        PyObject biosamplesRecordClass = interpreter.get("BiosamplesRecord");
        PyObject biosamplesRecord = biosamplesRecordClass.__call__(new PyString("SAMPLE_ACCESSION"));

        // Call a method of BiosamplesRecord (placeholder endpoint)
        PyObject result = biosamplesRecord.invoke("fetch_bs_json", new PyString("biosamples_endpoint"));
        System.out.println(result.toString());

        // Handle other operations similarly
    }
}
````
## Deploy repository services

[To set up and run the MARS tool locally using Docker, follow these steps](../repository-services/README.md)

## Configuration

To configure MARS for submissions, modify the configuration file `settings.ini` located at `~/.mars/settings.ini`. Ensure the following content is set:

```ini
[webin]
development-url = https://wwwdev.ebi.ac.uk/ena/dev/submit/webin/auth
development-token-url = https://wwwdev.ebi.ac.uk/ena/dev/submit/webin/auth/token
production-url = https://www.ebi.ac.uk/ena/submit/webin/auth
production-token-url = https://www.ebi.ac.uk/ena/submit/webin/auth/token

[ena]
development-url = http://localhost:8042/isaena
development-submission-url = http://localhost:8042/isaena/submit
development-data-submission-url = webin2.ebi.ac.uk
production-url = https://www.ebi.ac.uk/ena/submit/webin-v2/
production-submission-url = https://www.ebi.ac.uk/ena/submit/drop-box/submit/?auth=ENA
production-data-submission-url = webin2.ebi.ac.uk

[biosamples]
development-url = http://localhost:8032/isabiosamples
development-submission-url = http://localhost:8032/isabiosamples/submit
production-url = https://www.ebi.ac.uk/biosamples/samples/
production-submission-url = https://www.ebi.ac.uk/biosamples/samples/
```

## Running MARS Submission
## Examples

### Submit ISA-JSON to BioSamples

@@ -469,6 +347,12 @@ python mars_cli.py --development submit --submit-to-metabolights False --submit-

### Submit data files and ISA-JSON to BioSamples and ENA


## Deploy repository services

[To set up and run the MARS tool locally using Docker, follow these steps](../repository-services/README.md)


```bash
python mars_cli.py --credential-service-name biosamples --username-credentials <username> --file-transfer ftp --data-files ../data/ENA_data.R1.fastq.gz --submit-to-metabolights False --output final-isa ../data/biosamples-input-isa.json
```
