Commit

Merge pull request #16 from gustaveroussy/dev
Dev
quentinblampey authored Jan 25, 2024
2 parents 3577f9d + a348bc6 commit 67c877c
Showing 14 changed files with 171 additions and 73 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,9 @@
## [1.0.x] - tbd

### Added
- The `phenocycler` reader can now also read `.tif` files (not just `.qptiff`)
- Added missing legend in the HTML report under the "Channels" section (#15)

## [1.0.2] - 2024-01-15

### Fix
2 changes: 1 addition & 1 deletion README.md
@@ -11,7 +11,7 @@
[![License](https://img.shields.io/pypi/l/sopa.svg)](https://github.com/gustaveroussy/sopa/blob/master/LICENSE)
[![Imports: isort](https://img.shields.io/badge/imports-isort-blueviolet)](https://pycqa.github.io/isort/)

Built on top of [SpatialData](https://github.com/scverse/spatialdata), Sopa enables processing and analyses of image-based spatial-omics using a standard data structure and output. We currently support the following technologies: Xenium, MERSCOPE, CosMX, PhenoCycler, MACSIMA, Hyperion. Sopa was designed for generability and low memory consumption on large images (scales to `1TB+` images).
Built on top of [SpatialData](https://github.com/scverse/spatialdata), Sopa enables processing and analyses of image-based spatial-omics using a standard data structure and output. We currently support the following technologies: Xenium, MERSCOPE, CosMX, PhenoCycler, MACSima, Hyperion. Sopa was designed for generability and low memory consumption on large images (scales to `1TB+` images).

The pipeline outputs contain: (i) Xenium Explorer files for interactive visualization, (ii) an HTML report for quick quality controls, and (iii) a SpatialData `.zarr` directory for further analyses.

4 changes: 0 additions & 4 deletions docs/api/io.md
@@ -24,10 +24,6 @@
options:
show_root_heading: true

::: sopa.io.qptiff
options:
show_root_heading: true

::: sopa.io.hyperion
options:
show_root_heading: true
2 changes: 1 addition & 1 deletion docs/cli.md
@@ -413,7 +413,7 @@ $ sopa read [OPTIONS] DATA_PATH

**Options**:

* `--technology TEXT`: Name of the technology used to collected the data (`xenium`/`merfish`/`cosmx`/`phenocycler`/`macsima`/`qptiff`/`hyperion`)
* `--technology TEXT`: Name of the technology used to collected the data (`xenium`/`merfish`/`cosmx`/`phenocycler`/`macsima`/`hyperion`)
* `--sdata-path TEXT`: Optional path to write the SpatialData object. If not provided, will write to the `{data_path}.zarr` directory
* `--config-path TEXT`: Path to the snakemake config. This can be useful in order not to provide the `--technology` and the `--kwargs` arguments
* `--kwargs TEXT`: Dictionary provided to the reader function as kwargs [default: {}]
11 changes: 6 additions & 5 deletions docs/faq.md
@@ -6,30 +6,31 @@ You need the raw inputs of your machine, that is:

- Optionally, a file of transcript location, usually a `.csv` or `.parquet` file

Our tutorials use `data_path` to denote the path to your raw data. Select the correct tab below to understand what is the right path to your raw data:
In this documentation, `data_path` denotes the path to your raw data. Select the correct tab below to understand what is the right path to your raw data:

=== "Xenium"
`data_path` is the directory containing the following files: `morphology.ome.tif` and `transcripts.parquet`
=== "MERSCOPE"
`data_path` is the "region" directory containing a `detected_transcripts.csv` file and an `image` directory
`data_path` is the "region" directory containing a `detected_transcripts.csv` file and an `image` directory. For instance, the directory can be called `region_0`.
=== "CosMX"
(More details coming soon)
(The CosMX data requires stitching the FOVs. It will be added soon, see [this issue](https://github.com/gustaveroussy/sopa/issues/5))
=== "MACSima"
`data_path` is the directory containing multiple `.ome.tif` files (one file per channel)
=== "PhenoCycler"
`data_path` corresponds to the path to one `.qptiff` file
`data_path` corresponds to the path to one `.qptiff` file, or one `.tif` file (if exported from QuPath)
=== "Hyperion"
`data_path` is the directory containing multiple `.ome.tiff` files (one file per channel)

## Cellpose is not segmenting enough cells; what should I do?

- The main Cellpose parameter to check is `diameter`, i.e. a typical cell diameter **in pixels**. Note that this is highly specific to the technology you're using since the micron-to-pixel ratio can differ. We advise you to start with the default parameter for your technology of interest (see the `diameter` parameter inside our config files [here](https://github.com/gustaveroussy/sopa/tree/master/workflow/config)).
- Maybe `min_area` is too high, and all the cells are filtered because they are smaller than this area. Remind that, when using Cellpose, the areas correspond to pixels^2.
- This can be due to a low image quality. If the image is too pixelated, consider increasing `gaussian_sigma` (e.g., `2`) under the cellpose parameters of our config. If the image has a low contrast, consider increasing `clip_limit` (e.g., `0.3`). These parameters are detailed in [this example config](https://github.com/gustaveroussy/sopa/blob/master/workflow/config/example_commented.yaml).
- Consider updating the official Cellpose parameters. In particular, try `cellprob_threshold=-6` and `flow_threshold=2`.
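
The pixel-based units mentioned in the list above can be checked with a quick conversion. The sketch below is only illustrative: the helper names and the `0.5` microns-per-pixel value are assumptions chosen for the example, not values taken from Sopa or from any specific technology.

```python
import math


def microns_to_pixels(length_um: float, pixel_size_um: float) -> float:
    """Convert a length in microns to pixels, given the size of one pixel in microns."""
    return length_um / pixel_size_um


def circle_area_pixels(diameter_px: float) -> float:
    """Area (in pixels^2) of a circular cell with the given diameter in pixels."""
    return math.pi * (diameter_px / 2) ** 2


# Hypothetical numbers: a 10-micron cell imaged at 0.5 microns per pixel
diameter_px = microns_to_pixels(10, 0.5)
area_px2 = circle_area_pixels(diameter_px)
print(f"diameter: {diameter_px} pixels, area: {area_px2:.1f} pixels^2")
```

Comparing such an estimated area with `min_area` gives a rough idea of whether typical cells would be filtered out.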

## Can I use Nextflow instead of Snakemake?

Nextflow is not supported yet, but we are working on it. You can also help re-write our Snakemake pipeline for Nextflow.
Nextflow is not supported yet, but we are working on it. You can also help re-write our Snakemake pipeline for Nextflow (see issue [#7](https://github.com/gustaveroussy/sopa/issues/7)).

## I have another issue; how do I fix it?

7 changes: 5 additions & 2 deletions docs/tutorials/api_usage.ipynb
@@ -19,7 +19,7 @@
"\n",
"For this tutorial, we use a generated dataset. The command below will generate and save it on disk (you can change the path `tuto.zarr` to save it somewhere else).\n",
"\n",
"See [here](`../../api/io`) for details on how to use your own technology."
"See the commented lines below to load your own data, or see the [`sopa.io` API](../../api/io)."
]
},
{
@@ -55,9 +55,12 @@
}
],
"source": [
"# The line below creates a toy dataset for this tutorial\n",
"# Instead, use sopa.io to read your own data as a SpatialData object: see https://gustaveroussy.github.io/sopa/api/io/\n",
"# For instance, if you have MERSCOPE data, you can do `sdata = sopa.io.merscope(\"/path/to/region_0\")`\n",
"sdata = uniform()\n",
"sdata.write(\"tuto.zarr\", overwrite=True)\n",
"\n",
"sdata.write(\"tuto.zarr\", overwrite=True)\n",
"sdata"
]
},
47 changes: 41 additions & 6 deletions docs/tutorials/cli_usage.md
@@ -2,15 +2,50 @@ Here, we provide a minimal example of command line usage. For more details and t

## Save the `SpatialData` object

For this tutorial, we use a generated dataset. The command below will generate and save it on disk (you can change the path `tuto.zarr` to save it somewhere else). See [here](`../../cli/#sopa-read`) for details on how to use your own technology.
For this tutorial, we use a generated dataset. The command below will generate and save it on disk (you can change the path `tuto.zarr` to save it somewhere else). If you want to load your own data: choose the right panel below, or see the [`sopa read` CLI documentation](`../../cli/#sopa-read`).

=== "Tutorial"
```sh
# it will generate a 'tuto.zarr' directory
sopa read . --sdata-path tuto.zarr --technology uniform
```
=== "Xenium"
```sh
# it will generate a '/path/to/sample/directory.zarr' directory
sopa read /path/to/sample/directory --technology xenium
```
=== "MERSCOPE"
```sh
# it will generate a '/path/to/sample/directory.zarr' directory
sopa read /path/to/sample/directory --technology merscope
```
=== "CosMX"
```sh
# it will generate a '/path/to/sample/directory.zarr' directory
sopa read /path/to/sample/directory --technology cosmx
```

!!! warning
The CosMX data requires stitching the FOVs. It will be added soon, see [this issue](https://github.com/gustaveroussy/sopa/issues/5).
=== "PhenoCycler"
```sh
# it will generate a '/path/to/sample.zarr' directory
sopa read /path/to/sample.qptiff --technology phenocycler
```
=== "MACSima"
```sh
# it will generate a '/path/to/sample/directory.zarr' directory
sopa read /path/to/sample/directory --technology macsima
```
=== "Hyperion"
```sh
# it will generate a '/path/to/sample/directory.zarr' directory
sopa read /path/to/sample/directory --technology hyperion
```

```sh
# this generates a 'tuto.zarr' directory
sopa read . --sdata-path tuto.zarr --technology uniform
```

!!! info
This generates a `.zarr` directory corresponding to a [`SpatialData` object](https://github.com/scverse/spatialdata).
It has created a `.zarr` directory which stores a [`SpatialData` object](https://github.com/scverse/spatialdata) corresponding to your data sample. You can choose the location of the `.zarr` directory using the `--sdata-path` command line argument.
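
As an illustration of the default output location (`{data_path}.zarr` when `--sdata-path` is not given), here is a minimal sketch. The helper `default_sdata_path` is hypothetical, and the suffix-swapping behaviour is an assumption inferred from the panel comments (for example `sample.qptiff` giving `sample.zarr`); the actual CLI may resolve the path differently.

```python
from pathlib import PurePosixPath
from typing import Optional


def default_sdata_path(data_path: str, sdata_path: Optional[str] = None) -> str:
    """Hypothetical helper: return the explicit path if given, else swap the suffix for '.zarr'."""
    if sdata_path is not None:
        return sdata_path
    # Assumption: the last suffix (if any) is replaced by '.zarr'
    return str(PurePosixPath(data_path).with_suffix(".zarr"))


print(default_sdata_path("/path/to/sample.qptiff"))
print(default_sdata_path("/path/to/sample/directory"))
print(default_sdata_path("/path/to/sample/directory", "tuto.zarr"))
```

Under this assumption, a directory whose name contains a dot (e.g. `region.0`) would lose the part after the dot, so passing `--sdata-path` explicitly is the safer choice in that case.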

## (Optional) ROI selection

4 changes: 4 additions & 0 deletions sopa/annotation/tangram/run.py
@@ -158,6 +158,10 @@ def pp_adata(self, ad_sp_: AnnData, ad_sc_: AnnData, split: np.ndarray) -> AnnDa
set(ad_sp_split.var_names[ad_sp_split.var.counts > 0])
& set(ad_sc_.var_names[ad_sc_.var.counts > 0])
)

assert len(
selection
), f"No gene in common between the reference and the spatial adata object. Have you run transcript aggregation?"
log.info(f"Keeping {len(selection)} shared genes")

for ad_ in [ad_sp_split, ad_sc_]:
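
The check added in this hunk can be reproduced outside of `AnnData` with plain dictionaries. This is a simplified sketch, not the code from `run.py`: `shared_genes` and its inputs (gene name mapped to total count) are hypothetical stand-ins for the `var_names` / `var.counts` filtering shown above.

```python
from typing import Dict, List


def shared_genes(sp_counts: Dict[str, int], sc_counts: Dict[str, int]) -> List[str]:
    """Return genes with a non-zero count in both the spatial and the reference data."""
    selection = sorted(
        {gene for gene, count in sp_counts.items() if count > 0}
        & {gene for gene, count in sc_counts.items() if count > 0}
    )
    # Same guard as in the diff: fail early with an actionable message
    assert len(selection), (
        "No gene in common between the reference and the spatial adata object. "
        "Have you run transcript aggregation?"
    )
    return selection


# Toy data: 'CD3E' has zero counts in the spatial data, so it is dropped
print(shared_genes({"CD3E": 0, "EPCAM": 12, "PTPRC": 4}, {"CD3E": 7, "EPCAM": 3}))
```

Failing at this point avoids a less explicit error later in the preprocessing when the intersection is empty.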
4 changes: 2 additions & 2 deletions sopa/cli/app.py
@@ -48,7 +48,7 @@ def read(
),
technology: str = typer.Option(
None,
help="Name of the technology used to collected the data (`xenium`/`merfish`/`cosmx`/`phenocycler`/`macsima`/`qptiff`/`hyperion`)",
help="Name of the technology used to collected the data (`xenium`/`merfish`/`cosmx`/`phenocycler`/`macsima`/`hyperion`)",
),
sdata_path: str = typer.Option(
None,
@@ -91,7 +91,7 @@ def read(

assert hasattr(
io, technology
), f"Technology {technology} unknown. Currently available: xenium, merscope, cosmx, phenocycler, hyperion, macsima, qptiff"
), f"Technology {technology} unknown. Currently available: xenium, merscope, cosmx, phenocycler, hyperion, macsima"

sdata = getattr(io, technology)(data_path, **kwargs)
io.write_standardized(sdata, sdata_path, delete_table=True)
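
The `hasattr` / `getattr` dispatch in this hunk explains why dropping `qptiff` from `sopa/io/__init__.py` is enough to make `--technology qptiff` fail with the assertion message. The sketch below mimics that pattern with a stand-in namespace; the `io` object and the `_make_reader` helper are hypothetical and do not call the real `sopa.io` readers.

```python
from types import SimpleNamespace


def _make_reader(name):
    # Stand-in reader: returns a small dict instead of a SpatialData object
    def reader(data_path, **kwargs):
        return {"reader": name, "data_path": data_path, "kwargs": kwargs}

    return reader


AVAILABLE = ["xenium", "merscope", "cosmx", "phenocycler", "hyperion", "macsima"]

# Hypothetical replacement for the `sopa.io` module: one attribute per reader
io = SimpleNamespace(**{name: _make_reader(name) for name in AVAILABLE})


def read(technology, data_path, **kwargs):
    # Any name that is not an attribute of `io` is rejected up front
    assert hasattr(
        io, technology
    ), f"Technology {technology} unknown. Currently available: {', '.join(AVAILABLE)}"
    return getattr(io, technology)(data_path, **kwargs)


print(read("phenocycler", "/path/to/sample.tif"))
```

With this pattern, the set of accepted technologies follows whatever the namespace exposes, so the help text and the error message have to be kept in sync by hand, as this commit does.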
2 changes: 1 addition & 1 deletion sopa/cli/explorer.py
Original file line number Diff line number Diff line change
Expand Up @@ -116,7 +116,7 @@ def add_aligned(
from sopa.io.explorer.images import align

sdata = spatialdata.read_zarr(sdata_path)
image = io.imaging.ome_tif(image_path)
image = io.imaging.ome_tif(image_path, as_image=True)

align(
sdata, image, transformation_matrix_path, overwrite=overwrite, image_key=original_image_key
2 changes: 1 addition & 1 deletion sopa/io/__init__.py
@@ -1,4 +1,4 @@
from .imaging import qptiff, macsima, phenocycler, hyperion, ome_tif
from .imaging import macsima, phenocycler, hyperion, ome_tif
from .explorer import write
from .standardize import write_standardized
from .transcriptomics import merscope, xenium, cosmx