
Commit

change cbiohub doc formatting
inodb authored Jan 1, 2025
1 parent 6bb5ae3 commit 5aa92f3
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -2,14 +2,14 @@

**WARNING ⚠️: This package is still under construction.**

`cbiohub` is a Python package and CLI tool designed to simplify the analysis of data from cBioPortal, including those hosted on the [cBioPortal Datahub](github.com/cbioPortal/datahub). Unlike existing API clients, which focus on slices of data via the REST API, `cbiohub` supports bulk analysis of harmonized datasets. By using combined parquet files instead of per-study CSV/TSV files, it enables faster data loading and querying.
cbiohub is a Python package and CLI tool designed to simplify the analysis of data from cBioPortal, including those hosted on the [cBioPortal Datahub](github.com/cbioPortal/datahub). Unlike existing API clients, which focus on slices of data via the REST API, cbiohub supports bulk analysis of harmonized datasets. By using combined parquet files instead of per-study CSV/TSV files, it enables faster data loading and querying.

`cbiohub` features:
cbiohub features:

- A **data module** for ingesting and converting cBioPortal Datahub files into parquet format
- An **analysis module** leveraging DuckDB for efficient local data exploration

With parquet’s widespread compatibility, `cbiohub` allows seamless integration with other programming languages and data warehousing tools.
With parquet’s widespread compatibility, cbiohub allows seamless integration with other programming languages and data warehousing tools.

<img width="714" alt="image" src="https://github.com/user-attachments/assets/9a1c9a79-7336-49ce-89b1-43c5b614f0ea" />


