'dask' and '!pyscenic ctx' #120

Closed
grimwoo opened this issue Dec 19, 2019 · 3 comments

grimwoo commented Dec 19, 2019

When I run

!pyscenic ctx adj.csv \
    {f_db_names} \
    --annotations_fname {f_motif_path} \
    --expression_mtx_fname {f_loom_path_unfilt} \
    --output Step6_reg.csv \
    --mask_dropouts \
    --num_workers 10

it always fails with an error.

According to a previous issue, this may be caused by the "dask" module. However, even after trying about half of the historical "dask" versions, I still get the error.

The error with the latest "dask" version is as follows:

[                                        ] | 0% Completed |  1min 24.9s
Traceback (most recent call last):
  File "/public-supool/home/wuhaoda/anaconda2/envs/Grim3.6.8/bin/pyscenic", line 11, in <module>
    sys.exit(main())
  File "/public-supool/home/wuhaoda/anaconda2/envs/Grim3.6.8/lib/python3.6/site-packages/pyscenic/cli/pyscenic.py", line 408, in main
    args.func(args)
  File "/public-supool/home/wuhaoda/anaconda2/envs/Grim3.6.8/lib/python3.6/site-packages/pyscenic/cli/pyscenic.py", line 159, in prune_targets_command
    num_workers=args.num_workers)
  File "/public-supool/home/wuhaoda/anaconda2/envs/Grim3.6.8/lib/python3.6/site-packages/pyscenic/prune.py", line 351, in prune2df
    num_workers, module_chunksize)
  File "/public-supool/home/wuhaoda/anaconda2/envs/Grim3.6.8/lib/python3.6/site-packages/pyscenic/prune.py", line 300, in _distributed_calc
    return create_graph().compute(scheduler='processes', num_workers=num_workers if num_workers else cpu_count())
  File "/public-supool/home/wuhaoda/anaconda2/envs/Grim3.6.8/lib/python3.6/site-packages/dask/base.py", line 165, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/public-supool/home/wuhaoda/anaconda2/envs/Grim3.6.8/lib/python3.6/site-packages/dask/base.py", line 436, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/public-supool/home/wuhaoda/anaconda2/envs/Grim3.6.8/lib/python3.6/site-packages/dask/multiprocessing.py", line 215, in get
    **kwargs
  File "/public-supool/home/wuhaoda/anaconda2/envs/Grim3.6.8/lib/python3.6/site-packages/dask/local.py", line 486, in get_async
    raise_exception(exc, tb)
  File "/public-supool/home/wuhaoda/anaconda2/envs/Grim3.6.8/lib/python3.6/site-packages/dask/local.py", line 315, in reraise
    raise exc.with_traceback(tb)
  File "/public-supool/home/wuhaoda/anaconda2/envs/Grim3.6.8/lib/python3.6/site-packages/dask/local.py", line 222, in execute_task
    result = _execute_task(task, data)
  File "/public-supool/home/wuhaoda/anaconda2/envs/Grim3.6.8/lib/python3.6/site-packages/dask/core.py", line 119, in _execute_task
    return func(*args2)
  File "/public-supool/home/wuhaoda/anaconda2/envs/Grim3.6.8/lib/python3.6/site-packages/dask/dataframe/utils.py", line 653, in check_meta
    check_matching_columns(meta, x)
  File "/public-supool/home/wuhaoda/anaconda2/envs/Grim3.6.8/lib/python3.6/site-packages/dask/dataframe/utils.py", line 678, in check_matching_columns
    "  Missing: %s" % (extra, missing)
ValueError: The columns in the computed data do not match the columns in the provided metadata
  Extra:   []
  Missing: []

When I install a lower version of "dask", the error looks like this (only sometimes; it can also show "from dask ...."):

  File "/public-supool/home/wuhaoda/anaconda2/envs/Grim3.6.8/lib/python3.6/site-packages/distributed/config.py", line 11, in <module>
    config = dask.config.config
AttributeError: module 'dask' has no attribute 'config'
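
The AttributeError above is raised inside distributed while it reads dask.config, which suggests the installed dask and distributed releases may not match. One way to check is to record the versions the environment actually imports. The following is a minimal sketch, assuming each package exposes a __version__ attribute; the function name report_versions is hypothetical and not part of pySCENIC or dask.

```python
import importlib


def report_versions(names=("dask", "distributed", "pandas", "pyscenic")):
    """Return {package: version string}, or None where the import fails."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Covers a missing package as well as errors raised during
            # import, such as the AttributeError shown in this issue.
            versions[name] = None
            continue
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version}")
```

Run it with the same interpreter that runs pyscenic (here, the one inside the conda environment), so the reported versions are the ones the command line tool sees.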
@lucygarner

Hi,

I am also getting this error.

Traceback (most recent call last):
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/bin/pyscenic", line 8, in <module>
    sys.exit(main())
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.8/site-packages/pyscenic/cli/pyscenic.py", line 420, in main
    args.func(args)
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.8/site-packages/pyscenic/cli/pyscenic.py", line 159, in prune_targets_command
    df_motifs = calc_func(dbs, modules, motif_annotations_fname,
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.8/site-packages/pyscenic/prune.py", line 349, in prune2df
    return _distributed_calc(rnkdbs, modules, motif_annotations_fname, transformation_func, aggregation_func,
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.8/site-packages/pyscenic/prune.py", line 300, in _distributed_calc
    return create_graph().compute(scheduler='processes', num_workers=num_workers if num_workers else cpu_count())
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.8/site-packages/dask/base.py", line 166, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.8/site-packages/dask/base.py", line 444, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.8/site-packages/dask/multiprocessing.py", line 208, in get
    result = get_async(
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.8/site-packages/dask/local.py", line 486, in get_async
    raise_exception(exc, tb)
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.8/site-packages/dask/local.py", line 316, in reraise
    raise exc
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.8/site-packages/dask/local.py", line 222, in execute_task
    result = _execute_task(task, data)
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.8/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.8/site-packages/dask/dataframe/utils.py", line 655, in check_meta
    check_matching_columns(meta, x)
  File "/data/user/lucy/py36-v1/conda-install/envs/pyscenic/lib/python3.8/site-packages/dask/dataframe/utils.py", line 680, in check_matching_columns
    raise ValueError(
ValueError: The columns in the computed data do not match the columns in the provided metadata
Order of columns does not match	

Was there a recommended solution?

Best,
Lucy

cflerin (Contributor) commented May 18, 2020

Hi @grimwoo, @lc822, sorry for not replying to this earlier. This is a common issue with Dask versions. You can find suggestions for fixing it in #163.

cflerin closed this as completed May 18, 2020

@BioAIEvolu

Same problem here +1 👍

The package versions are:

scanpy==1.4.4.post1 anndata==0.6.22.post1 umap==0.4.3 numpy==1.17.4 scipy==1.4.1 pandas==0.25.3 scikit-learn==0.23.1 statsmodels==0.11.1 pyscenic==0.10.0 dask=='2.17.2' distributed==2.11.0 pandas=='0.25.3' 

I tried pip install dask==1.0.0 distributed'>=1.21.6,<2.0.0' as suggested in #163, but it doesn't work for me.
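
For comparison, the pins from #163 can be checked against the version strings listed above. This is a rough sketch in plain Python, assuming simple dotted numeric versions; the helpers parse_version and in_range are hypothetical, and suffixes such as "post1" are ignored. It is not known from this thread whether the listed versions were recorded before or after the pip command was run.

```python
def parse_version(text):
    """Turn '2.17.2' into (2, 17, 2); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def in_range(version, lower, upper):
    """True if lower <= version < upper, compared as numeric tuples."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# distributed 2.11.0 against the pin '>=1.21.6,<2.0.0'
print(in_range("2.11.0", "1.21.6", "2.0.0"))  # False

# dask 2.17.2 against the pin '==1.0.0'
print(parse_version("2.17.2") == parse_version("1.0.0"))  # False
```

If the environment still reports the versions from the list after installing, the pins may not have taken effect in the interpreter that pyscenic uses.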

Then I tried both the latest version and an older version of pandas, and I get the same error:

2020-06-04 11:25:13,096 - pyscenic.cli.pyscenic - INFO - Loading databases.

2020-06-04 11:25:13,096 - pyscenic.cli.pyscenic - INFO - Calculating regulons.
[                                        ] | 0% Completed | 13.7s
(omit)
 File "/home/miniconda3/envs/ScCancer/lib/python3.8/site-packages/dask/dataframe/utils.py", line 680, in check_matching_columns
    raise ValueError(
ValueError: The columns in the computed data do not match the columns in the provided metadata
Order of columns does not match

2020-06-04 11:26:18,390 - pyscenic.transform - WARNING - Less than 80% of the genes in Regulon for ZNF486 could be mapped to hg19-tss-centered-10kb-10species.mc9nr. Skipping this module.

2020-06-04 11:26:20,011 - pyscenic.transform - WARNING - Less than 80% of the genes in Regulon for ZNF492 could be mapped to hg19-tss-centered-10kb-10species.mc9nr. Skipping this module.

Is my Python version too new?
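
Both forms of the ValueError in this thread (empty "Extra" and "Missing" lists, and "Order of columns does not match") are consistent with the computed data having the expected columns in a different order. The sketch below illustrates that kind of check in plain Python. It is not dask's implementation; the function names and the column names "a", "b", "c" are hypothetical.

```python
def describe_column_mismatch(expected, actual):
    """Return None if the column lists match, else a short description."""
    if list(expected) == list(actual):
        return None
    extra = [c for c in actual if c not in expected]
    missing = [c for c in expected if c not in actual]
    if extra or missing:
        return f"Extra: {extra}, Missing: {missing}"
    # Same names on both sides, so only the ordering differs.
    return "Order of columns does not match"


def reorder_to_expected(row, expected):
    """Rebuild a row (a dict) with its keys in the expected column order."""
    return {name: row[name] for name in expected}


expected = ["a", "b", "c"]
print(describe_column_mismatch(expected, ["c", "a", "b"]))
fixed = reorder_to_expected({"c": 3, "a": 1, "b": 2}, expected)
print(describe_column_mismatch(expected, list(fixed)))
```

The first call reports an order mismatch with nothing extra or missing; after reordering, the second call returns None.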
