KeyError(f"None of [{key}] are in the [{axis_name}]") using aemb #174

Open
michoug opened this issue Aug 28, 2024 · 3 comments

michoug commented Aug 28, 2024

Hi

Using version 2.1.0, I followed the strobealign-aemb protocol as follows:

SemiBin2 split_contigs -i test_orig.fasta -o test -m 1000

strobealign --aemb test/split_contigs.fna.gz R1.fq.gz R2.fq.gz -R 6 > test/cov1.txt

(I repeated this step for each of the 6 read pairs.)

SemiBin2 single_easy_bin -i test/split_contigs.fna.gz -a test/*cov.txt -o test_results
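
One detail worth checking in the commands above: the strobealign output is written to test/cov1.txt, while the -a argument uses the pattern test/*cov.txt. The sketch below uses Python's fnmatch to test whether that pattern would pick the file up, assuming the file names are exactly as typed in this report (they may simply be a transcription slip). The alternative pattern test/cov*.txt is only an illustration and is not taken from the SemiBin documentation.

```python
from fnmatch import fnmatchcase

# File name as written by the strobealign command in this report.
coverage_file = "test/cov1.txt"

# Pattern passed to -a in the report, and a hypothetical alternative.
reported_pattern = "test/*cov.txt"
alternative_pattern = "test/cov*.txt"


def matches(path: str, pattern: str) -> bool:
    """Case-sensitive shell-style match (note: '*' here can also span '/')."""
    return fnmatchcase(path, pattern)


for pattern in (reported_pattern, alternative_pattern):
    print(f"{pattern!r} matches {coverage_file!r}: {matches(coverage_file, pattern)}")
```

By default, bash passes an unmatched pattern through literally, so it may be worth confirming which files SemiBin2 actually received.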

The single_easy_bin step then failed with this error:

2024-08-28 09:35:25 jst369 SemiBin[271400] INFO Setting number of CPUs to 72
2024-08-28 09:35:25 jst369 SemiBin[271400] INFO Binning for short_read
2024-08-28 09:35:25 jst369 SemiBin[271400] INFO SemiBin will run in self supervised mode
2024-08-28 09:35:27 jst369 SemiBin[271400] INFO Did not detect GPU, using CPU.
2024-08-28 09:35:27 jst369 SemiBin[271400] INFO Generating training data...
2024-08-28 09:35:33 jst369 SemiBin[271400] INFO Reading abundance information from abundance files.
Traceback (most recent call last):
  File "/scratch/gmichoud/MAGsGeneration/snakemake_envs/4c2785baf7be87f610509e9210ba4903_/bin/SemiBin2", line 10, in <module>
    sys.exit(main2())
             ^^^^^^^
  File "/scratch/gmichoud/MAGsGeneration/snakemake_envs/4c2785baf7be87f610509e9210ba4903_/lib/python3.12/site-packages/SemiBin/main.py", line 1600, in main2
    single_easy_binning(
  File "/scratch/gmichoud/MAGsGeneration/snakemake_envs/4c2785baf7be87f610509e9210ba4903_/lib/python3.12/site-packages/SemiBin/main.py", line 1242, in single_easy_binning
    generate_sequence_features_single(
  File "/scratch/gmichoud/MAGsGeneration/snakemake_envs/4c2785baf7be87f610509e9210ba4903_/lib/python3.12/site-packages/SemiBin/main.py", line 868, in generate_sequence_features_single
    abun, abun_split = generate_cov_from_abundances(abundances, output, contig_fasta, binned_length)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/gmichoud/MAGsGeneration/snakemake_envs/4c2785baf7be87f610509e9210ba4903_/lib/python3.12/site-packages/SemiBin/generate_coverage.py", line 197, in generate_cov_from_abundances
    abun_split_data = abun_split_data.loc[binned_contig]
                      ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
  File "/scratch/gmichoud/MAGsGeneration/snakemake_envs/4c2785baf7be87f610509e9210ba4903_/lib/python3.12/site-packages/pandas/core/indexing.py", line 1191, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/gmichoud/MAGsGeneration/snakemake_envs/4c2785baf7be87f610509e9210ba4903_/lib/python3.12/site-packages/pandas/core/indexing.py", line 1420, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/gmichoud/MAGsGeneration/snakemake_envs/4c2785baf7be87f610509e9210ba4903_/lib/python3.12/site-packages/pandas/core/indexing.py", line 1360, in _getitem_iterable
    keyarr, indexer = self._get_listlike_indexer(key, axis)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/gmichoud/MAGsGeneration/snakemake_envs/4c2785baf7be87f610509e9210ba4903_/lib/python3.12/site-packages/pandas/core/indexing.py", line 1558, in _get_listlike_indexer
    keyarr, indexer = ax._get_indexer_strict(key, axis_name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/gmichoud/MAGsGeneration/snakemake_envs/4c2785baf7be87f610509e9210ba4903_/lib/python3.12/site-packages/pandas/core/indexes/base.py", line 6200, in _get_indexer_strict
    self._raise_if_missing(keyarr, indexer, axis_name)
  File "/scratch/gmichoud/MAGsGeneration/snakemake_envs/4c2785baf7be87f610509e9210ba4903_/lib/python3.12/site-packages/pandas/core/indexes/base.py", line 6249, in _raise_if_missing
    raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Index(['iceDNA_Arolla_S42_k141_120429_1_1',\n       'iceDNA_Arolla_S42_k141_120429_1_2',\n       'iceDNA_Arolla_S42_k141_120429_2_1',\n       'iceDNA_Arolla_S42_k141_120429_2_2', 'iceDNA_Arolla_S42_k141_43045_1_1',\n       'iceDNA_Arolla_S42_k141_43045_1_2', 'iceDNA_Arolla_S42_k141_43045_2_1',\n       'iceDNA_Arolla_S42_k141_43045_2_2', 'iceDNA_Arolla_S42_k141_21518_1_1',\n       'iceDNA_Arolla_S42_k141_21518_1_2',\n       ...\n       'iceDNA_Arolla_S42_k141_49409_2_1', 'iceDNA_Arolla_S42_k141_49409_2_2',\n       'iceDNA_Arolla_S42_k141_27912_1_1', 'iceDNA_Arolla_S42_k141_27912_1_2',\n       'iceDNA_Arolla_S42_k141_27912_2_1', 'iceDNA_Arolla_S42_k141_27912_2_2',\n       'iceDNA_Arolla_S42_k141_23625_1_1', 'iceDNA_Arolla_S42_k141_23625_1_2',\n       'iceDNA_Arolla_S42_k141_23625_2_1', 'iceDNA_Arolla_S42_k141_23625_2_2'],\n      dtype='object', length=4260)] are in the [index]"

The contig names in both the split FASTA file and the coverage file look like iceDNA_Arolla_S42_k141_120429_1, not iceDNA_Arolla_S42_k141_120429_1_1 as in the error message.
I get the same error if I skip the split step.
Any ideas?
Best
Greg
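
A rough way to narrow this down is to compare the labels from the KeyError with the names actually present in the coverage file. The sketch below rests on two assumptions. First, that the coverage file is tab-separated with the contig name in the first column; this is an assumption about the strobealign --aemb output and was not checked against this data. Second, that the extra _1/_2 suffix in the error message was appended to the names of the FASTA given to -i; this is a guess from the message, not from the SemiBin source. The helper names are hypothetical and the abundance values are made up.

```python
import re


def coverage_names(lines):
    """Collect first-column names from tab-separated coverage lines."""
    return {line.split("\t", 1)[0].strip() for line in lines if line.strip()}


def strip_half_suffix(label):
    """Drop one trailing '_1' or '_2' from a label, if present."""
    return re.sub(r"_[12]$", "", label)


def diagnose(requested, present):
    """Sort requested labels into found, found-after-stripping, and unknown."""
    report = {"found": [], "found_without_suffix": [], "unknown": []}
    for label in requested:
        if label in present:
            report["found"].append(label)
        elif strip_half_suffix(label) in present:
            report["found_without_suffix"].append(label)
        else:
            report["unknown"].append(label)
    return report


# Toy data shaped like the names in this report; abundance values are made up.
cov_lines = [
    "iceDNA_Arolla_S42_k141_120429_1\t3.2\n",
    "iceDNA_Arolla_S42_k141_120429_2\t2.9\n",
]
requested = [
    "iceDNA_Arolla_S42_k141_120429_1_1",
    "iceDNA_Arolla_S42_k141_120429_1_2",
    "iceDNA_Arolla_S42_k141_120429_2_1",
    "iceDNA_Arolla_S42_k141_120429_2_2",
]

report = diagnose(requested, coverage_names(cov_lines))
for category, labels in report.items():
    print(category, len(labels))
```

If every label lands in found_without_suffix, the lookup is asking for names one split level deeper than the coverage file contains.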
