ValueError in DataJoint-Python 0.14.3 when using numpy 2.2.* #1201

Open
MilagrosMarin opened this issue Jan 22, 2025 · 0 comments

Bug Report

Description

Inserting a file into a File part table that uses external storage (a filepath attribute) raises a ValueError when DataJoint-Python (v0.14.3) runs with numpy (v2.2.*).
numpy 2.2.* raises a ValueError when an empty array is truth-tested, whereas earlier versions only issued a DeprecationWarning. Any place in DataJoint-Python that truth-tests an array that may be empty is affected.
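
The difference in behavior can be checked in isolation. The snippet below is a minimal sketch and not code from DataJoint-Python: `truth_test` is a hypothetical helper introduced only for illustration. It reports what `bool()` does to an empty array on whichever numpy version is installed, and shows that an explicit size check gives the same answer on every version.

```python
import warnings

import numpy as np


def truth_test(array):
    """Hypothetical helper: return bool(array), or the error text if numpy refuses."""
    with warnings.catch_warnings():
        # Older numpy versions only warn here; silence that to keep the output short.
        warnings.simplefilter("ignore", DeprecationWarning)
        try:
            return bool(array)
        except ValueError as err:
            return f"ValueError: {err}"


empty = np.array([])

# Depends on the installed numpy: False (older versions) or a ValueError message (2.2.*).
print(np.__version__, truth_test(empty))

# Explicit emptiness checks do not depend on the numpy version.
print(empty.size > 0)
print(len(empty) > 0)
```

Both explicit checks evaluate to `False` for the empty array, so code written this way does not need to know which numpy version it runs on.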

Reproducibility

The issue is reproducible during file insertion using the following definition:

class File(dj.Part):
    definition = """
    -> master
    ---
    file_name: varchar(255)
    file: filepath@seq-raw
    """

  • OS: macOS Sequoia 15.2
  • Python Version: 3.11.10
  • DataJoint Version: 0.14.3
  • Minimum number of steps to reliably reproduce the issue:
    • Define a dj.Part table as shown above.
    • Attempt to insert a file using the insert1 method with numpy 2.2.* installed.
  • Complete error stack as a result of evaluating the above steps:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[38], line 8
      6 if gtf_file.is_file():
      7     file_info = dict(file_name="2021-04-23.mm10.ncbiRefSeq.gtf", file=gtf_file)
----> 8     rna_seq.GTFAnnotation.File.insert1(
      9         {**stored_ref_gen, **file_info}, skip_duplicates=True
     10     )

File ~/miniconda/envs/ucsf_cadwell-lab/lib/python3.11/site-packages/datajoint/table.py:347, in Table.insert1(self, row, **kwargs)
    340 def insert1(self, row, **kwargs):
    341     """
    342     Insert one data record into the table. For ``kwargs``, see ``insert()``.
    343 
    344     :param row: a numpy record, a dict-like object, or an ordered sequence to be inserted
    345         as one row.
    346     """
--> 347     self.insert((row,), **kwargs)

File ~/miniconda/envs/ucsf_cadwell-lab/lib/python3.11/site-packages/datajoint/table.py:430, in Table.insert(self, rows, replace, skip_duplicates, ignore_extra_fields, allow_direct_insert)
    428 # collects the field list from first row (passed by reference)
    429 field_list = []
--> 430 rows = list(
    431     self.__make_row_to_insert(row, field_list, ignore_extra_fields)
    432     for row in rows
    433 )
    434 if rows:
    435     try:

File ~/miniconda/envs/ucsf_cadwell-lab/lib/python3.11/site-packages/datajoint/table.py:431, in <genexpr>(.0)
    428 # collects the field list from first row (passed by reference)
    429 field_list = []
    430 rows = list(
--> 431     self.__make_row_to_insert(row, field_list, ignore_extra_fields)
    432     for row in rows
    433 )
    434 if rows:
    435     try:

File ~/miniconda/envs/ucsf_cadwell-lab/lib/python3.11/site-packages/datajoint/table.py:913, in Table.__make_row_to_insert(self, row, field_list, ignore_extra_fields)
    911 elif isinstance(row, collections.abc.Mapping):  # dict-based
    912     check_fields(row)
--> 913     attributes = [
    914         self.__make_placeholder(name, row[name], ignore_extra_fields)
    915         for name in self.heading
    916         if name in row
    917     ]
    918 else:  # positional
    919     try:

File ~/miniconda/envs/ucsf_cadwell-lab/lib/python3.11/site-packages/datajoint/table.py:914, in <listcomp>(.0)
    911 elif isinstance(row, collections.abc.Mapping):  # dict-based
    912     check_fields(row)
    913     attributes = [
--> 914         self.__make_placeholder(name, row[name], ignore_extra_fields)
    915         for name in self.heading
    916         if name in row
    917     ]
    918 else:  # positional
    919     try:

File ~/miniconda/envs/ucsf_cadwell-lab/lib/python3.11/site-packages/datajoint/table.py:873, in Table.__make_placeholder(self, name, value, ignore_extra_fields)
    867         value = (
    868             str.encode(attachment_path.name)
    869             + b"\0"
    870             + attachment_path.read_bytes()
    871         )
    872 elif attr.is_filepath:
--> 873     value = self.external[attr.store].upload_filepath(value).bytes
    874 elif attr.numeric:
    875     value = str(int(value) if isinstance(value, bool) else value)

File ~/miniconda/envs/ucsf_cadwell-lab/lib/python3.11/site-packages/datajoint/external.py:281, in ExternalTable.upload_filepath(self, local_filepath)
    279 # check if the remote file already exists and verify that it matches
    280 check_hash = (self & {"hash": uuid}).fetch("contents_hash")
--> 281 if check_hash:
    282     # the tracking entry exists, check that it's the same file as before
    283     if contents_hash != check_hash[0]:
    284         raise DataJointError(
    285             f"A different version of '{relative_filepath}' has already been placed."
    286         )

ValueError: The truth value of an empty array is ambiguous. Use `array.size > 0` to check that an array is not empty.

Expected Behavior

The file should be inserted into the table without error, as it is with numpy versions prior to 2.2.

Additional Research and Context

Solution

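Patch DataJoint-Python so that it checks explicitly whether a fetched array is empty (for example with `array.size > 0`, as the error message suggests) instead of truth-testing the array, as `if check_hash:` does in ExternalTable.upload_filepath.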
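
One possible shape for such a patch is sketched below. It is not the maintainers' fix: `tracked_entry_matches` is a hypothetical standalone function that mirrors the check shown in the traceback for `ExternalTable.upload_filepath`, and `DataJointError` is redefined locally as a stand-in so the example runs without DataJoint or a database. The object arrays stand in for the result of `fetch("contents_hash")`, which is an assumption about its return type based on the error.

```python
import uuid

import numpy as np


class DataJointError(Exception):
    """Local stand-in for DataJoint's error class, so the sketch is self-contained."""


def tracked_entry_matches(check_hash, contents_hash, relative_filepath):
    """Hypothetical helper mirroring the check in upload_filepath.

    Returns False when no tracking entry exists, True when one exists and matches,
    and raises DataJointError when one exists with a different contents hash.
    """
    # Explicit emptiness test instead of `if check_hash:`.
    if check_hash.size > 0:
        if contents_hash != check_hash[0]:
            raise DataJointError(
                f"A different version of '{relative_filepath}' has already been placed."
            )
        return True
    return False


new_hash = uuid.uuid4()

# No tracking entry yet: the fetched array is empty and no error is raised.
print(tracked_entry_matches(np.array([], dtype=object), new_hash, "ref/file.gtf"))

# Tracking entry exists and matches the file being uploaded.
print(tracked_entry_matches(np.array([new_hash], dtype=object), new_hash, "ref/file.gtf"))
```

Using `len(check_hash) > 0` instead of `check_hash.size > 0` would behave the same way for the one-dimensional result assumed here.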

dimitri-yatsenko self-assigned this Jan 22, 2025