Low verbosity error 1217 on delete prevents cascade #638

Open · samuelbray32 opened this issue Aug 31, 2023 · 32 comments
Labels: Database (Issues with Frank Lab database, not Spyglass code)

samuelbray32 (Collaborator) commented Aug 31, 2023

Describe the bug
Attempting to delete an entry within the Session table results in the following error: IntegrityError: (1217, 'Cannot delete or update a parent row: a foreign key constraint fails')

To Reproduce
Steps to reproduce the behavior:

  1. Run (Session & {"nwb_file_name": 'samtest20230817_.nwb'}).delete({"nwb_file_name": 'samtest20230817_.nwb'})

Expected behavior
The entry and all downstream entries should be removed

Additional context
The only thing done with this nwb file was insertion into spyglass, if that helps narrow down potential schema permission errors.

edeno (Collaborator) commented Aug 31, 2023

For reference, here are the key constraints:

[screenshots of the foreign key constraints]

khl02007 (Collaborator) commented:

@samuelbray32 this probably won't solve it, but did you try not passing the dictionary to delete? i.e. (Session & {"nwb_file_name": 'samtest20230817_.nwb'}).delete() instead of (Session & {"nwb_file_name": 'samtest20230817_.nwb'}).delete({"nwb_file_name": 'samtest20230817_.nwb'})? And I thought delete only gets rid of the downstream entries -- the upstream entries (e.g. those in Nwbfile) will need to be deleted manually.

edeno (Collaborator) commented Aug 31, 2023

I was able to delete Sharon's file when she had this issue. So one possibility is that this is a permissions issue.

The other thought is that experiment_description is varchar(2000), which is a problem with the current MySQL. The permissions issue seems more likely though.
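
As a rough way to survey where oversized varchar fields live, one could query information_schema directly (a sketch, assuming read access to information_schema through an existing datajoint connection; the 2000-character threshold is illustrative):

import datajoint as dj

# List varchar columns whose declared length is at or above the threshold.
rows = dj.conn().query(
    """
    SELECT table_schema, table_name, column_name, character_maximum_length
    FROM information_schema.columns
    WHERE data_type = 'varchar' AND character_maximum_length >= 2000
    ORDER BY character_maximum_length DESC
    """
).fetchall()
for schema_name, table_name, column_name, length in rows:
    print(f"{schema_name}.{table_name}.{column_name}: varchar({length})")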

samuelbray32 (Collaborator, Author) commented:

> @samuelbray32 this probably won't solve it, but did you try not passing the dictionary to delete? i.e. (Session & {"nwb_file_name": 'samtest20230817_.nwb'}).delete() instead of (Session & {"nwb_file_name": 'samtest20230817_.nwb'}).delete({"nwb_file_name": 'samtest20230817_.nwb'})? And I thought delete only gets rid of the downstream entries -- the upstream entries (e.g. those in Nwbfile) will need to be deleted manually.

The alternative delete call also didn't work. Attempting to delete from Nwbfile caused the same error. Also, I mistyped in the initial description when I said upstream; I've edited that now.

edeno (Collaborator) commented Aug 31, 2023

My current hypothesis is that the alison_-prefixed tables are causing the issue. The ms_ tables may also be problematic for other users, but @samuelbray32 has permissions for those tables.

acomrie (Collaborator) commented Aug 31, 2023

Hi, in case it's useful: I haven't populated these alison schemas or interacted with them at all in this timeframe, so I suppose it would have to be something about how they're instantiated. There are likely some dependencies on Session, as expected, but I'd expect others to have custom schemas that depend on Session as well.

edeno (Collaborator) commented Sep 2, 2023

To test the permissions part, @samuelbray32 can you reinsert your test file and try to delete it? Upon failure, I think @acomrie can try to delete it, which should work.
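
To compare the two accounts directly, each user can also ask the server what privileges it reports for them (a minimal sketch, run from each user's own datajoint connection):

import datajoint as dj

# Print every GRANT statement MySQL reports for the connected account.
for (grant,) in dj.conn().query("SHOW GRANTS FOR CURRENT_USER()").fetchall():
    print(grant)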

edeno (Collaborator) commented Sep 3, 2023

Possibly caused by #641

samuelbray32 (Collaborator, Author) commented:

There's another entry, samtestb20230817_.nwb, with the same issue that we can test on.

samuelbray32 self-assigned this Sep 18, 2023
samuelbray32 (Collaborator, Author) commented:

Created an issue for datajoint here

shijiegu commented Oct 9, 2023

Hi everyone. I might have a similar issue here. I have some data processed last year (2022) that I cannot open now. Relevant files:

position_info = (IntervalLinearizedPosition() &
                 {'nwb_file_name': 'molly20220416_.nwb','interval_list_name': 'pos 1 valid times',
                  'position_info_param_name': 'default'}
                ).fetch1_dataframe()

This works for data processed this year.

position_info = (IntervalLinearizedPosition() &
                 {'nwb_file_name': 'eliot20221016_.nwb','interval_list_name': 'pos 1 valid times',
                  'position_info_param_name': 'default'}
                ).fetch1_dataframe()

If you try to delete old entries in IntervalLinearizedPosition(), you get the same error: (1217, 'Cannot delete or update a parent row: a foreign key constraint fails').

CBroz1 (Member) commented Oct 9, 2023

Hi @shijiegu - are you seeing Error 1217 with fetch1_dataframe()? I would only expect to see this when deleting data. I'm able to run your first command just fine.

[screenshot of the successful fetch]

edeno (Collaborator) commented Oct 9, 2023

@shijiegu how up to date is your spyglass?

shijiegu commented Oct 9, 2023

@edeno I think you might be right. I only updated Spyglass this year in June...

edeno added the bug (Something isn't working) label Oct 19, 2023
edeno (Collaborator) commented Oct 30, 2023

@CBroz1's PR will hopefully fix this in datajoint when it is merged: datajoint/datajoint-python#1112

CBroz1 (Member) commented Nov 7, 2023

Closing with datajoint/datajoint-python#1112

CBroz1 closed this as completed Nov 7, 2023
CBroz1 (Member) commented Nov 9, 2023

(Session & restriction).delete() is still an issue for @sharon-chiang. In her case, it is a permission error and not a mysql 8 error code issue. I think this is solvable when we tackle the spike-sorting unix user group issue.

CBroz1 reopened this Nov 9, 2023
edeno (Collaborator) commented Nov 9, 2023

Could you give a little more context for what the error was?

CBroz1 added the infrastructure (Unix, MySQL, etc. settings/issues impacting users) label and removed the bug (Something isn't working) label Nov 9, 2023
sharon-chiang (Contributor) commented:

The error was the same 1217 integrity error as previously encountered. This occurred for (sgc.Session() & {'nwb_file_name': 'SC3820230615_.nwb'}).delete() after pulling the most recent updates from datajoint.

edeno (Collaborator) commented Dec 18, 2023

Just to update from discussions with @CBroz1:

This issue should not affect people newly setting up spyglass, so far as we know. It is currently only a problem with the Frank Lab database.

CBroz1 (Member) commented Dec 18, 2023

I've (a) reached out to our db admin to request that logs be turned on to better monitor occurrences, (b) reached out to a member of the datajoint team for insights, and (c) posted to stack overflow for community input

CBroz1 (Member) commented Feb 28, 2024

I have continued to pursue solutions, as described in the post here.

CBroz1 (Member) commented Mar 25, 2024

This list details which tables do and do not permit cascading deletes for low-privilege users. Attempting to delete from each table resulted in one of three outcomes (a sketch of the probing procedure follows the table):

  • deleted - a delete from this table works fine
  • verbose - a delete from this table returns the verbose error DataJoint needs to continue the cascade, pointing to the blocking table
  • PROBLEM - a delete from this table points to the blocking table for high- but not low-privilege users

| Schema          | Table Name                        | Result  | Blocking Table            |
| --------------- | --------------------------------- | ------- | ------------------------- |
| common_nwbfile  | nwbfile                           | PROBLEM | _session                  |
| common_session  | _session                          | PROBLEM | position_source           |
| common_behav    | position_source                   | verbose | raw_position              |
| common_behav    | _raw_position                     | PROBLEM | _raw_position__pos_object |
| common_behav    | _raw_position__pos_object         | deleted |                           |
| common_behav    | position_source__spatial_series   | verbose | raw_position__pos_object  |
| common_dio      | _d_i_o_events                     | deleted |                           |
| common_ephys    | _electrode_group                  | PROBLEM | _electrode                |
| common_ephys    | _electrode                        | deleted |                           |
| common_ephys    | _raw                              | deleted |                           |
| common_ephys    | _sample_count                     | deleted |                           |
| common_interval | interval_list                     | PROBLEM | _position_interval_map    |
| common_behav    | __position_interval_map           | deleted |                           |
| common_task     | _task_epoch                       | PROBLEM | _video_file               |
| common_behav    | _video_file                       | deleted |                           |
| common_session  | _session__data_acquisition_device | deleted |                           |
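
For reference, a minimal sketch of one way to run such a probe (assumptions: an existing datajoint connection with DELETE privileges on the target table; the table name at the bottom is illustrative; the transaction is always rolled back, so nothing is actually removed):

import datajoint as dj

def probe_delete(full_table_name):
    """Attempt a non-cascading delete inside a transaction, then roll back.

    Returns 'deleted' if the delete succeeds, or the error message if a
    foreign key constraint (e.g., error 1217) blocks it.
    """
    conn = dj.conn()
    ft = dj.FreeTable(conn, full_table_name)
    conn.start_transaction()
    try:
        ft.delete_quick()  # no cascade, no confirmation prompt
        return "deleted"
    except Exception as err:  # low-verbosity 1217 surfaces here
        return f"blocked: {err}"
    finally:
        conn.cancel_transaction()  # always roll back the dry run

print(probe_delete("`common_session`.`_session`"))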


CBroz1 changed the title from "Session deletion error" to "Low verbosity error 1217 on delete prevents cascade" Apr 24, 2024
CBroz1 added the Database (Issues with Frank Lab database, not Spyglass code) label and removed the bug (Something isn't working) and infrastructure (Unix, MySQL, etc. settings/issues impacting users) labels May 6, 2024
shijiegu commented May 9, 2024

This issue persists:

(LFPSelection() & {'nwb_file_name' :'eliot20221025_.nwb'}).delete()

gives:

IntegrityError: (1217, 'Cannot delete or update a parent row: a foreign key constraint fails')

shijiegu commented May 9, 2024

Hi! I see the various attempts above, but at the end of the day, if a researcher wants to delete an entry, they should be able to do it. Now I need to traverse all child tables myself to find a potential blocking table.
This error was not there, at least for me, prior to June 2023. It was caused by changes/upgrades to the database; I expect to receive a concrete fix, i.e., some scripts or an alternate delete function that I can call.

samuelbray32 (Collaborator, Author) commented:

As we've discussed in lab meetings, this is a permissions issue specific to our implemented database, and we have collectively agreed on the working solution of escalating a user's permissions as needed.

If you are experiencing this error, please reach out to the database admins with the username(s) experiencing the issue to have their permissions altered.
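
For admins, the escalation might look like the following (a sketch only: the account name and schema pattern are placeholders, and the exact privilege set should follow lab policy):

import datajoint as dj

# Run from a privileged (admin) connection; names below are placeholders.
dj.conn().query(
    r"GRANT SELECT, INSERT, UPDATE, DELETE, REFERENCES "
    r"ON `common\_%`.* TO 'affected_user'@'%'"
)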

shijiegu commented May 9, 2024

Thanks Sam. I vaguely remember this discussion, spanning multiple lab meetings. I also recall a suggestion to implement a special delete function to call in an event like this.

I also need to point out that no one has perfect memory of every issue with the database. A written resolution on how to proceed temporarily, especially in open issues, would be very helpful.

shijiegu commented May 9, 2024

As a second note, the scattered comments above do not fully explain why I need to traverse child tables, since the principle of DataJoint is that child tables are subject to their parent tables.

I could guess that a downstream table entry's privileges differ from its parents' because it is a more precious table, like curated spike sorting, but the reason is not fully spelled out. To better help users fix their own problems, some summarized knowledge on this kind of long, open issue would be helpful too.

edeno (Collaborator) commented May 9, 2024

Hi @shijiegu, please review the Code of Conduct and come talk to me.

CBroz1 (Member) commented Jun 25, 2024

In a further attempt to explore this issue, I dropped all schemas on our development server, declared empty tables with datajoint, and then loaded a mysqldump of common_. The verbosity issue persists.

This dump...

  1. had --skip-add-drop-table to avoid dropping tables during the load
  2. had edits to CREATE TABLE to include IF NOT EXISTS
  3. had edits to INSERT to ignore duplicates, specifically for conflicting log lines related to declaration
script
#!/bin/bash

# MySQL credentials
USER="cbroz"
PROD_PASS="redacted"
PROD_HOST="redacted"
DEV_PASS="redacted"
DEV_HOST="redacted"
OUTPUT_DIR="/home/cbroz/wrk/alt/common_dump"
DATABASE_LIST="temp-dump-list.txt"
DUMP_FILE="all_common_databases.sql"

mkdir -p "$OUTPUT_DIR" # Create output directory if it doesn't exist
DATABASES=$(tr '\n' ' ' < "$DATABASE_LIST") # Load text list of databases

echo "Dumping databases: $DATABASES"
# dump all, skipping drop and create info
mysqldump \
  -u "$USER" -p"$PROD_PASS" -h "$PROD_HOST" \
  --skip-add-drop-table \
  --databases $DATABASES > "$OUTPUT_DIR/$DUMP_FILE"

if [ $? -eq 0 ]; then
    echo "Successfully dumped databases: $DATABASES"
else
    echo "Error dumping databases: $DATABASES"
fi

sed -i 's/CREATE TABLE /CREATE TABLE IF NOT EXISTS /g' "$OUTPUT_DIR/$DUMP_FILE"
sed -i 's/INSERT INTO /INSERT IGNORE INTO /g' "$OUTPUT_DIR/$DUMP_FILE"

echo "Loading databases from $DUMP_FILE"
mysql \
  -u "$USER" -p"$DEV_PASS" -h "$DEV_HOST" \
  < "$OUTPUT_DIR/$DUMP_FILE"

CBroz1 (Member) commented Sep 27, 2024

This script captures which tables have the greatest discrepancy between allocated varchar length and actual usage.

Summary
from functools import cached_property

import datajoint as dj
from tqdm import tqdm

from spyglass.utils.database_settings import SHARED_MODULES

schema = dj.schema("cbroz_temp")


@schema
class AllTables(dj.Manual):
    definition = """
    table : varchar(255)
    """

    @cached_property
    def all_tables(self):
        shared_schemas = [
            s for s in dj.list_schemas() if s.split("_")[0] in SHARED_MODULES
        ]
        all_tables = []
        for schema in shared_schemas:
            print(f"Checking schema: {schema}")
            schema_tables = dj.Schema(schema).list_tables()
            all_tables.extend([f"{schema}.{table}" for table in schema_tables])
        return all_tables

    def process(self):
        processed = self.fetch("table")
        inserts = []
        for table in tqdm(self.all_tables, desc="Processing tables"):
            if table in processed:
                continue
            tqdm.write(f"Inserting table: {table}")
            inserts.append({"table": table})
        self.insert(inserts)


@schema
class KeyLenChecker(dj.Computed):
    definition = """
    -> AllTables
    field='': varchar(255)
    ---
    alloc=0 : int
    max=0 : int
    """

    ft_cache = {}

    def get_ft(self, table):
        if table not in self.ft_cache:
            self.ft_cache[table] = dj.FreeTable(dj.conn(), table)
        return self.ft_cache[table]

    def make(self, key):
        table = key["table"]
        ft = self.get_ft(table)  # reuse the cached FreeTable helper

        parent_fks = []  # Reduce duplicate fields
        for parent in ft.parents(as_objects=True):
            parent_fks.extend(parent.primary_key)

        alloc = {  # Get allocated space for varchar fields
            k: v.type.split("(")[1].split(")")[0]
            for k, v in ft.heading.attributes.items()
            if v.in_key and k not in parent_fks and "varchar" in v.type
        }

        if not alloc:
            self.insert1({"table": table})
            return

        try:
            max_lens = (
                dj.U()
                .aggr(ft, **{k: f"MAX(CHAR_LENGTH({k}))" for k in alloc.keys()})
                .fetch1()
            )
        except Exception as e:
            print(f"Error: {e}")
            return

        self.insert(
            [
                {"table": table, "field": k, "alloc": v, "max": max_lens[k]}
                for k, v in alloc.items()
            ]
        )


if __name__ == "__main__":
    # AllTables().process() # uncomment if new tables are added
    kl = KeyLenChecker()
    kl.populate(display_progress=True)
    print(
        (kl & 'max > 1').proj("max", diff="alloc-max")
        & dj.Top(limit=20, order_by="diff DESC")
    )

samuelbray32 (Collaborator, Author) commented:

A note for future debugging:

Updates in datajoint == 0.14.3 prevent the 1217 error code from being raised in this case. The diagnostic error output is now:

AttributeError: 'NoneType' object has no attribute 'groupdict'
Full error stack
IntegrityError                            Traceback (most recent call last)
File ~/miniforge3/envs/spyglass_sort/lib/python3.9/site-packages/datajoint/table.py:519, in Table.delete.<locals>.cascade(table)
    518 try:
--> 519     delete_count = table.delete_quick(get_count=True)
    520 except IntegrityError as error:

File ~/miniforge3/envs/spyglass_sort/lib/python3.9/site-packages/datajoint/table.py:474, in Table.delete_quick(self, get_count)
    473 query = "DELETE FROM " + self.full_table_name + self.where_clause()
--> 474 self.connection.query(query)
    475 count = (
    476     self.connection.query("SELECT ROW_COUNT()").fetchone()[0]
    477     if get_count
    478     else None
    479 )

File ~/miniforge3/envs/spyglass_sort/lib/python3.9/site-packages/datajoint/connection.py:343, in Connection.query(self, query, args, as_dict, suppress_warnings, reconnect)
    342 try:
--> 343     self._execute_query(cursor, query, args, suppress_warnings)
    344 except errors.LostConnectionError:

File ~/miniforge3/envs/spyglass_sort/lib/python3.9/site-packages/datajoint/connection.py:299, in Connection._execute_query(cursor, query, args, suppress_warnings)
    298 except client.err.Error as err:
--> 299     raise translate_query_error(err, query)

IntegrityError: Cannot delete or update a parent row: a foreign key constraint fails

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
Cell In[29], line 1
----> 1 (CuratedSpikeSorting & auto_curation_out_key).cautious_delete()

File ~/Src/spyglass/src/spyglass/utils/dj_mixin.py:503, in SpyglassMixin.cautious_delete(self, force_permission, dry_run, *args, **kwargs)
    496 if dry_run:
    497     return (
    498         IntervalList(),  # cleanup func relies on downstream deletes
    499         external["raw"].unused(),
    500         external["analysis"].unused(),
    501     )
--> 503 super().delete(*args, **kwargs)  # Confirmation here
    505 for ext_type in ["raw", "analysis"]:
    506     external[ext_type].delete(
    507         delete_external_files=True, display_progress=False
    508     )

File ~/miniforge3/envs/spyglass_sort/lib/python3.9/site-packages/datajoint/table.py:623, in Table.delete(self, transaction, safemode, force_parts, force_masters)
    621 # Cascading delete
    622 try:
--> 623     delete_count = cascade(self)
    624 except:
    625     if transaction:

File ~/miniforge3/envs/spyglass_sort/lib/python3.9/site-packages/datajoint/table.py:521, in Table.delete.<locals>.cascade(table)
    519     delete_count = table.delete_quick(get_count=True)
    520 except IntegrityError as error:
--> 521     match = foreign_key_error_regexp.match(error.args[0]).groupdict()
    522     # if schema name missing, use table
    523     if "`.`" not in match["child"]:

AttributeError: 'NoneType' object has no attribute 'groupdict'
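
For context, the failure mode is the same low-verbosity message: datajoint's foreign_key_error_regexp expects the long form of the 1217 error, which names the child table and constraint, so the short form that low-privilege users receive fails to match and match() returns None. A minimal reproduction (the pattern below is a simplified stand-in for the real one in datajoint/table.py):

import re

# Simplified stand-in for datajoint's foreign_key_error_regexp, which
# expects the long-form message naming the child table and constraint.
fk_regexp = re.compile(
    r"Cannot delete or update a parent row: a foreign key constraint fails "
    r"\((?P<child>[^,]+), CONSTRAINT .* FOREIGN KEY .*\)"
)

# Long form (seen by high-privilege users): matches, cascade can continue.
long_msg = (
    "Cannot delete or update a parent row: a foreign key constraint fails "
    "(`common_session`.`_session`, CONSTRAINT `fk` FOREIGN KEY (`nwb_file_name`))"
)
print(fk_regexp.match(long_msg).groupdict())  # {'child': '`common_session`.`_session`'}

# Short form (seen by low-privilege users): no match, so calling
# .groupdict() on the resulting None raises AttributeError.
short_msg = "Cannot delete or update a parent row: a foreign key constraint fails"
print(fk_regexp.match(short_msg))  # None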
