Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for tables with primary keys not named id #11

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

WisdomWolf
Copy link

No description provided.

@somakrdas somakrdas self-requested a review January 17, 2021 22:28
Copy link
Contributor

@somakrdas somakrdas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @WisdomWolf, thanks for your contribution! I enjoyed reviewing it and left my comments below. Sorry for my delay.

Would you be up for sharing the column name that motivated this change?
I'd love to learn about other conventions and use cases.

BTW, I wanted to offer - I'm happy to help with implementing some of these comments if you'd like. Let me know if interested.

I've also added @vineetgopal as a secondary PR reviewer. @vineetgopal, could you take a look after I approve on my end?

@@ -4,7 +4,7 @@
import sqlalchemy


__version__ = "0.0.4"
__version__ = "0.0.5"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This brought up a good discussion on if we should separate functional and version changes. Currently our answer is yes, let's separate functional from version changes. That way, we can separately decide how to bump the version.

Accordingly, could you revert this line?

Suggested change
__version__ = "0.0.5"

After this merges, I will (1) bump this and setup.py to a new version and (2) release.

@@ -19,7 +19,7 @@ def _get_column_python_type(column):


def _has_normal_id_primary_key(base_mapper):
"""Check if the primary key for base_mapper is an auto-incrementing integer `id` column"""
"""Check if the primary key for base_mapper is an auto-incrementing integer column"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 - could you generalize the remaining instances of id in this module, like symbol names and comments?

For example, _has_normal_id_primary_key_has_normal_primary_key.

Hoping to generalize all instances of id except any examples that should remain as-is.

Comment on lines +35 to +37
python_column_type == int
and primary_key_col.autoincrement in ("auto", True)
and primary_key_col.table == base_mapper.local_table
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you follow the single-indent style from before?

Suggested change
python_column_type == int
and primary_key_col.autoincrement in ("auto", True)
and primary_key_col.table == base_mapper.local_table
python_column_type == int
and primary_key_col.autoincrement in ("auto", True)
and primary_key_col.table == base_mapper.local_table

We should eventually set up a code formatter for this codebase to automate.

and primary_key_col.autoincrement in ("auto", True)
and primary_key_col.table == base_mapper.local_table
python_column_type == int
and primary_key_col.autoincrement in ("auto", True)
Copy link
Contributor

@somakrdas somakrdas Jan 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed an existing issue that _has_normal_id_primary_key returns true for primary columns that are not auto-incremented.

For example, in my company's codebase, we have Source table with a destination_id pkey which is a fkey to Destination.id. _has_normal_id_primary_key for this table returns true, even though it is not auto-incremented.

Notably, primary_key_col.autoincrement is auto so it passes this check, even though ultimately SQLAlchemy doesn't auto-increment this pkey. https://docs.sqlalchemy.org/en/13/core/metadata.html?highlight=autoincrement#sqlalchemy.schema.Column.params.autoincrement explains that auto only results in auto-increment if

  • Column is integer derived - checked on L35
  • Column is pkey - checked on L23
  • Column is not fkey - not checked - could check by adding not primary_key_col.foreign_keys to this check

Even though this is an existing issue, for that codebase, it started appearing with non-id pkeys, so I think it's worth fixing in this PR.
I'm curious what is the best way to check that primary_key_col is auto-incremented with SQLA. Some ideas:

Open to others! CC: @vineetgopal

)


def _get_primary_key_name(base_mapper):
primary_key_cols = base_mapper.primary_key
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed an existing issue that the module assumes the attribute name and column name are the same, but there could be exceptions to that if a caller defines attribute_name = db.Column("column_name", db.Integer, primary_key=True). This could've affected an id column name and could come up with non-id pkey column names.
My company's codebase has some columns like these, though none are pkeys, so I'm not sure how often it comes up in practice. So, to make the assumption clear, could we assert that the attribute name and column name are the same?

Here's a start, but open to simpler implementations:

[primary_key_col_attr] = [c for c in base_mapper.column_attrs if primary_key_col in c.columns]
assert primary_key_col.name == primary_key_col_attr.key, \
    "_get_primary_key_name only supports columns whose name equals their attribute name"

assert _has_normal_id_primary_key(base_mapper), \
"_get_id_sequence_name only supports single auto increment primary keys"
primary_key = _get_primary_key_name(base_mapper)
return f"{base_mapper.entity.__tablename__}_{primary_key}_seq"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This brought up a good discussion on how backward compatible we should be with Python 2. For this release, our answer is yes, let's remain backward compatible.

Accordingly, could you change to the old-style string formatting from before?

Suggested change
return f"{base_mapper.entity.__tablename__}_{primary_key}_seq"
"%s_%s_seq" % (base_mapper.entity.__tablename__, primary_key)

raise AssertionError("Expected models to have auto-incrementing `id` primary key")
raise AssertionError("Expected models to have auto-incrementing primary key")

primary_key = _get_primary_key_name(base_mapper)
Copy link
Contributor

@somakrdas somakrdas Jan 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you rename to primary_key_name to distinguish this pkey name (like the _get_primary_key_name function does) from other possible variables like pkey column (primary_key_col) or value?

Suggested change
primary_key = _get_primary_key_name(base_mapper)
primary_key_name = _get_primary_key_name(base_mapper)

@somakrdas somakrdas requested a review from vineetgopal January 22, 2021 04:35
@WisdomWolf
Copy link
Author

Sorry it's taken me a while to get back to this, been pretty busy recently. I should be able to address most of these requests, hopefully in the next few days.

Copy link
Contributor

@vineetgopal vineetgopal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks WisdomWolf for putting this up! I'm requesting changes for now as you're changing things re: @somakrdas's comments, but happy to review again afterwards.

@somakrdas
Copy link
Contributor

Hi @WisdomWolf, wanted to share an update with you on the SQLAlchemy side - SQLAlchemy now supports "batch inserts" (with retrieval of newly generated primary key values) natively in version 1.4: https://groups.google.com/g/sqlalchemy/c/GyAZTThJi2I/m/k14nv8HzEgAJ , https://docs.sqlalchemy.org/en/14/changelog/migration_14.html#psycopg2-dialect-features-execute-values-with-returning-for-insert-statements-by-default

@vineetgopal and I will discuss this more on our end for sqlalchemy_batch_inserts generally, but I wanted to share this with you (and anyone following along) just in case it affects your plans for this PR, since upgrading SQLAlchemy to 1.4 may handle your use case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants