Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for tables with primary keys not named id #11

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
31 changes: 20 additions & 11 deletions sqlalchemy_batch_inserts/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
import sqlalchemy


__version__ = "0.0.4"
__version__ = "0.0.5"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This brought up a good discussion on if we should separate functional and version changes. Currently our answer is yes, let's separate functional from version changes. That way, we can separately decide how to bump the version.

Accordingly, could you revert this line?

Suggested change
__version__ = "0.0.5"

After this merges, I will (1) bump this and setup.py to a new version and (2) release.



def _group_models_by_base_mapper(initial_models):
Expand All @@ -19,7 +19,7 @@ def _get_column_python_type(column):


def _has_normal_id_primary_key(base_mapper):
"""Check if the primary key for base_mapper is an auto-incrementing integer `id` column"""
"""Check if the primary key for base_mapper is an auto-incrementing integer column"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 - could you generalize the remaining instances of id in this module, like symbol names and comments?

For example, _has_normal_id_primary_key_has_normal_primary_key.

Hoping to generalize all instances of id except any examples that should remain as-is.

primary_key_cols = base_mapper.primary_key
if len(primary_key_cols) != 1:
return False
Expand All @@ -32,16 +32,23 @@ def _has_normal_id_primary_key(base_mapper):
python_column_type = None

return (
primary_key_col.name == "id"
and python_column_type == int
and primary_key_col.autoincrement in ("auto", True)
and primary_key_col.table == base_mapper.local_table
python_column_type == int
and primary_key_col.autoincrement in ("auto", True)
Copy link
Contributor

@somakrdas somakrdas Jan 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed an existing issue that _has_normal_id_primary_key returns true for primary columns that are not auto-incremented.

For example, in my company's codebase, we have Source table with a destination_id pkey which is a fkey to Destination.id. _has_normal_id_primary_key for this table returns true, even though it is not auto-incremented.

Notably, primary_key_col.autoincrement is auto so it passes this check, even though ultimately SQLAlchemy doesn't auto-increment this pkey. https://docs.sqlalchemy.org/en/13/core/metadata.html?highlight=autoincrement#sqlalchemy.schema.Column.params.autoincrement explains that auto only results in auto-increment if

  • Column is integer derived - checked on L35
  • Column is pkey - checked on L23
  • Column is not fkey - not checked - could check by adding not primary_key_col.foreign_keys to this check

Even though this is an existing issue, for that codebase, it started appearing with non-id pkeys, so I think it's worth fixing in this PR.
I'm curious what is the best way to check that primary_key_col is auto-incremented with SQLA. Some ideas:

Open to others! CC: @vineetgopal

and primary_key_col.table == base_mapper.local_table
Comment on lines +35 to +37
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you follow the single-indent style from before?

Suggested change
python_column_type == int
and primary_key_col.autoincrement in ("auto", True)
and primary_key_col.table == base_mapper.local_table
python_column_type == int
and primary_key_col.autoincrement in ("auto", True)
and primary_key_col.table == base_mapper.local_table

We should eventually set up a code formatter for this codebase to automate.

)


def _get_primary_key_name(base_mapper):
primary_key_cols = base_mapper.primary_key
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed an existing issue that the module assumes the attribute name and column name are the same, but there could be exceptions to that if a caller defines attribute_name = db.Column("column_name", db.Integer, primary_key=True). This could've affected an id column name and could come up with non-id pkey column names.
My company's codebase has some columns like these, though none are pkeys, so I'm not sure how often it comes up in practice. So, to make the assumption clear, could we assert that the attribute name and column name are the same?

Here's a start, but open to simpler implementations:

[primary_key_col_attr] = [c for c in base_mapper.column_attrs if primary_key_col in c.columns]
assert primary_key_col.name == primary_key_col_attr.key, \
    "_get_primary_key_name only supports columns whose name equals their attribute name"

[primary_key_col] = primary_key_cols
return primary_key_col.name


def _get_id_sequence_name(base_mapper):
assert _has_normal_id_primary_key(base_mapper), "_get_id_sequence_name only supports id primary keys"
return "%s_id_seq" % base_mapper.entity.__tablename__
assert _has_normal_id_primary_key(base_mapper), \
"_get_id_sequence_name only supports single auto increment primary keys"
primary_key = _get_primary_key_name(base_mapper)
return f"{base_mapper.entity.__tablename__}_{primary_key}_seq"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This brought up a good discussion on how backward compatible we should be with Python 2. For this release, our answer is yes, let's remain backward compatible.

Accordingly, could you change to the old-style string formatting from before?

Suggested change
return f"{base_mapper.entity.__tablename__}_{primary_key}_seq"
"%s_%s_seq" % (base_mapper.entity.__tablename__, primary_key)



def tuples_to_scalar_list(tuples):
Expand Down Expand Up @@ -153,11 +160,13 @@ def batch_populate_primary_keys(
if skip_unsupported_models:
continue
else:
raise AssertionError("Expected models to have auto-incrementing `id` primary key")
raise AssertionError("Expected models to have auto-incrementing primary key")

primary_key = _get_primary_key_name(base_mapper)
Copy link
Contributor

@somakrdas somakrdas Jan 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you rename to primary_key_name to distinguish this pkey name (like the _get_primary_key_name function does) from other possible variables like pkey column (primary_key_col) or value?

Suggested change
primary_key = _get_primary_key_name(base_mapper)
primary_key_name = _get_primary_key_name(base_mapper)


# In general, batch_populate_primary_keys shouldn't assume anything about how people are creating
# models - it is possible for models to have their ids already specified.
models = [model for model in models if model.id is None]
models = [model for model in models if getattr(model, primary_key) is None]

if skip_if_single_model and len(models) <= 1:
continue
Expand All @@ -166,7 +175,7 @@ def batch_populate_primary_keys(

models = sorted(models, key=lambda model: sqlalchemy.inspect(model).insert_order)
for id_, model in zip(new_ids, models):
model.id = id_
setattr(model, primary_key, id_)


def enable_batch_inserting(sqla_session):
Expand Down