TS model: declarative mapping + prep for SQLAlchemy 2.0 #12666

Merged: 55 commits, Oct 14, 2021

Member

@jdavcs jdavcs commented Oct 8, 2021

Progress towards #10369 and #12541.

Justification: cleaning up and remapping the TS model declaratively simplifies migration to Alembic (not migrating the TS would be even simpler, but I think we can avoid that, and maintain a unified code base). Removing SAWarnings may be relevant for SQLAlchemy 2.0 (I don't know yet whether these warnings would have triggered errors with all the 2.0 flags enabled, but they all indicated mapping issues in any case).

Done:

  • Unit tests for all TS models verifying mapping
  • Remapped declaratively all but RepositoryMetadata (model contains .metadata attribute preventing declarative mapping; workaround would be intrusive and, I think, not really necessary. Leaving as is.)

Notable fixes:
(see commit messages for details)

Technical debt:

  • Remove dreadful code duplication from testing code (model / TS model testing setup). I'll handle this in a separate PR: I think it could be slightly more than a quick edit, and it feels too much out of scope given the focus of this PR.

How to test the changes?

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.

License

  • I agree to license these contributions under Galaxy's current license.
  • I agree to allow the Galaxy committers to license these and all my past contributions to the core galaxy codebase under the MIT license. If this condition is an issue, uncheck and just let us know why with an e-mail to [email protected].

@jdavcs jdavcs added kind/bug status/WIP kind/refactoring cleanup or refactoring of existing code, no functional changes area/testing area/toolshed area/database Galaxy's database or data access layer labels Oct 8, 2021
@jdavcs jdavcs added this to the 22.01 milestone Oct 8, 2021
@jdavcs jdavcs force-pushed the dev_ts_declarative branch from ba30ba3 to 82a56b2 Compare October 9, 2021 00:49
@jdavcs jdavcs force-pushed the dev_ts_declarative branch 5 times, most recently from 95079dd to 339f8a2 Compare October 13, 2021 00:03
@jdavcs jdavcs removed the status/WIP label Oct 13, 2021
@jdavcs jdavcs marked this pull request as ready for review October 13, 2021 00:48
jdavcs added 21 commits October 13, 2021 10:16
1. Add viewonly=True: we don't want this relationship to write to the db
2. Remove redundant relationship attributes (the joins are derived
   automatically based on existing foreign keys)
3. Fix test
There were 2 relationships defined on RepositoryMetadata: review and
reviews. The former (review) must have been added unintentionally via
backref. It was not referenced in the codebase, and its name was
incorrect too: that relationship holds a list of reviews.

However, the foreign_keys attribute on review was CORRECT, whereas the
correct relationship (reviews) had an incorrect foreign_keys attribute,
which resulted in an incorrect set of reviews (the set held at most 1
item). For an example of correct usage, see
https://docs.sqlalchemy.org/en/14/orm/relationship_api.html?highlight=foreign_keys#sqlalchemy.orm.relationship.params.foreign_keys

This commit corrects the foreign_keys attribute and fixes the unit test
that verifies correct behavior.
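
A minimal sketch of the pattern this commit message describes, assuming SQLAlchemy 1.4 or later. The `Revision` and `Review` models and their columns are hypothetical and are not the Tool Shed schema. When two foreign keys point at the same table, `foreign_keys` names the column that defines the join, and `viewonly=True` keeps the relationship read-only:

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Revision(Base):  # hypothetical stand-in for the parent model
    __tablename__ = "revision"
    id = Column(Integer, primary_key=True)
    # "review" has two foreign keys to this table, so the join column is named
    # explicitly; viewonly=True means this relationship never writes to the db.
    reviews = relationship(
        "Review", foreign_keys="Review.revision_id", viewonly=True
    )


class Review(Base):  # hypothetical child model
    __tablename__ = "review"
    id = Column(Integer, primary_key=True)
    revision_id = Column(Integer, ForeignKey("revision.id"))
    superseded_revision_id = Column(Integer, ForeignKey("revision.id"))


def count_reviews():
    """Store one revision with three reviews and return len(revision.reviews)."""
    engine = create_engine("sqlite://")  # in-memory database
    Base.metadata.create_all(engine)
    with Session(engine) as session:
        revision = Revision()
        session.add(revision)
        session.flush()  # assigns revision.id
        session.add_all([Review(revision_id=revision.id) for _ in range(3)])
        session.commit()
        return len(revision.reviews)
```

With the join column set this way, the collection holds every matching review rather than at most one.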

Victory at last!
Do not remap declaratively. The model contains '.metadata' attr; a
declaratively-mapped class cannot have a .metadata attribute (it is used
by SQLAlchemy's DeclarativeBase). Given that TS is currently (mostly) in
maintenance mode, it is reasonable to leave this as is rather than
handle all references to this attribute in the codebase.
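
The clash described above can be reproduced in a few lines, assuming SQLAlchemy 1.4 or later. `RepositoryMetadataSketch` is a hypothetical class used only for illustration, not the actual TS model:

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.exc import InvalidRequestError
from sqlalchemy.orm import declarative_base

Base = declarative_base()


def try_declarative_metadata_attribute():
    """Return the error message raised when a mapped class defines `.metadata`."""
    try:
        class RepositoryMetadataSketch(Base):  # hypothetical, not the TS model
            __tablename__ = "repository_metadata_sketch"
            id = Column(Integer, primary_key=True)
            metadata = Column(String)  # clashes with Base.metadata
    except InvalidRequestError as error:
        return str(error)
    return None
```

The usual workaround is to rename the Python attribute while keeping the column name, for example `metadata_ = Column("metadata", String)`. Every reference to `.metadata` in the codebase would then need to change, which is the intrusive part mentioned above.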
@jdavcs jdavcs force-pushed the dev_ts_declarative branch from 339f8a2 to 4fe97b6 Compare October 13, 2021 14:17
Member Author

jdavcs commented Oct 13, 2021

Rebased. Should take care of the mypy test failure.


# Misc. helper fixtures.

@pytest.fixture(scope='module')
Member

This sort of makes it less like a unit test. I don't think it's a huge problem, and the benefit is faster test runs, but I'm pointing this out since you mentioned some issues in the last backend call. class might be another scope that makes sense? Assuming table creation is the slowest part, you could create the sqlite file once on disk and then create a copy for each test.
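
A rough sketch of the copy-per-test idea, using only the standard library. The schema, file names, and helper functions are hypothetical; in a pytest suite the template would likely be built in a session-scoped fixture and copied in a function-scoped one:

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path

SCHEMA = "CREATE TABLE repository (id INTEGER PRIMARY KEY, name TEXT)"  # hypothetical


def build_template(directory):
    """Create the schema once; this is the slow step, paid a single time."""
    template = Path(directory) / "template.sqlite"
    connection = sqlite3.connect(template)
    connection.execute(SCHEMA)
    connection.commit()
    connection.close()
    return template


def fresh_database(template, name):
    """Copy the template so each test gets its own, initially empty database."""
    target = template.with_name(f"{name}.sqlite")
    shutil.copyfile(template, target)
    return sqlite3.connect(target)


def demo():
    """Write to one copy and return the row count seen by a second copy."""
    with tempfile.TemporaryDirectory() as directory:
        template = build_template(directory)
        first = fresh_database(template, "test_one")
        first.execute("INSERT INTO repository (name) VALUES ('a')")
        first.commit()
        second = fresh_database(template, "test_two")
        rows_in_second = second.execute(
            "SELECT COUNT(*) FROM repository"
        ).fetchone()[0]
        first.close()
        second.close()
        return rows_in_second
```

Rows written through one copy never appear in another, so no per-test cleanup is needed.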

Member Author

@jdavcs jdavcs Oct 14, 2021

This in-memory database gets cleaned up after each test (the data, not the db objects). That, I think, keeps tests independent. At least that's how I designed it to work (I was almost religious about automating this post-test cleanup, but I could've screwed up 😆). But I think/hope it works as expected: the db is clean at the start of each test, so it shouldn't be an issue. (I'm not sure what will happen if tests run in parallel, though; I haven't considered that.)
I went through many iterations of this setup for the main model and I did try doing it without the scope argument, so that the database was recreated for each test - it made it very noticeably slower (like you said - table creation). I didn't consider making a copy of the initially loaded database file though. If test overlap were/is a problem, that would be a reasonable alternative, I think.
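
The shared-database-plus-cleanup design described here can be sketched with the standard library alone. This is an illustration of the idea, not Galaxy's test code: the table and the `cleanup_after` helper are hypothetical, and a context manager stands in for the setup and teardown of a pytest fixture:

```python
import sqlite3
from contextlib import contextmanager

# Created once, like a module-scoped fixture; the table is hypothetical.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE repository (id INTEGER PRIMARY KEY, name TEXT)")


@contextmanager
def cleanup_after(conn, table, values):
    """Insert one row, yield its id, and delete the row when the block exits."""
    cursor = conn.execute(f"INSERT INTO {table} (name) VALUES (?)", values)
    row_id = cursor.lastrowid
    try:
        yield row_id
    finally:
        # Runs even if the test body raises, so the data never leaks.
        conn.execute(f"DELETE FROM {table} WHERE id = ?", (row_id,))
        conn.commit()


def row_count(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def run_one_test():
    """Return the row counts during and after a simulated test."""
    with cleanup_after(connection, "repository", ("a",)):
        during = row_count(connection, "repository")
    after = row_count(connection, "repository")
    return during, after
```

The tables are created once and only the rows are removed, which avoids the table-creation cost on every test. The trade-off is that every object a test creates has to be tracked for deletion.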

Member

> This in-memory database gets cleaned up after each test (the data, not the db objects). That, I think, keeps tests independent.

dbcleanup? I don't think that hits every related object (local tests, for instance, are failing for me, but that could be for other reasons as well), and you could eliminate this.

Member Author

That's not good - they never failed for me. It's designed to remove every object that gets loaded during the test. If the test uses a factory fixture, the objects it creates should be removed manually via delete_from_database() calls - missing one of those calls would be the most likely cause. If not, then it's not working as intended. What error are you getting?

Member

@mvdbeek mvdbeek Oct 14, 2021

I will look into this next time I see it, I'm just thinking that a new database for each test is conceptually much easier than hoping you catch all objects that a test creates.

Member Author

Thanks; I'll also take another look.

> ...I'm just thinking that a new database for each test is conceptually much easier than hoping you catch all objects that a test creates.

Indeed, it could be a better alternative. I am not completely sure it will work (it might - I just don't know yet), given all the other model instances most tests create. The main model mapping tests (in test/unit/data/model) create 900+ helper model instances just to set up the actual tests: a model A needs B to be instantiated, B needs C, etc. It's ~~turtles~~ models all the way down 😃 I'm not sure (yet) how they would all share the same db file that is created for each test. But maybe it's quite simple - I don't know yet. So, I think, if we run into an issue and this setup turns out to be flaky, I'll definitely explore this option.

Member Author

I'll merge this; but I think I've just come up with a reasonable way to verify the db is clean before each test - will post as a separate PR.
Thanks for reviewing and the discussion!
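
The approach taken in the follow-up PR is not shown in this thread. One possible way to check that a database is clean before a test, sketched here for SQLite with the standard library, is to count the rows in every user table. The `non_empty_tables` helper and the table names are hypothetical:

```python
import sqlite3


def non_empty_tables(conn):
    """Return {table_name: row_count} for every user table that still holds rows."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    leftovers = {}
    for name in names:
        count = conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        if count:
            leftovers[name] = count
    return leftovers


def demo():
    """Check a database before and after a stray row is inserted."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE repository (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE review (id INTEGER PRIMARY KEY)")
    clean = non_empty_tables(conn)
    conn.execute("INSERT INTO review DEFAULT VALUES")
    dirty = non_empty_tables(conn)
    conn.close()
    return clean, dirty
```

A fixture could assert that `non_empty_tables(conn)` is empty before yielding to each test, so a leak from an earlier test fails loudly and names the table involved.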

Member

@mvdbeek mvdbeek left a comment

Very cool

@jdavcs jdavcs merged commit 202bff4 into galaxyproject:dev Oct 14, 2021
@jdavcs jdavcs deleted the dev_ts_declarative branch October 22, 2021 16:12