-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clean up relation usage and default behavior #92
Conversation
@jtcohen6 the import is failing but I have no idea why, but I'm willing to bet it's just a rookie python typo. |
Co-authored-by: Jeremy Cohen <[email protected]>
…to relation-construction
issuesomething weird is happening with this jinja-sql
compiles to this
However, it fails starting with step 13 of the base test sequence (running the
@jtcohen6, does the
@dataclass
class SQLServerQuotePolicy(Policy):
database: bool = False
schema: bool = False
identifier: bool = False desired behaviorTL;DR I'd like
side note, square brackets are the defacto TSQL way to quote relation parts and column names, but not sure if that's possible to set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jtcohen6, does the global_project's incremental materialization (and others) change the
QuotePolicy
somehow? wild guess -- does the{{ this }}
relation object inherit the adapter's quoting policy?
Hm... the {{ this }}
relation object definitely inherits the adapter's quoting policy. It also doesn't seem like {{ this }}
is directly involved here. To be clear, dbt-sqlserver
reimplements the incremental materialization, so I'd think that this line is performing the bad drop
(because swappable
indeed exists_not_as_table
, it exists as a view). I'm not sure what's causing it to quote old_relation
, though. Feels like your best bet is to take this for a spin locally—create a model as a view, switch it to incremental, add a lot of logging between lines 17 and 32 of the incremental
materialization, and see when exactly it starts being quoted.
While on this subject, it's also worth checking to see if the default incremental
materialization still needs overriding—the only difference I can tell is explicitly writing {{ tmp_relation.include(schema=False, database=False) }}
, rather than implicitly with {{ tmp_relation }}
as the default does, even though the result (at least on dbt-postgres
) is exactly the same.
@@ -145,18 +132,18 @@ | |||
type='view')-%} | |||
{%- set temp_view_sql = sql.replace("'", "''") -%} | |||
|
|||
{{ sqlserver__drop_relation_script(tmp_relation) }} | |||
{{ sqlserver__drop_relation(tmp_relation) }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this, and the other instances below, can just be {{ adapter.drop_relation(tmp_relation) }}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch! qq: why adapter.drop_relation()
and not just drop_relation()
? is it to make sure the macro can't be overwritten by a macro in the project that is also named drop_relation
?
@jtcohen6 I've pulled in changes from #102 which drops the overriding of the incremental materialization. Unfortunately, the same issue is still happening, so I'm going to start locally logging/debugging the global project's incremental materialization. |
61f7a87
to
ddd0b2a
Compare
0427207
to
18903b4
Compare
class SQLServerRelation(BaseRelation): | ||
quote_policy: SQLServerQuotePolicy = SQLServerQuotePolicy() | ||
include_policy: SQLServerIncludePolicy = SQLServerIncludePolicy() | ||
quote_character: str = '"' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mikaelene this is the one thing I'm not sure about. can you forsee any issues using "
instead of '
to quote relation parts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a guru in quoting :-). Did we have '
before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we did, inexplicably though, the code change when I change it back to single quote. You're right that we should think about this change maybe breaking existing snapshot or incremental tables...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will definitely have to change this back to include database=True
by default to make cross-database selects work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code look really good. There is a lot of changes. How well have you tested this? Have you run it on a real dbt-project or just in the CI?
Great question. I actually just tried running a project that was previously run with (I believe) I haven't seen this error message before, maybe @jtcohen6 could explain. It seems it has to do with partial matches due to quoting differences? this might constitute a breaking change for users. What I don't understand is why:
|
Yes, that's right. If dbt finds an existing relation in the database that is similar to a model's relation-to-e, but not exactly the same (different quoting/casing), dbt will raise a compilation error to avoid accidentally overwriting a view/table it doesn't actually own. Here's the source for that behavior. It sounds like you ran this same project previously, using
This is the part I'm not sure about! If the
The reason to keep quoting is that, on several databases (e.g. Postgres), identifier names that start with certain characters must be quoted: |
5267d23
to
8d5f219
Compare
great question @jtcohen6. Looks like there wasn't a good reason and I just did it to try and solve a bug that I stirred up along the way. It isn't necessary. It is however necessary to change our relation part quote character from a single quote to double. @mikaelene I tested this on a local project and it works without issue. ready for for your further review/approval. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems sane. But It is a lot of code to review. I get a little to comfortable with the CI tests sometimes. I think I introduced a bug in snapshots. #108 . So I will review this when I have time.
class SQLServerRelation(BaseRelation): | ||
quote_policy: SQLServerQuotePolicy = SQLServerQuotePolicy() | ||
include_policy: SQLServerIncludePolicy = SQLServerIncludePolicy() | ||
quote_character: str = '"' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a guru in quoting :-). Did we have '
before?
WHERE name='{{ from_relation.schema }}_{{ from_relation.identifier }}_cci' and object_id = OBJECT_ID('{{ from_relation.schema }}.{{ to_relation.identifier }}')) | ||
EXEC sp_rename N'{{ from_relation.schema }}.{{ to_relation.identifier }}.{{ from_relation.schema }}_{{ from_relation.identifier }}_cci', N'{{ from_relation.schema }}_{{ to_relation.identifier }}_cci', N'INDEX' | ||
WHERE name='{{ from_relation.schema }}_{{ from_relation.identifier }}_cci' and object_id = OBJECT_ID('{{ from_relation }}')) | ||
EXEC sp_rename N'{{ from_relation }}.{{ from_relation.schema }}_{{ from_relation.identifier }}_cci', N'{{ from_relation.schema }}_{{ to_relation.identifier }}_cci', N'INDEX' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we prefixing with the entire relation now? That includes double-quoted schema and relation names, which is creating really bizarre output in my run:
EXEC sp_rename N'"dbo"."APPT_COMPLETED_MONTHLY__dbt_tmp".dbo_APPT_COMPLETED_MONTHLY__dbt_tmp_cci', N'dbo_APPT_COMPLETED_MONTHLY_cci', N'INDEX'
@swanderz I've spent several hours today trying to reimplement database switching (see PR #111) in this branch. However, I've run into a wall with the function
Because the temporary view is dropped twice, something screwy happens with the relation cache and the second drop view statement doesn't happen, leaving the temporary view in the database. I'm not certain what exactly is happening because honestly the relation cache is difficult to understand. My thought is to just drop all uses of Is it OK to replace all uses of |
@panasenco -- I very much appreciate you taking the time to try and make your PR with my refactor. TIL, I had no idea about the relation cache. Totally open to the "straightforward SQL" approach, but any chance you could open up a different PR with the code you were stuck on so we can look at the error you were seeing? If anything, for curiosity's sake. |
changes
relation.py
which sets the relation:"
to quote relation parts, andschema.identifier
{{ relation }}
instead of{{ relation.schema }}.{{ relation.identifier }}
)adapter.drop_relation
which deprecatessqlserver__drop_relation_script
snapshot_strategy_check_cols
test for Azure SQL (should have been re-enabled with Fix for passing the snapshot adapter test. Fixes #61 #74)