
Feature: cost-effective merge for partitioned incremental models on BigQuery #1971

Closed

Conversation

@jtcohen6 (Contributor) commented Dec 3, 2019

Attempt an implementation of the solution suggested by @jarlainnix in dbt-labs/dbt-bigquery#1034

When an incremental model runs incrementally on BigQuery, if (and only if) the model is partitioned, dbt should:

  • create a temporary table using the model SQL (including the is_incremental() filter that limits the input set)
  • run a statement to get the min and max partition values from that temporary table
  • filter the destination (already existing) table by that partition range when merging, to limit query cost (see the sketch below)
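
As a rough sketch (with hypothetical table and column names, not the exact SQL the macro generates), the resulting merge would look something like this, with the date bounds coming from the min/max lookup against the temp table:

merge into `my_project.analytics.events` as DBT_INTERNAL_DEST
using `my_project.analytics.events__dbt_tmp` as DBT_INTERNAL_SOURCE
on
    -- scan only the affected partitions of the destination table
    DBT_INTERNAL_DEST.event_date between date '2019-12-01' and date '2019-12-03'
    and DBT_INTERNAL_SOURCE.event_id = DBT_INTERNAL_DEST.event_id

when matched then update set
    event_date = DBT_INTERNAL_SOURCE.event_date,
    payload = DBT_INTERNAL_SOURCE.payload

when not matched then insert
    (event_id, event_date, payload)
values
    (event_id, event_date, payload)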

The way I've implemented this is quite verbose and requires adding a keyword arg to the contract of get_merge_sql. A better approach may leverage **kwargs. I'm very open to feedback.

@jtcohen6 requested a review from drewbanin, December 3, 2019
@cla-bot added the cla:yes label, Dec 3, 2019
@dbt-labs deleted a comment from Sherm4nLC, Dec 4, 2019
@jarlainnix commented

🤞 @jtcohen6 Thanks a lot. I hope it goes through. Can't wait for it. I believe it makes dbt better prepared for bigger data sets and better able to withstand the test of time. 😃

@drewbanin (Contributor) left a comment

This is groovy, thanks for sending over the PR @jtcohen6!

I have some tiny thoughts about the code - I've dropped them inline here. I think the approach is a good one though. Let me spend some time testing this out :)

@drewbanin (Contributor) left a comment

see comments

@inve1 commented Dec 13, 2019

Hey, I was looking to do something like this and stumbled upon your PR. I tried it and it works well.

I have a couple of questions/comments:

  • Why not select just the partition_by field and create a temp table with just that field? It would make that scan cheaper if the original select had a lot more columns. (I think if you do something like select {{partition_by}} from ({{original_select}}), BigQuery is smart enough to only scan that one column even if more were mentioned in the original.)
  • If you don't want to select just the partition_by field for some reason that went over my head, why not set source_sql in the merge as
(
    select * from {{tmp_relation}}
)

in situations where tmp_relation gets created? If the original SQL reduced the size of the original data, scanning the filtered/aggregated tmp_relation can be much cheaper. I tried that and it cut the scanned GB to a quarter in the data I was testing with.

  • I'm looking at creating incremental models that look at something like the last 3 days of data and just overwrite those partitions at an interval. In that case I wouldn't need to figure out what the partition range is; it could be generated at compile time. I don't really have a suggestion here, but do you think this use case could be supported in a nice way after the change here? Maybe allow passing dest_partition from the model?

Anyway, thanks for doing this, it's really helpful 👍

@jtcohen6 (Contributor, Author) commented Dec 13, 2019

@inve1 Thanks for trying it out, and for the great ideas/comments!

Even cheaper!

Between your two suggestions, I prefer the second, where we merge using the temporary table we previously created for the purposes of getting partition min/max:

     {% set source_sql -%}
       (
         select * from {{tmp_relation}}
       )
     {%- endset -%}

After some quick testing, I think this would be the most cost-effective.

1. Create partitioned temp table: charged for full scan of data needed by {{sql}} (model SQL)
2. Get partition min/max from partitioned temp table: negligible
3. Merge from temp table into existing table: charged for full scan of only the temp table + scan of selected partitions in preexisting table
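
For step 2, only the partitioning column of the temp table needs to be scanned, which is why the cost is negligible. A sketch of that lookup, with made-up names:

select
    min(my_date_column) as partition_min,
    max(my_date_column) as partition_max
from `my_project.analytics.my_model__dbt_tmp`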

Per your first suggestion, there may be some cases where it's extremely cheap to get the partition min/max values from the model SQL as

select min({{partition_by}}), max({{partition_by}}) from (
    {{ sql }}
)

but then we would be charged the entire cost of the model SQL over again when we want to merge in step 3.

For unpartitioned tables, we should still skip creating a temp table and instead use

     {% set source_sql -%}
       (
         {{ sql }}
       )
     {%- endset -%}

because we're only going to be using the model SQL once, in the merge statement.

Partition overwrite

This is a great question—and I think it's asking after a genuinely different materialization strategy, one that performs a counterintuitive (but highly effective) merge like:

merge into {{ target }} as DBT_INTERNAL_DEST
using {{ source }} as DBT_INTERNAL_SOURCE
on FALSE

when not matched by source
    and DBT_INTERNAL_DEST.{{partition_by}} between {{partition_min}} and {{partition_max}}
    then delete

when not matched then insert
    ({{ dest_cols_csv }})
values
    ({{ dest_cols_csv }})

This, to me, is an insert_overwrite incremental strategy rather than a unique_key strategy, and it's very much in line with how we use incremental models on Spark.

@drewbanin What do you think about adding first-class support for this as an alternative strategy for incremental models on BQ, in the same way that we support two incremental strategies on Snowflake?
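
For the user, that could be as simple as a model-level config along the lines of Snowflake's existing incremental_strategy option (a hypothetical sketch, not something this PR implements):

{{ config(
    materialized = 'incremental',
    incremental_strategy = 'insert_overwrite',
    partition_by = 'event_date'
) }}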

@inve1 commented Dec 13, 2019

Yeah I agree the "selecting the temp table" approach seems a little better.

On that note, you could create a real BigQuery temp table and do the whole merge job in one request.
I took the three queries that this PR generated for my model and created a script like this:

DECLARE
  partition_min,
  partition_max timestamp; 
CREATE TEMP TABLE scans__dbt_tmp -- scans is just the name of my model
PARTITION BY
  DATE(start_time) AS (
  SELECT
    ......
  FROM
   .....
);
SET
  (partition_min,
    partition_max) = (
  SELECT
    AS STRUCT MIN(start_time), -- partition_by column
    MAX(start_time)
  FROM
    scans__dbt_tmp);

MERGE INTO
  .... AS DBT_INTERNAL_DEST
USING
  (
  SELECT
    *
  FROM
    `scans__dbt_tmp` ) AS DBT_INTERNAL_SOURCE
ON
  DBT_INTERNAL_DEST.start_time BETWEEN partition_min AND partition_max
  AND {{unique_key_constraint_here}}
  WHEN MATCHED THEN UPDATE SET ....
  WHEN NOT MATCHED
  THEN ..... VALUES ......

This has the advantage of not creating a table you can see in your datasets and not getting charged for storing the temp table for 12 hours. I thought scanning from the temp table would be free, but never mind, I just read the right docs: temp table scans are charged, just the storage isn't.
I actually have models that make more rows than the source had, so technically re-scanning the source as implemented here originally would be cheaper, but this approach would handle that case too.
The only issue (not sure what your policy is on this) is that BigQuery scripting is in beta.

Link to docs in case you haven't seen these

@jtcohen6 (Contributor, Author) commented Dec 13, 2019

That's really interesting stuff! I've only done a little reading about "true" temp tables and scripting in BQ, since they seem to be quite new as features. It looks like scripts are BQ's way of handling atomic transactions that require several DDL statements.

I'll definitely talk more about it with some of the folks here. For the moment, I think the use of merge gives this operation full atomicity. The only downside of the current approach is, as you say, the annoyance and (minuscule) cost of a __dbt_tmp table that hangs around for 12 hours.

Edit: @drewbanin and I had a chance to play around with scripting quickly. Very cool. It seems like this is a much more straightforward way to code up exactly what we want here. I'll give this a spin soon.

@jtcohen6 (Contributor, Author) commented Jan 5, 2020

I've rewritten this update to the BQ incremental materialization such that it now leverages the beta scripting feature.

This required some additional code in bigquery__create_table_as to enable the creation of "true" (scripting-only) temporary tables. In that process, I found that temp tables cannot include partition by clauses, else querying them returns an empty result set. [Edit 20 Jan: this was due to a bug that seems to have been resolved since.]

Extensions to incremental materialization

Now that this script-based framework has made the code more straightforward, I think it will be very simple to enable the insert_overwrite incremental strategy by creating a common_get_insert_overwrite_merge_sql macro that executes the merge on false approach I sketched out above. I'll take a look at the Snowflake incremental materialization for the code that enables strategy-picking.

Extensions to scripting

Based on the work in #1967, I'd love to allow users to specify their is_incremental() filter as a BQ scripting SET statement inside a sql_header block, instead of needing to call a statement that runs select max() from {{this}}.

In the simplest case, that looks like:

{%- if is_incremental() -%}
{%- call sql_header -%}
DECLARE
    max_from_this date;

SET
    (max_from_this) = (select as struct max(my_date) from {{this}});
{%- endcall -%}
{%- endif -%}

with base as (

    select * from source_table
    {% if is_incremental() %}
    where my_date >= max_from_this
    {% endif %}

)

select * from base

The trickiness comes with integrating the script-based work I've done here. Since each BQ script can only have one DECLARE statement and it must come first—though the declared variables can be defined in multiple SET statements throughout—I plan to add a declare keyword to the config that takes an array of scripting variables to be defined first.
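
Purely to illustrate that plan (the declare config key is hypothetical and does not exist yet), it might look like:

{{ config(
    materialized = 'incremental',
    partition_by = 'my_date',
    declare = ['max_from_this date']
) }}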

Edit: Caveat

I just reread the BQ docs and saw that they have beta support for integer-based partitioning. There are a couple of places in this new script-based approach that presume the partitioning column to be either date or date(timestamp). I think we will need to either:
(a) Update dbt's partition_by config to enable the user to supply a data type (date, timestamp, integer)
(b) Grab the data type of the partition_by column from information schema

@jtcohen6 (Contributor, Author) commented Jan 20, 2020

New partition_by spec

In order to support the smarter, cost-effective, scripting-based incremental materialization and new integer range partitioning (beta), I propose changing the BigQuery partition_by config argument to accept a dictionary.

This works for date columns:

{{ config(
    materialized = 'incremental',
    partition_by = { 'field': 'my_date_column', 'data_type': 'date' }
) }}

This will compile simply to partition by my_date_column at time of table creation.

As well as timestamp/datetime columns:

{{ config(
    materialized = 'incremental',
    partition_by = { 'field': 'my_ts_column', 'data_type': 'timestamp' }
) }}

This compiles to partition by date(my_ts_column) in create table DDL.

And finally, integer range partitioning:

{{ config(
    materialized = 'incremental',
    partition_by = {
        'field': 'my_int_column',
        'data_type': 'int64',
        'range': { 'start': 0, 'end': 100, 'interval': 10 }
    }
) }}

This compiles to partition by range_bucket(my_int_column, generate_array(0, 100, 10)). It matches the BigQuery API spec in these docs.
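
For illustration, all three cases could be rendered by one macro along these lines (a sketch only; the macro and variable names here are hypothetical, not this PR's code):

{% macro render_partition_by(partition_config) %}
    {%- set field = partition_config.field -%}
    {%- set data_type = partition_config.data_type | lower -%}
    {%- if data_type == 'date' -%}
        partition by {{ field }}
    {%- elif data_type in ('timestamp', 'datetime') -%}
        partition by date({{ field }})
    {%- elif data_type == 'int64' -%}
        {%- set r = partition_config.range -%}
        partition by range_bucket(
            {{ field }},
            generate_array({{ r.start }}, {{ r.end }}, {{ r.interval }})
        )
    {%- endif -%}
{% endmacro %}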

Next steps

Deprecation/migration: I created an adapter method, parse_partition_by, that accepts whatever the user has supplied to config.partition_by and checks that it's a dictionary (new spec), raises an error (if supplied a string containing range_bucket), or raises a deprecation warning and attempts to parse it into a dictionary (if date or timestamp). I drafted deprecation and error warnings and would appreciate guidance on wording. Both have unresolved # TODOs, since I'll need to write and link to updated docs.

Other consideration: I made some updates to the is_replaceable BigQuery adapter method, such that it now supports range_partitioning as well as time_partitioning.

There are some failing tests that I need to investigate.

[Updated 20 Jan]

@clausherther (Contributor) commented

Very excited about this! Do you know if this will allow us to run this incremental strategy on tables that have partition filter requirements?
We ran into an issue trying to do incremental runs on tables we set to require partition filters (from within dbt) and have so far had to resort to disabling the filter requirement. Even when I manually wrote the merge statement, I couldn't figure out where to tell it to filter the destination table's partitions.

https://getdbt.slack.com/archives/C2JRRQDTL/p1579289882011400

@jtcohen6 (Contributor, Author) commented

@clausherther Yes! These merge statements will work for tables that have require_partition_filter = true.

Out of curiosity, how are you configuring that flag with dbt? I don't think it's explicitly supported as a config arg (yet). IMO it also breaks some surprising things, like schema tests, though rightly so to avoid scanning a lot of data.

@clausherther (Contributor) commented

@jtcohen6 we set that option via an alter table... model post_hook (similar to how we currently apply BQ table labels).
With respect to schema tests, we have so far tried overriding the built-in tests to allow for a filter argument. While that technically works, it's not a great option, since you're only testing the schema over the filtered partitions, so something like a unique test wouldn't really be testing the right rows.
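
A minimal sketch of that kind of post-hook, with made-up names (my reading of the pattern described above, not the exact hook we use):

{{ config(
    materialized = 'incremental',
    partition_by = 'my_date_column',
    post_hook = "alter table {{ this }} set options (require_partition_filter = true)"
) }}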

@jtcohen6 (Contributor, Author) commented

@drewbanin @beckjake I'm ready to hand this over for expert appraisal.

  • I drafted a few compilation errors and deprecation warnings, which will need to link to new docs.
  • New adapter method parse_partition_by might benefit from a unit test. It's a lot of if/elif and a little bit of regex.
  • BigQuery now supports "real" temp tables, but only inside of scripts (= several SQL statements, separated by ;, the first of which is declare). Drew and I wondered if, in the future, we might rewrite all BQ temp table calls to occur inside of scripts, and do away with the 12-hour expiration + drop. For now, the way I'm differentiating "scripting" temp tables is an explicit hack.

@jtcohen6 requested a review from beckjake, January 22, 2020
@beckjake (Contributor) left a comment

This looks great! I have a couple of style comments.

Can you add a test that the deprecated version works properly?

The 012_deprecation_tests folder has some tests; TestDeprecations should be a decent example. Set up a test that attempts to use the old behavior and runs with strict=True, which should fail. With strict=False, it should pass and generate a properly partitioned table.

(Resolved inline review threads on plugins/bigquery/dbt/include/bigquery/macros/adapters.sql and core/dbt/context/base.py.)
@beckjake (Contributor) left a comment

lgtm!

@drewbanin changed the base branch from dev/0.15.1 to dev/barbara-gittings, January 27, 2020
@drewbanin (Contributor) left a comment

@jtcohen6 I just changed the base branch to dev/barbara-gittings (0.16.0) - looks like there are some merge conflicts to account for as a result.

I'll do a quick pass here too, but this looks really stellar!

@jtcohen6 (Contributor, Author) commented

The changes are good to merge into dev/barbara-gittings. There is an outstanding TODO, which is to write new docs and link to them in the compilation error + deprecation warning. I'll circle back on that next week.

@hui-zheng commented Jan 30, 2020

Hi @jtcohen6 and @drewbanin

This is exciting stuff, thank you for sharing it with me. I ran into some critical blockers in our dbt production environment that are related to this issue, and we would like to patch in the fix quickly.

Could I ask a few questions?

  1. Is this PR feature-complete and passing most tests? Could you comment on its readiness? Though it's still on a dev branch, I am willing to take some risk and patch my current dbt with this change to resolve some of our critical challenges.

  2. Do you have a suggestion on the right way to patch 0.15.1 with this PR's fix? Should I just take the feature/cost-effective-bq-incremental branch and merge it into my copy of dbt 0.15.1? Should I run some dbt tests after the merge to make sure nothing in dbt breaks?

  3. Could you clarify whether this PR uses true BigQuery temp tables, as @inve1 suggested, or the 12-hour-expiration + drop approach? I think you are using true temp tables in this PR, but you also mentioned rewriting all temp table calls, so I just want to confirm.

  4. Does this PR implement first-class support for a BQ insert_overwrite strategy? I assume that strategy would further reduce query cost and improve merge speed in the applicable scenarios.

  5. In some scenarios, it's common for incremental models to overwrite the data for a given datetime range (e.g. the past 24 hours), where the partition range to overwrite is known at SQL compile time. Is there a nice way for dbt to support a partitioned merge in that case?

For dbt-labs/dbt-bigquery#5, as @inve1 mentioned:

I'm looking at creating incremental models that look at something like the last 3 days of data and just overwrite those partitions at an interval. In that case, I wouldn't need to figure out what the partition range is, it could be generated at compile time. I don't really have a suggestion here, but do you think this use case could be supported in a nice way after the change here? maybe allow passing dest_partition from the model?

Thank you

@drewbanin (Contributor) left a comment

This is looking really good! I left some comments here that I'd love to catch up with you about IRL!

I also cooked up a docs link as a placeholder, so we can merge this thing as soon as it's ready :)

dbt inferred: {inferred_partition_by}

For more information, see:
[dbt docs link]

I made a placeholder guide here, to be completed before 0.16.0 is released. Can you add this link accordingly? https://docs.getdbt.com/docs/upgrading-to-0-16-0

{% endif %}

{# BigQuery only #}
{%- if partition_by -%}

I don't love that we're encoding this BigQuery specific logic in the common merge implementation. Further, the pprint_partition_field macro used below is only defined in the BigQuery plugin. This probably won't be an issue, as the Snowflake code path won't be providing a partition_by arg here, but it does feel out of place to me.

I'd be comfortable just leaving the existing common_get_merge_sql as-is and instead baking this logic directly into the BigQuery implementation of get_merge_sql:

https://github.com/fishtown-analytics/dbt/blob/e080bfc79ab11a4d145de4c8da9d955c5e136d92/plugins/bigquery/dbt/include/bigquery/macros/materializations/merge.sql#L1-L3

Alternatively, we could make this a little bit more generic and change the partition_by arg for this macro to predicates (or similar). The BigQuery implementation of get_merge_sql could supply a list of predicates to this macro, but other implementations (like Snowflake) could provide an empty list.

Either way, let's move the partitioning-specific logic out of this macro.
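
As a rough illustration of that predicates idea (hypothetical code, not the macro as written in this PR):

{% macro common_get_merge_sql(target, source, unique_key, dest_columns, predicates=none) -%}
    {%- set predicates = [] if predicates is none else predicates -%}
    {%- if unique_key -%}
        {%- do predicates.append(
            'DBT_INTERNAL_SOURCE.' ~ unique_key ~ ' = DBT_INTERNAL_DEST.' ~ unique_key
        ) -%}
    {%- endif -%}
    {%- set dest_cols_csv = dest_columns | map(attribute='name') | join(', ') -%}

    merge into {{ target }} as DBT_INTERNAL_DEST
    using {{ source }} as DBT_INTERNAL_SOURCE
    on
    {%- if predicates | length == 0 %}
        FALSE
    {%- else %}
        {{ predicates | join(' and ') }}
    {%- endif %}

    when matched then update set
        {% for column in dest_columns -%}
            {{ column.name }} = DBT_INTERNAL_SOURCE.{{ column.name }}
            {%- if not loop.last %}, {% endif %}
        {%- endfor %}

    when not matched then insert
        ({{ dest_cols_csv }})
    values
        ({{ dest_cols_csv }})
{%- endmacro %}

The BigQuery implementation of get_merge_sql could then build the partition-range condition and pass it in via predicates, while other implementations (like Snowflake) pass nothing and keep today's behavior.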

merge into {{ target }} as DBT_INTERNAL_DEST
using {{ source }} as DBT_INTERNAL_SOURCE
on
{% if conditions|length == 0 %}

This is a slick way of doing this 👍

if 'range_bucket' in raw_partition_by.lower():
dbt.exceptions.CompilerException('''
BigQuery integer range partitioning (beta) is supported \
by the new `partition_by` config, which accepts a \

BigQuery integer range partitioning (beta) is only supported
with the dictionary format of the partition_by config.

More information: [Link from above]

^ I like using modifiers like only, as they help clarify the nature of these validation errors. These error messages should:

  • Explain what went wrong
  • Explain how to fix it

Also, this error message will invariably stick around for longer than we intend, so I try to avoid temporal words like "new", as things like that tend to cause chuckles a couple of years down the line :p

dictionary. See: [dbt docs link TK]
''') # TODO
else:
p = re.compile(

This regex is a little too complicated for my liking... can we try something a little simpler instead? How about we try something like:

partition_by = raw_partition_by.strip()
if partition_by.lower().startswith('date('):
  match = re.match(r'date\((.*)\)', partition_by, re.IGNORECASE)
  partition_by = match.group(1)
  data_type = 'date'
else:
  data_type = 'timestamp'

inferred_partition_by = ...

Let me know if I'm missing something important in the logic here!

{%- if partition_by_type in ('date','timestamp','datetime') -%}
partition by {{pprint_partition_field(partition_by_dict)}}
{%- elif partition_by_type in ('int64') -%}
{%- set pbr = partition_by_dict.range -%}

wouldn't mind a pbr

@@ -62,18 +77,25 @@
{%- set raw_kms_key_name = config.get('kms_key_name', none) -%}
{%- set raw_labels = config.get('labels', []) -%}
{%- set sql_header = config.get('sql_header', none) -%}

{%- set partition_by_dict = adapter.parse_partition_by(raw_partition_by) -%}
{%- set is_scripting = (temporary == 'scripting') -%} {# "true" temp tables only possible when scripting #}

Can you just remind me where we ended up on this? BigQuery temp tables can't be used outside of scripting?

@jtcohen6 (Contributor, Author) commented

Closing in favor of #2140

@elyobo commented Oct 29, 2022

Would there be any interest in something like this, so that we can get some improvement for integer keys that otherwise exceed the BQ script memory limits (e.g. dbt-labs/dbt-adapters#605)? It extends the BQ implementation of insert_overwrite to support the behaviour (as I understand it) described above, but as an optional change; the regular insert_overwrite behaviour is still available too, as it's more efficient when it does work.
