Feature/conversions #13

Merged
merged 27 commits into from
Oct 24, 2024
Commits
4e58854
add under the hood updates
fivetran-reneeli Oct 10, 2024
626ed94
add conversion metrics to end models
fivetran-reneeli Oct 11, 2024
9b9d2f5
docs and add passthrough
fivetran-reneeli Oct 11, 2024
b66f672
update passthrough
fivetran-reneeli Oct 12, 2024
12b8c3a
changelog and versioining
fivetran-reneeli Oct 14, 2024
6fad923
readme and docs
fivetran-reneeli Oct 15, 2024
b2c822c
add decisionlog about discrepancies across different grains
fivetran-reneeli Oct 15, 2024
995b7f4
changelog
fivetran-reneeli Oct 15, 2024
c916ac8
realized we can remove total_items from the transforms
fivetran-reneeli Oct 15, 2024
3aeeef6
add validation tests
fivetran-reneeli Oct 15, 2024
a657d45
fix
fivetran-reneeli Oct 15, 2024
197bae0
new schema
fivetran-reneeli Oct 15, 2024
68571f0
update config
fivetran-reneeli Oct 15, 2024
fbd66cb
change schema for all
fivetran-reneeli Oct 15, 2024
63cb8c7
rm horizontal conversions test
fivetran-reneeli Oct 16, 2024
ae9f8ec
add decision log entry for different grains
fivetran-reneeli Oct 21, 2024
34f0084
send to bk
fivetran-jamie Oct 23, 2024
d4d966d
Validation tests passing
fivetran-jamie Oct 23, 2024
f656fb8
tests
fivetran-jamie Oct 23, 2024
ce87da9
polishing
fivetran-jamie Oct 23, 2024
16c2387
buildkite
fivetran-jamie Oct 23, 2024
34bad83
seed data type
fivetran-jamie Oct 24, 2024
6fb9102
fix run steps
fivetran-jamie Oct 24, 2024
06bd8d8
changelog
fivetran-jamie Oct 24, 2024
80a676b
reword
fivetran-jamie Oct 24, 2024
a3aad23
joe feedback
fivetran-jamie Oct 24, 2024
325826c
source package is up on hub
fivetran-jamie Oct 24, 2024
2 changes: 2 additions & 0 deletions .buildkite/scripts/run_models.sh
@@ -19,4 +19,6 @@ dbt deps
dbt seed --target "$db" --full-refresh
dbt run --target "$db" --full-refresh
dbt test --target "$db"
dbt run --target "$db" --vars '{reddit_ads__conversion_event_types: []}' --full-refresh
dbt test --target "$db"
dbt run-operation fivetran_utils.drop_schemas_automation --target "$db"
30 changes: 30 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,33 @@
# dbt_reddit_ads v0.3.0
[PR #13](https://github.com/fivetran/dbt_reddit_ads/pull/13) includes the following **BREAKING CHANGE** updates:

## Features: Conversion Metrics
- Introduces the following conversion fields to the Reddit Ads `reddit_ads__<entity>_report` models:
  - `conversions` (aliased from `click_through_conversion_attribution_window_month`): Total attributed click-through conversions for the given month-long window.
  - `view_through_conversions` (aliased from `view_through_conversion_attribution_window_month`): Total attributed view-through conversions for the given month-long window.
  - `total_value`: Total monetary value associated with a conversion event.
  - `total_items`: Total number of items involved in a conversion event.
- Introduces the `reddit_ads__<entity>_conversions_passthrough_metrics` variables to pass through additional fields from the source `*_conversions_report` tables. We use the maximum attribution window when considering conversions and therefore retrieve conversion metrics from the `click_through_conversion_attribution_window_month` (`conversions`) and `view_through_conversion_attribution_window_month` (`view_through_conversions`) fields of the respective source tables. For information on how to configure these variables to bring in additional windows and fields, refer to the [README](https://github.com/fivetran/dbt_reddit_ads/tree/main?tab=readme-ov-file#passing-through-additional-metrics).
- Introduces the `reddit_ads__conversion_event_types` variable to note which kinds of events should be considered conversions (and therefore be surfaced in conversion metrics). By default, this package considers `purchase`, `lead`, and `custom` events to be conversions. See [README](https://github.com/fivetran/dbt_reddit_ads/tree/main?tab=readme-ov-file#configure-conversion-event-types) for details on how to adjust this.

## Upstream Source Package Updates
To support the addition of conversion metrics here, [v0.3.0](https://github.com/fivetran/dbt_reddit_ads_source/releases/tag/v0.3.0) of `reddit_ads_source` included the following update:
- Introduces 4 new staging models to bring in conversion metrics (click-through conversions, view-through conversions, total value, and total items) across different dimensions:
  - `stg_reddit_ads__account_conversions_report`
  - `stg_reddit_ads__ad_group_conversions_report`
  - `stg_reddit_ads__ad_conversions_report`
  - `stg_reddit_ads__campaign_conversions_report`
> Note: If you would like to include conversion metrics, please ensure you have the `account_conversions_report`, `ad_group_conversions_report`, `ad_conversions_report`, and `campaign_conversions_report` source tables syncing in your Reddit Ads connector(s). Otherwise, the package will run successfully but produce `null` conversion metric values.

## Under the Hood
- Coalesces each pre-existing metric (i.e., `clicks`, `impressions`, and `spend`) with `0` to avoid the complications of `null` in aggregations.
- Adds the respective seed data for the new models in addition to updating relevant documentation.
- Adds documentation explaining potential discrepancies across reporting grains.
- Adds new Buildkite run step to test different configurations of the `reddit_ads__conversion_event_types` variable.

## Contributors
- [Seer Interactive](https://www.seerinteractive.com/?utm_campaign=Fivetran%20%7C%20Models&utm_source=Fivetran&utm_medium=Fivetran%20Documentation)

# dbt_reddit_ads v0.2.1

[PR #8](https://github.com/fivetran/dbt_reddit_ads/pull/8) includes the following updates:
4 changes: 4 additions & 0 deletions DECISIONLOG.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
## Why don't metrics add up across different grains (e.g., ad level vs. campaign level)?
When aggregating metrics like clicks and spend across different grains, discrepancies can arise due to differences in how data is captured, grouped, or attributed at each grain. For example, certain actions or costs might be attributed differently at the ad, ad group, or campaign level, leading to inconsistencies when rolled up. At the keyword grain, where a keyword can belong to multiple ad groups, aggregations can lead to overcounting. Conversely, some ads may only be represented at the ad group level rather than at the individual ad level, leading to undercounting at the ad grain.

This is one reason we have broken out the ad reporting packages into separate hierarchical end models (Ad, Ad Group, Campaign, and more): if we only used ad-level reports, we could be missing data.
52 changes: 44 additions & 8 deletions README.md
@@ -50,14 +50,17 @@ dispatch:
search_order: ['spark_utils', 'dbt_utils']
```

### Step 2: Install the package
Include the following reddit_ads package version in your `packages.yml` file:
### Step 2: Install the package (skip if also using the `ad_reporting` combo package)
If you are not using the downstream [Ad Reporting](https://github.com/fivetran/dbt_ad_reporting) combination package, include the following reddit_ads package version in your `packages.yml` file:
> TIP: Check [dbt Hub](https://hub.getdbt.com/) for the latest installation instructions or [read the dbt docs](https://docs.getdbt.com/docs/package-management) for more information on installing packages.
```yaml
packages:
- package: fivetran/reddit_ads
version: [">=0.2.0", "<0.3.0"]
version: [">=0.3.0", "<0.4.0"]
```

Do NOT include the `reddit_ads_source` package in this file. The transformation package itself has a dependency on it and will install the source package as well.

### Step 3: Define database and schema variables
By default, this package runs using your destination and the `reddit_ads` schema. If this is not where your Reddit Ads data is (for example, if your Reddit Ads schema is named `reddit_ads_fivetran`), add the following configuration to your root `dbt_project.yml` file:

@@ -68,6 +71,8 @@ vars:
```

### (Optional) Step 4: Additional configurations
<details open><summary>Expand/Collapse details</summary>

#### Union multiple connectors
If you have multiple reddit_ads connectors in Fivetran and would like to use this package on all of them simultaneously, we have provided functionality to do so. The package will union all of the data together and pass the unioned table into the transformations. You will be able to see which source it came from in the `source_relation` column of each model. To use this functionality, you will need to set either the `reddit_ads_union_schemas` OR `reddit_ads_union_databases` variables (cannot do both) in your root `dbt_project.yml` file:

@@ -80,10 +85,33 @@ vars:

To connect your multiple schema/database sources to the package models, follow the steps outlined in the [Union Data Defined Sources Configuration](https://github.com/fivetran/dbt_fivetran_utils/tree/releases/v0.4.latest#union_data-source) section of the Fivetran Utils documentation for the union_data macro. This will ensure a proper configuration and correct visualization of connections in the DAG.

#### Configure Conversion Event Types
By default, this package considers `purchase`, `lead`, and `custom` events from the `*_conversions_report` source tables to be conversions. This means that the package will only report values for conversion metrics (`conversions`, `total_items`, `total_value`, and `view_through_conversions`) for these 3 event types.

If you would like to adjust this so that the package reports conversions related to other types of [events](https://business.reddithelp.com/s/article/supported-conversion-events), or a subset of the default ones chosen, configure the `reddit_ads__conversion_event_types` variable:

```yml
vars:
  reddit_ads__conversion_event_types:
    - 'lead'
    - 'search'
    - 'sign_up'
    - 'purchase'
    - 'page_visit'
    - 'add_to_cart'
    - 'view_content'
    - 'custom_event_<1-20>' # individual custom events
    - 'custom' # AGGREGATION of all individual custom events = custom_event_1 + ... + custom_event_20
```

> Note: Please ensure due diligence when selecting conversion events, as some may overlap and introduce double-counted metrics if used together. For example, the `custom` event encapsulates all individual `custom_event_<1-20>` events.

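As a minimal sketch of narrowing the defaults, the configuration below would restrict conversion metrics to `purchase` and `lead` events and leave out `custom`. The choice of events is purely illustrative, and the snippet assumes the variable is set in your root `dbt_project.yml`, like the other variables in this README:

```yml
# root dbt_project.yml
# Illustrative subset only; event names are taken from the list above.
vars:
  reddit_ads__conversion_event_types:
    - 'purchase'
    - 'lead'
```

Because only non-overlapping event types are listed here, this selection avoids the double counting described in the note above.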
#### Passing Through Additional Metrics
By default, this package will select `clicks`, `impressions`, and `spend` from the source reporting tables to store into the staging models. If you would like to pass through additional metrics to the staging models, add the following configurations to your `dbt_project.yml` file. These variables allow the pass-through fields to be aliased (`alias`) if desired, but not required. Use the following format for declaring the respective pass-through variables:
By default, this package will select `clicks`, `impressions`, `spend`, `conversions` (click_through_conversion_attribution_window_month), `view_through_conversions` (view_through_conversion_attribution_window_month), `total_items`, and `total_value` from the source reporting tables to store into the staging models. Note that we choose the maximum attribution window for counting conversions.

If you would like to pass through additional metrics to the staging models (for example, different attribution windows for conversions such as `view_through_conversion_attribution_window_week`), add the following configurations to your `dbt_project.yml` file. Each pass-through field can optionally be given an alias (`alias`). Use the following format for declaring the respective pass-through variables:

> **NOTE** Ensure you exercised due diligence when adding metrics to these models. The metrics added by default (clicks, impressions, and cost) have been vetted by the Fivetran team maintaining this package for accuracy. There are metrics included within the source reports, for example, metric averages, which may be inaccurately represented at the grain for reports created in this package. You want to ensure whichever metrics you pass through are indeed appropriate to aggregate at the respective reporting levels provided in this package. Note that the aggregation we use for our reporting is `sum`.
> **NOTE** Exercise due diligence when adding metrics to these models. The metrics added by default (clicks, impressions, spend, conversions, view-through conversions, total items, and total value) have been vetted by the Fivetran team maintaining this package for accuracy. Some metrics in the source reports, such as metric averages, may be inaccurately represented at the grain of the reports created in this package. Ensure that whichever metrics you pass through are appropriate to aggregate at the respective reporting levels provided in this package. Note that the aggregation we use for our reporting is `sum`.

```yml
vars:
@@ -100,7 +128,7 @@ vars:
- name: "a_second_field"
```
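For the conversions case specifically, a sketch of bringing in the week-long view-through window at the ad grain might look like the following. The variable name comes from this package's `dbt_project.yml` and the field name from the example mentioned above; the alias `view_through_conversions_week` is a hypothetical name chosen only for illustration.

```yml
# root dbt_project.yml
vars:
  reddit_ads__ad_conversions_passthrough_metrics:
    - name: "view_through_conversion_attribution_window_week"
      alias: "view_through_conversions_week" # hypothetical alias; optional
```

Since the package aggregates pass-through metrics with `sum`, a count-style field such as an attribution-window conversion total is a reasonable candidate, whereas an average would not be.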
#### Change the build schema
By default, this package builds the Reddit Ads staging models within a schema titled (`<target_schema>` + `_reddit_ads_source`) and your Reddit Ads modeling models within a schema titled (`<target_schema>` + `_reddit_ads`) in your destination. If this is not where you would like your Reddit Ads data to be written to, add the following configuration to your root `dbt_project.yml` file:
By default, this package builds the Reddit Ads staging models (12 views, 12 tables) within a schema titled (`<target_schema>` + `_reddit_ads_source`) and your Reddit Ads modeling models (5 tables) within a schema titled (`<target_schema>` + `_reddit_ads`) in your destination. If this is not where you would like your Reddit Ads data to be written to, add the following configuration to your root `dbt_project.yml` file:

```yml
models:
@@ -111,7 +139,7 @@ models:
```

#### Change the source table references
If an individual source table has a different name than the package expects, add the table name as it appears in your destination to the respective variable:
If an individual source table has a different name than the package expects, add the table name as it appears in your destination to the respective variable. This is not available when running the package on multiple unioned connectors.

> IMPORTANT: See this project's [`dbt_project.yml`](https://github.com/fivetran/dbt_reddit_ads_source/blob/main/dbt_project.yml) variable declarations to see the expected names.

@@ -120,6 +148,8 @@ vars:
reddit_ads_<default_source_table_name>_identifier: your_table_name
```

</details>

### (Optional) Step 5: Orchestrate your models with Fivetran Transformations for dbt Core™
<details><summary>Expand for more details</summary>

@@ -134,7 +164,7 @@ This dbt package is dependent on the following dbt packages. These dependencies
```yml
packages:
- package: fivetran/reddit_ads_source
version: [">=0.2.0", "<0.3.0"]
version: [">=0.3.0", "<0.4.0"]

- package: fivetran/fivetran_utils
version: [">=0.4.0", "<0.5.0"]
@@ -145,6 +175,7 @@ packages:
- package: dbt-labs/spark_utils
version: [">=0.3.0", "<0.4.0"]
```

## How is this package maintained and can I contribute?
### Package Maintenance
The Fivetran team maintaining this package _only_ maintains the latest version of the package. We highly recommend you stay consistent with the [latest version](https://hub.getdbt.com/fivetran/reddit_ads/latest/) of the package and refer to the [CHANGELOG](https://github.com/fivetran/dbt_reddit_ads/blob/main/CHANGELOG.md) and release notes for more information on changes across versions.
@@ -154,6 +185,11 @@ A small team of analytics engineers at Fivetran develops these dbt packages. How

We highly encourage and welcome contributions to this package. Check out [this dbt Discourse article](https://discourse.getdbt.com/t/contributing-to-a-dbt-package/657) on the best workflow for contributing to a package.

#### Contributors
We thank [everyone](https://github.com/fivetran/dbt_reddit_ads/graphs/contributors) who has taken the time to contribute. Each PR, bug report, and feature request has made this package better and is truly appreciated.

A special thank you to [Seer Interactive](https://www.seerinteractive.com/?utm_campaign=Fivetran%20%7C%20Models&utm_source=Fivetran&utm_medium=Fivetran%20Documentation), with whom we collaborated closely to introduce native conversion support to our Ad packages.

## Are there any resources available?
- If you have questions or want to reach out for help, see the [GitHub Issue](https://github.com/fivetran/dbt_reddit_ads/issues/new/choose) section to find the right avenue of support for you.
- If you would like to provide feedback to the dbt package team at Fivetran or would like to request a new dbt package, fill out our [Feedback Form](https://www.surveymonkey.com/r/DQ7K7WW).
19 changes: 16 additions & 3 deletions dbt_project.yml
@@ -1,5 +1,5 @@
name: 'reddit_ads'
version: '0.2.1'
version: '0.3.0'
config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
vars:
@@ -10,13 +10,26 @@ vars:
ad_group_daily_report: "{{ ref('stg_reddit_ads__ad_group_report') }}"
ad: "{{ ref('stg_reddit_ads__ad') }}"
ad_daily_report: "{{ ref('stg_reddit_ads__ad_report') }}"
campaign: "{{ ref('stg_reddit_ads__campaign') }}"
campaign: "{{ ref('stg_reddit_ads__campaign') }}"
campaign_daily_report: "{{ ref('stg_reddit_ads__campaign_report') }}"
post_daily_report: "{{ ref('stg_reddit_ads__post_report') }}"
account_conversions_report: "{{ ref('stg_reddit_ads__account_conversions_report') }}"
ad_group_conversions_report: "{{ ref('stg_reddit_ads__ad_group_conversions_report') }}"
ad_conversions_report: "{{ ref('stg_reddit_ads__ad_conversions_report') }}"
campaign_conversions_report: "{{ ref('stg_reddit_ads__campaign_conversions_report') }}"

reddit_ads__account_passthrough_metrics: []
reddit_ads__ad_group_passthrough_metrics: []
reddit_ads__ad_passthrough_metrics: []
reddit_ads__campaign_passthrough_metrics: []
reddit_ads__account_conversions_passthrough_metrics: []
reddit_ads__ad_group_conversions_passthrough_metrics: []
reddit_ads__ad_conversions_passthrough_metrics: []
reddit_ads__campaign_conversions_passthrough_metrics: []

reddit_ads__conversion_event_types:
- 'lead'
- 'purchase'
- 'custom'

models:
reddit_ads:
2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.

1 change: 0 additions & 1 deletion docs/run_results.json

This file was deleted.

10 changes: 5 additions & 5 deletions integration_tests/ci/sample.profiles.yml
@@ -16,13 +16,13 @@ integration_tests:
pass: "{{ env_var('CI_REDSHIFT_DBT_PASS') }}"
dbname: "{{ env_var('CI_REDSHIFT_DBT_DBNAME') }}"
port: 5439
schema: reddit_ads_integration_tests_3
schema: reddit_ads_integration_tests_4
threads: 8
bigquery:
type: bigquery
method: service-account-json
project: 'dbt-package-testing'
schema: reddit_ads_integration_tests_3
schema: reddit_ads_integration_tests_4
threads: 8
keyfile_json: "{{ env_var('GCLOUD_SERVICE_KEY') | as_native }}"
snowflake:
@@ -33,7 +33,7 @@ integration_tests:
role: "{{ env_var('CI_SNOWFLAKE_DBT_ROLE') }}"
database: "{{ env_var('CI_SNOWFLAKE_DBT_DATABASE') }}"
warehouse: "{{ env_var('CI_SNOWFLAKE_DBT_WAREHOUSE') }}"
schema: reddit_ads_integration_tests_3
schema: reddit_ads_integration_tests_4
threads: 8
postgres:
type: postgres
@@ -42,13 +42,13 @@ integration_tests:
pass: "{{ env_var('CI_POSTGRES_DBT_PASS') }}"
dbname: "{{ env_var('CI_POSTGRES_DBT_DBNAME') }}"
port: 5432
schema: reddit_ads_integration_tests_3
schema: reddit_ads_integration_tests_4
threads: 8
databricks:
catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_DBT_HTTP_PATH') }}"
schema: reddit_ads_integration_tests_3
schema: reddit_ads_integration_tests_4
threads: 8
token: "{{ env_var('CI_DATABRICKS_DBT_TOKEN') }}"
type: databricks