Feature/historical schedules #171

Merged
merged 82 commits into from
Oct 10, 2024
Changes from 76 commits
605a7fb
send to bk
fivetran-jamie Jun 3, 2024
c58b4b6
try this out
fivetran-jamie Jun 3, 2024
5f2113b
first try
fivetran-jamie Jun 3, 2024
662f3f2
use mcro
fivetran-jamie Jun 3, 2024
5bfead9
Testing
fivetran-jamie Jun 3, 2024
d36dc49
puhs
fivetran-jamie Jun 3, 2024
700589f
redshift
fivetran-jamie Jun 3, 2024
f6ada0c
try nullif
fivetran-jamie Jun 4, 2024
f07271c
feature/historical-schedules
fivetran-catfritz Sep 11, 2024
f15b4b1
Merge branch 'feature/historical-schedules' into explore/audit-log-spike
fivetran-catfritz Sep 11, 2024
fd53978
Merge pull request #170 from fivetran/explore/audit-log-spike
fivetran-catfritz Sep 11, 2024
cf89429
rework schedule_history
fivetran-catfritz Sep 12, 2024
54cdf21
update unnest logic
fivetran-catfritz Sep 13, 2024
505f343
update schedule_history
fivetran-catfritz Sep 23, 2024
4ca4099
complete schedule_history
fivetran-catfritz Sep 23, 2024
7806d83
revise holidays
fivetran-catfritz Sep 26, 2024
a54bee2
add macro
fivetran-catfritz Sep 26, 2024
0477543
typeo
fivetran-catfritz Sep 26, 2024
f980433
updates
fivetran-catfritz Sep 26, 2024
016e449
updates
fivetran-catfritz Sep 26, 2024
45fbcfe
updates
fivetran-catfritz Sep 26, 2024
3907813
updates
fivetran-catfritz Sep 26, 2024
8b32c45
try out in buildkite
fivetran-jamie Sep 27, 2024
9c49399
remove schedule days during holiday
fivetran-catfritz Sep 27, 2024
e4bf09f
revise
fivetran-catfritz Sep 27, 2024
4523d2c
revise
fivetran-catfritz Sep 27, 2024
cceae3f
add config
fivetran-catfritz Sep 27, 2024
62c5fa9
add config
fivetran-catfritz Sep 27, 2024
1db42a7
allow disable holidays
fivetran-catfritz Sep 27, 2024
f730908
streamline bk run
fivetran-jamie Sep 27, 2024
ee99616
adjust for multiple holidays in a week
fivetran-catfritz Sep 27, 2024
4f395de
add casting
fivetran-catfritz Sep 27, 2024
8613f8a
add casting
fivetran-catfritz Sep 27, 2024
f762ca1
fixes
fivetran-catfritz Sep 27, 2024
00a98cb
fixes
fivetran-catfritz Sep 27, 2024
96be5ae
fixes
fivetran-catfritz Sep 27, 2024
7517057
fixes
fivetran-catfritz Sep 27, 2024
184e639
fixes
fivetran-catfritz Sep 28, 2024
c594f3f
fix multiyear schedules
fivetran-catfritz Sep 28, 2024
e613480
add longer holiday support
fivetran-catfritz Sep 30, 2024
a49f0d6
revert multiweek
fivetran-catfritz Oct 1, 2024
e1f9d32
adjust multiweek
fivetran-catfritz Oct 1, 2024
98e6872
account for non sunday week starts
fivetran-catfritz Oct 2, 2024
0321581
updates
fivetran-catfritz Oct 2, 2024
6163942
update weeks spanned calc
fivetran-catfritz Oct 2, 2024
c774f6d
update naming
fivetran-catfritz Oct 2, 2024
b8a9f4a
update to dbt_date
fivetran-catfritz Oct 3, 2024
cfcd106
update comments
fivetran-catfritz Oct 3, 2024
781f4fd
update to dbt date weekstart
fivetran-catfritz Oct 3, 2024
28457e8
modernize calendar spine
fivetran-catfritz Oct 3, 2024
c117df9
make sure we're working with strings when replacing
fivetran-jamie Oct 3, 2024
933d62e
let's see if bk works
fivetran-jamie Oct 4, 2024
0aacff4
postgres?
fivetran-jamie Oct 4, 2024
e39919e
i think schedule history may actually be workingggg
fivetran-catfritz Oct 4, 2024
8a9d84d
get ready to merge into catherines branch
fivetran-jamie Oct 4, 2024
f691904
Merge branch 'feature/historical-schedules' into feature/historical-s…
fivetran-jamie Oct 4, 2024
b3c97ab
postgres revert
fivetran-jamie Oct 4, 2024
94b11a2
Merge branch 'feature/historical-schedules-jamie-redshift' of https:/…
fivetran-jamie Oct 4, 2024
eff6401
add comments
fivetran-catfritz Oct 4, 2024
763ca33
redshift fixes
fivetran-catfritz Oct 4, 2024
238bd96
redshift fixes
fivetran-catfritz Oct 4, 2024
4f57203
redshift fixes
fivetran-catfritz Oct 4, 2024
a6f3536
Merge branch 'feature/historical-schedules' into feature/historical-s…
fivetran-catfritz Oct 4, 2024
e342c79
Merge pull request #172 from fivetran/feature/historical-schedules-ja…
fivetran-catfritz Oct 4, 2024
5c720b5
updates
fivetran-catfritz Oct 4, 2024
6d8ff79
validation update
fivetran-catfritz Oct 6, 2024
c100e51
split models
fivetran-catfritz Oct 6, 2024
56bb954
update ymls
fivetran-catfritz Oct 7, 2024
c4929cf
revise scchedule groups
fivetran-catfritz Oct 8, 2024
1d1b2e1
updates
fivetran-catfritz Oct 9, 2024
05fefd5
add inline comments
fivetran-catfritz Oct 9, 2024
d97f714
update decision log
fivetran-catfritz Oct 9, 2024
6573314
update changelog
fivetran-catfritz Oct 9, 2024
0b0a05d
regen docs
fivetran-catfritz Oct 9, 2024
246624b
fix yml
fivetran-catfritz Oct 9, 2024
888f109
update changelog
fivetran-catfritz Oct 9, 2024
af951b1
address review comments and regen docs
fivetran-catfritz Oct 9, 2024
855dc77
address review comments
fivetran-catfritz Oct 9, 2024
73845a9
Apply suggestions from code review
fivetran-catfritz Oct 10, 2024
fd879c5
address review comments
fivetran-catfritz Oct 10, 2024
360911f
Update packages.yml
fivetran-catfritz Oct 10, 2024
56cd177
release review updates
fivetran-catfritz Oct 10, 2024
6 changes: 4 additions & 2 deletions .buildkite/scripts/run_models.sh
Original file line number Diff line number Diff line change
@@ -18,8 +18,10 @@ cd integration_tests
dbt deps
dbt seed --target "$db" --full-refresh
dbt run --target "$db" --full-refresh
dbt run --target "$db"
dbt test --target "$db"
dbt run --vars '{zendesk__unstructured_enabled: true, using_schedules: false, using_domain_names: false, using_user_tags: false, using_ticket_form_history: false, using_organization_tags: false}' --target "$db" --full-refresh
dbt run --vars '{zendesk__unstructured_enabled: true, using_schedules: false, using_schedule_histories: false, using_domain_names: false, using_user_tags: false, using_ticket_form_history: false, using_organization_tags: false}' --target "$db" --full-refresh
dbt run --vars '{zendesk__unstructured_enabled: true, using_schedules: false, using_schedule_histories: false, using_domain_names: false, using_user_tags: false, using_ticket_form_history: false, using_organization_tags: false}' --target "$db"
dbt test --target "$db"

dbt run-operation fivetran_utils.drop_schemas_automation --target "$db"
# dbt run-operation fivetran_utils.drop_schemas_automation --target "$db"
5 changes: 4 additions & 1 deletion .quickstart/quickstart.yml
@@ -6,7 +6,6 @@ dbt_versions: ">=1.3.0 <2.0.0"
table_variables:
using_schedules:
- daylight_time
- schedule_holiday
- schedule
- time_zone
using_domain_names:
@@ -17,6 +16,10 @@ table_variables:
- ticket_form_history
using_organization_tags:
- organization_tag
using_schedule_histories:
- audit_log
using_holidays:
- schedule_holiday

destination_configurations:
databricks:
36 changes: 35 additions & 1 deletion CHANGELOG.md
@@ -1,5 +1,39 @@
# dbt_zendesk v0.17.0
# dbt_zendesk v0.18.0
[PR #171](https://github.com/fivetran/dbt_zendesk/pull/171) includes the following changes:

## Breaking Changes (Full refresh required after upgrading)
### Schedule Change Support
- Support for schedule changes has been added:
- Schedule changes are now extracted directly from the audit log, providing a view of schedule modifications over time.
- This feature is enabled by default, but can be easily turned off by setting `using_schedule_histories` to `false` in `dbt_project.yml`.
- The `int_zendesk__schedule_spine` model is now enhanced to incorporate these schedule changes, making it possible for downstream models to reflect the most up-to-date schedule data.
- This improves granularity for Zendesk metrics related to agent availability, SLA tracking, and time-based performance analysis, allowing for more accurate reporting.
### dbt_zendesk_source changes (see the [Release Notes](https://github.com/fivetran/dbt_zendesk_source/releases/tag/v0.13.0) for more details)
- Added the `stg_zendesk__audit_log` table for capturing schedule changes. This is disabled when setting `using_schedule_histories` to `false` in `dbt_project.yml`.

## New Features
- Holiday support: Users can now choose to disable holiday tracking by setting `using_holidays` to `false` in `dbt_project.yml`.
- New intermediate models have been introduced to streamline both the readability and maintainability:
- `int_zendesk__timezone_daylight`: A utility model that maintains a record of daylight savings adjustments for each time zone.
- `int_zendesk__schedule_history`: Captures a full history of schedule changes for each `schedule_id`.
- `int_zendesk__schedule_timezones`: Merges schedule history with time zone shifts.
- `int_zendesk__schedule_holidays`: Identifies and calculates holiday periods for each schedule.
Contributor:
Can you mention the materialization of each of these so the customer knows the impact (if any) on their destination.

Contributor (author):
Updated.
Contributor:
Suggested change:
- `int_zendesk__schedule_holidays`: Identifies and calculates holiday periods for each schedule.
+ `int_zendesk__schedule_holiday`: Identifies and calculates holiday periods for each schedule.

Right?

Contributor (author):
Thanks, updated!

- Rebuilt logic in `int_zendesk__schedule_spine` to consolidate updates from the new intermediate models.
### dbt_zendesk_source changes (see the [Release Notes](https://github.com/fivetran/dbt_zendesk_source/releases/tag/v0.13.0) for more details)
- Updated the `stg_zendesk__schedule_holidays` model to allow users to disable holiday processing by setting `using_holidays` to `false`.
- Added field-level documentation for the `stg_zendesk__audit_log` table.

## Bug Fixes
- Resolved a bug in the `int_zendesk__schedule_spine` model where users experienced large gaps in non-holiday periods. The updated logic addresses this issue.

## Under the Hood
- Replaced instances of `dbt.date_trunc` with `dbt_date.week_start` to standardize week start dates to Sunday across all warehouses, since our schedule logic relies on consistent weeks.
- Replaced the deprecated `dbt.current_timestamp_backcompat()` function with `dbt.current_timestamp()` to ensure all timestamps are captured in UTC.
- Added seed data for `audit_log` to enhance integration testing capabilities.
- Introduced new helper macros, `clean_data` and `regex_extract`, to process complex text of the schedule changes extracted from audit logs.
- Updated `int_zendesk__calendar_spine` logic to prevent errors during compilation before the first full run, ensuring a smoother development experience.

# dbt_zendesk v0.17.0
## New model ([#161](https://github.com/fivetran/dbt_zendesk/pull/161))
- Addition of the `zendesk__document` model, designed to structure Zendesk textual data for vectorization and integration into NLP workflows. The model outputs a table with:
- `document_id`: Corresponding to the `ticket_id`
12 changes: 7 additions & 5 deletions DECISIONLOG.md
@@ -1,12 +1,14 @@
# Decision Log

## Tracking Ticket SLA Policies Into the Future
In our models we generate a future time series for ticket SLA policies. This is limited to a year to maintain performance.
## Schedule History
### Handling Multiple Schedule Changes in a Day
While integrating schedule changes from the audit_log source, we observed that multiple changes can occur on the same day, often when users are still finalizing a schedule. To maintain clarity and align with our day-based downstream logic, we decided to capture only the last change made on any given day. If this approach proves insufficient for your use case, please submit a feature request to enable support for multiple changes within a single day.
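The keep-only-the-last-change-per-day rule can be illustrated with a small Python sketch (the record shape and field names here are assumptions for demonstration, not the model's actual schema):

```python
from datetime import datetime

def last_change_per_day(changes):
    """Keep only the final change recorded for each (schedule_id, date)."""
    latest = {}
    # Process in chronological order so later timestamps overwrite earlier ones.
    for change in sorted(changes, key=lambda c: c['created_at']):
        key = (change['schedule_id'], change['created_at'].date())
        latest[key] = change
    return list(latest.values())

changes = [
    {'schedule_id': '1', 'created_at': datetime(2024, 5, 21, 9, 0), 'change': 'draft'},
    {'schedule_id': '1', 'created_at': datetime(2024, 5, 21, 17, 0), 'change': 'final'},
]
print([c['change'] for c in last_change_per_day(changes)])  # -> ['final']
```

In the actual model this deduplication is done in SQL with a `row_number()` window over `(schedule_id, valid_from)`.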

## No Historical Schedule Reference
At the current moment the Fivetran Zendesk Support connector does not contain historical data of schedules. This means if a schedule is created in the Zendesk Support UI and remains untouched for years, but then is adjusted in the current month you will see the data synced in the raw `schedule` table to reflect the current adjusted schedule. As a result the raw data will lose all historical reference of what this schedule range was previously.
### Backfilling the Schedule History
Although the schedule history extracted from the audit log includes the most recent schedule, we exclude it in the `int_zendesk__schedule_history` model. Instead, we rely on the schedule from `stg_zendesk__schedule`, since it represents the live schedule. This approach also allows users who are not using schedule histories to easily disable the history feature. We join the live schedule with the schedule history model and bridge the valid_from and valid_until dates to maintain consistency.

Therefore, if you are leveraging the `using_schedule` variable as `true` to replicate business hour metrics this data model will only have a reference to the current range of any given schedule. This means tickets from the previous two years that were leveraging the __old__ schedule will not be reported as using the __new__ schedule. If this data limitation is a concern to you, we recommend opening a [Fivetran Support Feature Request](https://support.fivetran.com/hc/en-us/community/topics/360001909373-Feature-Requests?sort_by=votes) to enhance the Zendesk Support connector to include historical schedule data.
## Tracking Ticket SLA Policies Into the Future
In our models we generate a future time series for ticket SLA policies. This is limited to a year to maintain performance.

## Zendesk Support First Reply Time SLA Opinionated Logic
The logic for `first_reply_time` breach/achievement metrics within the `zendesk__ticket_metrics` and `zendesk__sla_policies` models are structured on the Zendesk Support definition of [first reply time SLA events](https://support.zendesk.com/hc/en-us/articles/4408821871642-Understanding-ticket-reply-time?page=2#topic_jvw_nqd_1hb). For example, this data model calculates first reply time to be the duration of time (business or calendar) between the creation of the ticket and the first public comment from either an `agent` or `admin`. This holds true regardless of when the first reply time SLA was applied to the ticket.
4 changes: 2 additions & 2 deletions README.md
@@ -65,7 +65,7 @@ Include the following zendesk package version in your `packages.yml` file:
```yml
packages:
- package: fivetran/zendesk
version: [">=0.17.0", "<0.18.0"]
version: [">=0.18.0", "<0.19.0"]
```
> **Note**: Do not include the Zendesk Support source package. The Zendesk Support transform package already has a dependency on the source in its own `packages.yml` file.
@@ -231,7 +231,7 @@ This dbt package is dependent on the following dbt packages. These dependencies
```yml
packages:
- package: fivetran/zendesk_source
version: [">=0.12.0", "<0.13.0"]
version: [">=0.13.0", "<0.14.0"]
- package: fivetran/fivetran_utils
version: [">=0.4.0", "<0.5.0"]
22 changes: 13 additions & 9 deletions dbt_project.yml
@@ -1,6 +1,5 @@
name: 'zendesk'
version: '0.17.0'

version: '0.18.0'

config-version: 2
require-dbt-version: [">=1.3.0", "<2.0.0"]
@@ -14,6 +13,10 @@ models:
intermediate:
+schema: zendesk_intermediate
+materialized: table
# int_zendesk__schedule_timezones:
# +materialized: ephemeral
# int_zendesk__schedule_holiday:
# +materialized: ephemeral
reply_times:
+materialized: ephemeral
resolution_times:
@@ -33,23 +36,24 @@ vars:
zendesk:
ticket_field_history_columns: ['assignee_id', 'status', 'priority']
ticket_field_history_updater_columns: []
group: "{{ ref('stg_zendesk__group') }}"
audit_log: "{{ ref('stg_zendesk__audit_log') }}"
brand: "{{ ref('stg_zendesk__brand') }}"
daylight_time: "{{ ref('stg_zendesk__daylight_time') }}"
domain_name: "{{ ref('stg_zendesk__domain_name') }}"
field_history: "{{ ref('stg_zendesk__ticket_field_history') }}"
group: "{{ ref('stg_zendesk__group') }}"
organization_tag: "{{ ref('stg_zendesk__organization_tag') }}"
organization: "{{ ref('stg_zendesk__organization') }}"
schedule: "{{ ref('stg_zendesk__schedule') }}"
schedule_holiday: "{{ ref('stg_zendesk__schedule_holiday') }}"
ticket: "{{ ref('stg_zendesk__ticket') }}"
ticket_form_history: "{{ ref('stg_zendesk__ticket_form_history') }}"
schedule: "{{ ref('stg_zendesk__schedule') }}"
ticket_comment: "{{ ref('stg_zendesk__ticket_comment') }}"
field_history: "{{ ref('stg_zendesk__ticket_field_history') }}"
ticket_form_history: "{{ ref('stg_zendesk__ticket_form_history') }}"
ticket_schedule: "{{ ref('stg_zendesk__ticket_schedule') }}"
ticket_tag: "{{ ref('stg_zendesk__ticket_tag') }}"
ticket: "{{ ref('stg_zendesk__ticket') }}"
time_zone: "{{ ref('stg_zendesk__time_zone') }}"
user_tag: "{{ ref('stg_zendesk__user_tag') }}"
user: "{{ ref('stg_zendesk__user') }}"
daylight_time: "{{ ref('stg_zendesk__daylight_time') }}"
time_zone: "{{ ref('stg_zendesk__time_zone') }}"
using_schedules: true
using_domain_names: true
using_user_tags: true
2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.

1 change: 0 additions & 1 deletion docs/run_results.json

This file was deleted.

10 changes: 4 additions & 6 deletions integration_tests/dbt_project.yml
@@ -1,7 +1,7 @@
config-version: 2

name: 'zendesk_integration_tests'
version: '0.17.0'
version: '0.18.0'

profile: 'integration_tests'

@@ -25,6 +25,7 @@ vars:
zendesk_organization_tag_identifier: "organization_tag_data"
zendesk_user_identifier: "user_data"
zendesk_user_tag_identifier: "user_tag_data"
zendesk_audit_log_identifier: "audit_log_data"

## Uncomment for docs generation
# zendesk__unstructured_enabled: True
@@ -34,18 +35,15 @@ vars:
# using_domain_names: false
# using_user_tags: false
# using_organization_tags: false
# fivetran_integrity_sla_first_reply_time_exclusion_tickets: (1,56,80)
# fivetran_consistency_ticket_metrics_exclusion_tickets: (11092,11093,11094)
# fivetran_integrity_sla_count_match_tickets: (76)
# fivetran_integrity_sla_metric_parity_exclusion_tickets: (56,80)
# fivetran_integrity_sla_first_reply_time_exclusion_tickets: (56,80)

models:
+schema: "zendesk_{{ var('directed_schema','dev') }}"

seeds:
+quote_columns: "{{ true if target.type == 'redshift' else false }}"
zendesk_integration_tests:
+column_types:
_fivetran_synced: timestamp
+column_types:
_fivetran_synced: timestamp
group_data:
8 changes: 8 additions & 0 deletions integration_tests/seeds/audit_log_data.csv
@@ -0,0 +1,8 @@
id,_fivetran_synced,action,actor_id,change_description,created_at,source_id,source_label,source_type
579796,2024-05-28 21:53:06.793000,update,37253,"Workweek changed from {:sun=&amp;gt;{""01:45""=&amp;gt;""02:45""}, :mon=&amp;gt;{""09:00""=&amp;gt;""20:00""}, :tue=&amp;gt;{""09:00""=&amp;gt;""20:00""}, :wed=&amp;gt;{""08:00""=&amp;gt;""20:00""}, :thu=&amp;gt;{""08:00""=&amp;gt;""20:00""}, :fri=&amp;gt;{""08:00""=&amp;gt;""20:00""}} to {:sun=&amp;gt;{""03:00""=&amp;gt;""04:00""}, :mon=&amp;gt;{""08:00""=&amp;gt;""20:00""}, :tue=&amp;gt;{""08:00""=&amp;gt;""20:00""}, :wed=&amp;gt;{""07:15""=&amp;gt;""20:00""}, :thu=&amp;gt;{""07:15""=&amp;gt;""20:00""}, :fri=&amp;gt;{""07:15""=&amp;gt;""20:00""}}",2024-05-28 21:51:37.000000,18542,Workweek: Central US Schedule,zendesk/business_hours/workweek
2679952,2024-05-28 16:18:58.471000,update,37253,"Workweek changed from {:thu=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :fri=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :mon=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :tue=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :wed=&amp;gt;{""09:00""=&amp;gt;""17:00""}} to {:mon=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :tue=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :wed=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :thu=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :fri=&amp;gt;{""09:00""=&amp;gt;""17:00""}}",2024-05-21 11:20:29.000000,267996,Workweek: New schedule here,zendesk/business_hours/workweek
293556,2024-05-28 16:18:58.471000,update,37253,"Workweek changed from {} to {:mon=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :tue=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :wed=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :thu=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :fri=&amp;gt;{""09:00""=&amp;gt;""17:00""}}",2024-05-21 11:20:28.000000,267996,Workweek: New schedule here,zendesk/business_hours/workweek
4441364,2024-05-28 16:18:58.471000,update,37253,"Workweek changed from {:wed=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :thu=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :mon=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :tue=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :fri=&amp;gt;{""09:00""=&amp;gt;""17:00""}} to {:mon=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :tue=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :wed=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :thu=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :fri=&amp;gt;{""09:00""=&amp;gt;""17:00""}}",2024-05-21 11:20:10.000000,267996,Workweek: New schedule 2,zendesk/business_hours/workweek
70900,2024-05-28 16:18:58.471000,update,37253,"Workweek changed from {} to {:mon=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :tue=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :wed=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :thu=&amp;gt;{""09:00""=&amp;gt;""17:00""}, :fri=&amp;gt;{""09:00""=&amp;gt;""17:00""}}",2024-05-21 11:20:09.000000,267996,Workweek: New schedule 2,zendesk/business_hours/workweek
70901,2024-05-28 16:18:58.471000,update,37253,"Workweek changed from {&quot;mon&quot;:{&quot;10:00&quot;:&quot;20:00&quot;},&quot;tue&quot;:{&quot;10:00&quot;:&quot;20:00&quot;},&quot;wed&quot;:{&quot;10:00&quot;:&quot;20:00&quot;},&quot;thu&quot;:{&quot;10:00&quot;:&quot;20:00&quot;},&quot;fri&quot;:{&quot;10:00&quot;:&quot;20:00&quot;}} to {&quot;mon&quot;:{&quot;10:00&quot;:&quot;22:00&quot;},&quot;tue&quot;:{&quot;10:00&quot;:&quot;22:00&quot;},&quot;wed&quot;:{&quot;10:00&quot;:&quot;22:00&quot;},&quot;thu&quot;:{&quot;10:00&quot;:&quot;22:00&quot;},&quot;fri&quot;:{&quot;10:00&quot;:&quot;22:00&quot;}}",2024-05-21 11:20:09.000000,267996,Workweek: New schedule 2,zendesk/business_hours/workweek
70902,2024-05-28 16:18:58.471000,update,37253,"Workweek changed from {:mon=&amp;gt;{""09:00""=&amp;gt;""10:45"", ""11:45""=&amp;gt;""12:45"", ""13:45""=&amp;gt;""14:45"", ""15:15""=&amp;gt;""16:15"", ""19:00""=&amp;gt;""20:00"", ""17:30""=&amp;gt;""18:30""}, :tue=&amp;gt;{""00:15""=&amp;gt;""13:15"", ""13:30""=&amp;gt;""18:30"", ""18:45""=&amp;gt;""21:45"", ""22:00""=&amp;gt;""24:00""}, :wed=&amp;gt;{""09:00""=&amp;gt;""21:00""}, :thu=&amp;gt;{""17:00""=&amp;gt;""18:00"", ""19:45""=&amp;gt;""20:45"", ""09:00""=&amp;gt;""10:45"", ""12:15""=&amp;gt;""13:15"", ""14:30""=&amp;gt;""15:30""}, :fri=&amp;gt;{""09:00""=&amp;gt;""12:45"", ""19:15""=&amp;gt;""22:30"", ""14:45""=&amp;gt;""15:45"", ""17:30""=&amp;gt;""18:30""}} to {:mon=&amp;gt;{""09:00""=&amp;gt;""10:45"", ""11:45""=&amp;gt;""12:45"", ""13:45""=&amp;gt;""14:45"", ""15:15""=&amp;gt;""16:15"", ""17:30""=&amp;gt;""18:30"", ""19:00""=&amp;gt;""20:00""}, :tue=&amp;gt;{""00:15""=&amp;gt;""13:15"", ""13:30""=&amp;gt;""18:30"", ""18:45""=&amp;gt;""21:45"", ""22:00""=&amp;gt;""24:00""}, :wed=&amp;gt;{""02:30""=&amp;gt;""21:45""}, :thu=&amp;gt;{""09:00""=&amp;gt;""10:45"", ""12:15""=&amp;gt;""13:15"", ""14:30""=&amp;gt;""15:30"", ""17:00""=&amp;gt;""18:00"", ""19:45""=&amp;gt;""20:45""}, :fri=&amp;gt;{""09:00""=&amp;gt;""12:45"", ""14:45""=&amp;gt;""15:45"", ""17:30""=&amp;gt;""18:30"", ""19:15""=&amp;gt;""22:30""}}",2024-05-21 11:20:09.000000,267996,Workweek: New schedule 2,zendesk/business_hours/workweek
1 change: 0 additions & 1 deletion integration_tests/tests/integrity/metrics_count_match.sql
@@ -14,7 +14,6 @@ with stg_count as (
metric_count as (
select
count(*) as metric_ticket_count
from source
from {{ ref('zendesk__ticket_metrics') }}
)

7 changes: 7 additions & 0 deletions macros/clean_schedule.sql
@@ -0,0 +1,7 @@
{% macro clean_schedule(column_name) -%}
{{ return(adapter.dispatch('clean_schedule', 'zendesk')(column_name)) }}
{%- endmacro %}

{% macro default__clean_schedule(column_name) -%}
replace(replace(replace(replace(cast({{ column_name }} as {{ dbt.type_string() }}), '{', ''), '}', ''), '"', ''), ' ', '')
{%- endmacro %}
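To make the default dispatch concrete, here is a hypothetical Python equivalent of the macro's chained `replace` calls (the sample input is an assumption for illustration):

```python
def clean_schedule(value: str) -> str:
    """Mimic the macro: strip braces, double quotes, and spaces from the string."""
    for ch in ['{', '}', '"', ' ']:
        value = value.replace(ch, '')
    return value

# A schedule fragment like '{"09:00":"17:00"}' collapses to '09:00:17:00'.
print(clean_schedule('{"09:00":"17:00"}'))  # -> 09:00:17:00
```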
45 changes: 45 additions & 0 deletions macros/regex_extract.sql
@@ -0,0 +1,45 @@
{% macro regex_extract(string, day) -%}

{{ adapter.dispatch('regex_extract', 'zendesk') (string, day) }}

{%- endmacro %}

{% macro default__regex_extract(string, day) %}
{% set regex = "'.*?" ~ day ~ ".*?({.*?})'" %}
regexp_extract({{ string }}, {{ regex }} )

{%- endmacro %}

{% macro bigquery__regex_extract(string, day) %}
{% set regex = "'.*?" ~ day ~ ".*?({.*?})'" %}
regexp_extract({{ string }}, {{ regex }} )

{%- endmacro %}

{% macro snowflake__regex_extract(string, day) %}
{% set regex = "'.*?" ~ day ~ ".*?({.*?})'" %}

REGEXP_SUBSTR({{ string }}, {{ regex }}, 1, 1, 'e', 1 )

{%- endmacro %}

{% macro postgres__regex_extract(string, day) %}
{% set regex = "'.*?" ~ day ~ ".*?({.*?})'" %}

(regexp_matches({{ string }}, {{ regex }}))[1]

{%- endmacro %}

{% macro redshift__regex_extract(string, day) %}

{% set regex = '"' ~ day ~ '"' ~ ':\\\{([^\\\}]*)\\\}' -%}

'{' || REGEXP_SUBSTR({{ string }}, '{{ regex }}', 1, 1, 'e') || '}'

{%- endmacro %}

{% macro spark__regex_extract(string, day) %}
{% set regex = "'.*?" ~ day ~ ".*?({.*?})'" | replace("{", "\\\{") | replace("}", "\\\}") %}
regexp_extract({{ string }}, {{ regex }}, 1)

{%- endmacro %}
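A rough Python sketch of what the default `.*?<day>.*?({.*?})` pattern extracts — the first brace-delimited group following the day name in the cleaned change description (the sample string is an assumption for illustration):

```python
import re

def regex_extract(string: str, day: str):
    """Approximate the default dispatch: grab the first {...} after the day name."""
    match = re.search(rf'.*?{day}.*?({{.*?}})', string)
    return match.group(1) if match else None

cleaned = '"mon":{"09:00":"17:00"}, "tue":{"10:00":"18:00"}'
print(regex_extract(cleaned, 'tue'))  # -> {"10:00":"18:00"}
```

The Redshift and Spark dispatches differ because those engines require the braces themselves to be escaped in the pattern.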
180 changes: 180 additions & 0 deletions models/intermediate/int_zendesk__schedule_history.sql
@@ -0,0 +1,180 @@
{{ config(enabled=fivetran_utils.enabled_vars(['using_schedules','using_schedule_histories'])) }}

with audit_logs as (
select
cast(source_id as {{ dbt.type_string() }}) as schedule_id,
created_at,
lower(change_description) as change_description
from {{ var('audit_log') }}
where lower(change_description) like '%workweek changed from%'

-- the formats for change_description vary, so it needs to be cleaned
), audit_logs_enhanced as (
select
schedule_id,
rank() over (partition by schedule_id order by created_at desc) as schedule_id_index,
created_at,
-- Clean up the change_description, sometimes has random html stuff in it
replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(change_description,
'workweek changed from', ''),
'&quot;', '"'),
'amp;', ''),
'=&gt;', ':'), ':mon:', '"mon":'), ':tue:', '"tue":'), ':wed:', '"wed":'), ':thu:', '"thu":'), ':fri:', '"fri":'), ':sat:', '"sat":'), ':sun:', '"sun":')
as change_description_cleaned
from audit_logs

), split_to_from as (
select
schedule_id,
schedule_id_index,
created_at,
cast(created_at as date) as valid_from,
-- each change_description has two parts: 1-from the old schedule 2-to the new schedule.
{{ dbt.split_part('change_description_cleaned', "' to '", 1) }} as schedule_change_from,
{{ dbt.split_part('change_description_cleaned', "' to '", 2) }} as schedule_change
from audit_logs_enhanced

), find_same_day_changes as (
select
schedule_id,
schedule_id_index,
created_at,
valid_from,
schedule_change_from,
schedule_change,
row_number() over (
partition by schedule_id, valid_from -- valid from is type date
-- ordering to get the latest change when there are multiple on one day
order by schedule_id_index, schedule_change_from -- use the length of schedule_change_from to tie break, which will deprioritize empty "from" schedules
) as row_number
from split_to_from

-- multiple changes can occur on one day, so we will keep only the latest change in a day.
), consolidate_same_day_changes as (
select
schedule_id,
schedule_id_index,
created_at,
valid_from,
lead(valid_from) over (
partition by schedule_id order by schedule_id_index desc) as valid_until,
schedule_change
from find_same_day_changes
where row_number = 1

-- Creates a record for each day of the week for each schedule_change event.
-- This is done by iterating over the days of the week, extracting the corresponding
-- schedule data for each day, and unioning the results after each iteration.
), split_days as (
{% set days_of_week = {'sun': 0, 'mon': 1, 'tue': 2, 'wed': 3, 'thu': 4, 'fri': 5, 'sat': 6} %}
{% for day, day_number in days_of_week.items() %}
select
schedule_id,
schedule_id_index,
valid_from,
valid_until,
schedule_change,
'{{ day }}' as day_of_week,
cast('{{ day_number }}' as {{ dbt.type_int() }}) as day_of_week_number,
{{ zendesk.regex_extract('schedule_change', day) }} as day_of_week_schedule -- Extracts the schedule data specific to the current day from the schedule_change field.
from consolidate_same_day_changes
-- Exclude records with a null valid_until, which indicates it is the current schedule.
-- We will pull in the live schedule downstream, which is necessary when not using schedule histories.
where valid_until is not null

{% if not loop.last %}union all{% endif %}
{% endfor %}

-- A single day may contain multiple start and stop times, so we need to generate a separate record for each.
-- The day_of_week_schedule is structured like a JSON string, requiring warehouse-specific logic to flatten it into individual records.
{% if target.type == 'redshift' %}
-- using PartiQL syntax to work with redshift's SUPER types, which requires an extra CTE
), redshift_parse_schedule as (
-- Redshift requires another CTE for unnesting
select
schedule_id,
schedule_id_index,
valid_from,
valid_until,
schedule_change,
day_of_week,
day_of_week_number,
day_of_week_schedule,
json_parse('[' || replace(replace(day_of_week_schedule, ', ', ','), ',', '},{') || ']') as json_schedule

from split_days
where day_of_week_schedule != '{}' -- exclude when the day_of_week_schedule is empty.

), unnested_schedules as (
select
schedule_id,
schedule_id_index,
valid_from,
valid_until,
schedule_change,
day_of_week,
day_of_week_number,
-- go back to strings
cast(day_of_week_schedule as {{ dbt.type_string() }}) as day_of_week_schedule,
{{ clean_schedule('JSON_SERIALIZE(unnested_schedule)') }} as cleaned_unnested_schedule

from redshift_parse_schedule as schedules, schedules.json_schedule as unnested_schedule

{% else %}
), unnested_schedules as (
select
split_days.*,

{%- if target.type == 'bigquery' %}
{{ clean_schedule('unnested_schedule') }} as cleaned_unnested_schedule
from split_days
cross join unnest(json_extract_array('[' || replace(day_of_week_schedule, ',', '},{') || ']', '$')) as unnested_schedule

{%- elif target.type == 'snowflake' %}
unnested_schedule.key || ':' || unnested_schedule.value as cleaned_unnested_schedule
from split_days
cross join lateral flatten(input => parse_json(replace(replace(day_of_week_schedule, '\}\}', '\}'), '\{\{', '\{'))) as unnested_schedule

{%- elif target.type == 'postgres' %}
{{ clean_schedule('unnested_schedule::text') }} as cleaned_unnested_schedule
from split_days
cross join lateral jsonb_array_elements(('[' || replace(day_of_week_schedule, ',', '},{') || ']')::jsonb) as unnested_schedule

{%- elif target.type in ('databricks', 'spark') %}
{{ clean_schedule('unnested_schedule') }} as cleaned_unnested_schedule
from split_days
lateral view explode(from_json(concat('[', replace(day_of_week_schedule, ',', '},{'), ']'), 'array<string>')) as unnested_schedule

{% else %}
cast(null as {{ dbt.type_string() }}) as cleaned_unnested_schedule
from split_days
{%- endif %}

{% endif %}

-- Each cleaned_unnested_schedule will have the format hh:mm:hh:mm, so we can extract each time part.
), split_times as (
select
unnested_schedules.*,
cast(nullif({{ dbt.split_part('cleaned_unnested_schedule', "':'", 1) }}, ' ') as {{ dbt.type_int() }}) as start_time_hh,
cast(nullif({{ dbt.split_part('cleaned_unnested_schedule', "':'", 2) }}, ' ') as {{ dbt.type_int() }}) as start_time_mm,
cast(nullif({{ dbt.split_part('cleaned_unnested_schedule', "':'", 3) }}, ' ') as {{ dbt.type_int() }}) as end_time_hh,
cast(nullif({{ dbt.split_part('cleaned_unnested_schedule', "':'", 4) }}, ' ') as {{ dbt.type_int() }}) as end_time_mm
from unnested_schedules

-- Calculate the start_time and end_time as minutes from Sunday
), calculate_start_end_times as (
select
schedule_id,
schedule_id_index,
start_time_hh * 60 + start_time_mm + 24 * 60 * day_of_week_number as start_time,
end_time_hh * 60 + end_time_mm + 24 * 60 * day_of_week_number as end_time,
valid_from,
valid_until,
day_of_week,
day_of_week_number
from split_times
)

select *
from calculate_start_end_times
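The time math in the final two CTEs is easier to sanity-check outside SQL. A minimal Python sketch of the same conversion (function and variable names are illustrative, not part of the package):

```python
# Mirrors split_times + calculate_start_end_times above: a cleaned schedule entry
# has the format "hh:mm:hh:mm", and times become minutes from Sunday 00:00
# by adding 24 * 60 minutes per day_of_week_number.
DAYS_OF_WEEK = {"sun": 0, "mon": 1, "tue": 2, "wed": 3, "thu": 4, "fri": 5, "sat": 6}

def schedule_to_week_minutes(cleaned_schedule: str, day_of_week: str) -> tuple[int, int]:
    start_hh, start_mm, end_hh, end_mm = (int(part) for part in cleaned_schedule.split(":"))
    day_offset = 24 * 60 * DAYS_OF_WEEK[day_of_week]
    return (start_hh * 60 + start_mm + day_offset,
            end_hh * 60 + end_mm + day_offset)

# Monday 09:00-17:00 becomes (1980, 2460) minutes from Sunday midnight.
print(schedule_to_week_minutes("09:00:17:00", "mon"))
```

A full week is 7 * 24 * 60 = 10080 minutes, which is why the downstream spine logic can work on Sunday-to-Sunday weeks.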
111 changes: 111 additions & 0 deletions models/intermediate/int_zendesk__schedule_holiday.sql
@@ -0,0 +1,111 @@
{{ config(enabled=fivetran_utils.enabled_vars(['using_schedules','using_schedule_holidays'])) }}

/*
The purpose of this model is to create a spine of appropriate timezone offsets to use for schedules, as offsets may
change due to Daylight Savings. End result will include `valid_from` and `valid_until` columns which we will use downstream
to determine which schedule-offset to associate with each ticket (ie standard time vs daylight time).
*/


with schedule as (
select *
from {{ var('schedule') }}

), schedule_holiday as (
select *
from {{ var('schedule_holiday') }}

-- Converts holiday_start_date_at and holiday_end_date_at into daily timestamps and finds the week starts/ends using week_start.
), schedule_holiday_ranges as (
select
holiday_name,
schedule_id,
cast({{ dbt.date_trunc('day', 'holiday_start_date_at') }} as {{ dbt.type_timestamp() }}) as holiday_valid_from,
cast({{ dbt.date_trunc('day', 'holiday_end_date_at') }} as {{ dbt.type_timestamp() }}) as holiday_valid_until,
cast({{ dbt_date.week_start('holiday_start_date_at','UTC') }} as {{ dbt.type_timestamp() }}) as holiday_starting_sunday,
cast({{ dbt_date.week_start(dbt.dateadd('week', 1, 'holiday_end_date_at'),'UTC') }} as {{ dbt.type_timestamp() }}) as holiday_ending_sunday,
        -- Since the spine is based on weeks, holidays that span multiple weeks need to be broken up into weeks. The first step is to find those holidays.
{{ dbt.datediff('holiday_start_date_at', 'holiday_end_date_at', 'week') }} + 1 as holiday_weeks_spanned
from schedule_holiday

-- Creates a record for each week of multi-week holidays. Update valid_from and valid_until in the next cte.
), expanded_holidays as (
select
schedule_holiday_ranges.*,
cast(week_numbers.generated_number as {{ dbt.type_int() }}) as holiday_week_number
from schedule_holiday_ranges
-- Generate a sequence of numbers from 0 to the max number of weeks spanned, assuming a holiday won't span more than 52 weeks
cross join ({{ dbt_utils.generate_series(upper_bound=52) }}) as week_numbers
where schedule_holiday_ranges.holiday_weeks_spanned > 1
and week_numbers.generated_number <= schedule_holiday_ranges.holiday_weeks_spanned

-- Define start and end times for each segment of a multi-week holiday.
), split_multiweek_holidays as (

-- Business as usual for holidays that fall within a single week.
select
holiday_name,
schedule_id,
holiday_valid_from,
holiday_valid_until,
holiday_starting_sunday,
holiday_ending_sunday,
holiday_weeks_spanned
from schedule_holiday_ranges
where holiday_weeks_spanned = 1

union all

-- Split holidays by week that span multiple weeks since the schedule spine is based on weeks.
select
holiday_name,
schedule_id,
case
when holiday_week_number = 1 -- first week in multiweek holiday
then holiday_valid_from
-- We have to use days in case warehouse does not truncate to Sunday.
else cast({{ dbt.dateadd('day', '(holiday_week_number - 1) * 7', 'holiday_starting_sunday') }} as {{ dbt.type_timestamp() }})
end as holiday_valid_from,
case
when holiday_week_number = holiday_weeks_spanned -- last week in multiweek holiday
then holiday_valid_until
-- We have to use days in case warehouse does not truncate to Sunday.
else cast({{ dbt.dateadd('day', -1, dbt.dateadd('day', 'holiday_week_number * 7', 'holiday_starting_sunday')) }} as {{ dbt.type_timestamp() }}) -- saturday
end as holiday_valid_until,
case
when holiday_week_number = 1 -- first week in multiweek holiday
then holiday_starting_sunday
-- We have to use days in case warehouse does not truncate to Sunday.
else cast({{ dbt.dateadd('day', '(holiday_week_number - 1) * 7', 'holiday_starting_sunday') }} as {{ dbt.type_timestamp() }})
end as holiday_starting_sunday,
case
when holiday_week_number = holiday_weeks_spanned -- last week in multiweek holiday
then holiday_ending_sunday
-- We have to use days in case warehouse does not truncate to Sunday.
else cast({{ dbt.dateadd('day', 'holiday_week_number * 7', 'holiday_starting_sunday') }} as {{ dbt.type_timestamp() }})
end as holiday_ending_sunday,
holiday_weeks_spanned
from expanded_holidays
where holiday_weeks_spanned > 1

-- Create a record for each holiday start and each holiday end, per week, to use downstream.
), split_holidays as (
-- Creates a record that will be used for the time before a holiday
select
split_multiweek_holidays.*,
holiday_valid_from as holiday_date,
'0_gap' as holiday_start_or_end
from split_multiweek_holidays

union all

-- Creates another record that will be used for the holiday itself
select
split_multiweek_holidays.*,
holiday_valid_until as holiday_date,
'1_holiday' as holiday_start_or_end
from split_multiweek_holidays
)

select *
from split_holidays
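Because the schedule spine is week-based, multi-week holidays must be split into segments that never cross a Sunday boundary, which is what the CTEs above do with generated week numbers. A minimal Python sketch of the same idea (helper names are illustrative, and `week_start` approximates `dbt_date.week_start` with Sunday-start weeks):

```python
from datetime import date, timedelta

def week_start(d: date) -> date:
    """Truncate to the preceding (or same) Sunday; Python's weekday() is Mon=0..Sun=6."""
    return d - timedelta(days=(d.weekday() + 1) % 7)

def split_holiday_by_week(valid_from: date, valid_until: date) -> list[tuple[date, date]]:
    """Break an inclusive holiday date range into per-week segments.

    Each segment ends no later than the Saturday of its week, so no segment
    crosses a Sunday boundary, mirroring split_multiweek_holidays above.
    """
    segments = []
    start = valid_from
    while start <= valid_until:
        saturday = week_start(start) + timedelta(days=6)
        segment_end = min(valid_until, saturday)
        segments.append((start, segment_end))
        start = segment_end + timedelta(days=1)
    return segments

# A holiday spanning New Year's week splits at the Sunday boundary.
print(split_holiday_by_week(date(2024, 12, 27), date(2025, 1, 2)))
```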
524 changes: 208 additions & 316 deletions models/intermediate/int_zendesk__schedule_spine.sql

Large diffs are not rendered by default.

278 changes: 278 additions & 0 deletions models/intermediate/int_zendesk__schedule_timezones.sql
@@ -0,0 +1,278 @@
{{ config(enabled=var('using_schedules', True)) }}

with split_timezones as (
select *
from {{ ref('int_zendesk__timezone_daylight') }}

), schedule as (
select
*,
max(created_at) over (partition by schedule_id) as max_created_at
from {{ var('schedule') }}

{% if var('using_schedule_histories', True) %}
), schedule_history as (
select *
from {{ ref('int_zendesk__schedule_history') }}

-- Select the most recent timezone associated with each schedule based on
-- the max_created_at timestamp. Historical timezone changes are not yet tracked.
), schedule_id_timezone as (
select
distinct schedule_id,
lower(time_zone) as time_zone,
schedule_name
from schedule
where created_at = max_created_at

-- Combine historical schedules with the most recent timezone data. Filter
-- out records where the timezone is missing, indicating the schedule has
-- been deleted.
), schedule_history_timezones as (
select
schedule_history.schedule_id,
schedule_history.schedule_id_index,
schedule_history.start_time,
schedule_history.end_time,
schedule_history.valid_from,
schedule_history.valid_until,
lower(schedule_id_timezone.time_zone) as time_zone,
schedule_id_timezone.schedule_name
from schedule_history
left join schedule_id_timezone
on schedule_id_timezone.schedule_id = schedule_history.schedule_id
    -- We have to filter these records out since the time math requires a time zone.
    -- Revisit later if this becomes a bigger issue.
where time_zone is not null

-- Combine current schedules with historical schedules. Adjust the valid_from and valid_until dates accordingly.
), union_schedule_histories as (
select
schedule_id,
0 as schedule_id_index, -- set the index as 0 for the current schedule
created_at,
start_time,
end_time,
lower(time_zone) as time_zone,
schedule_name,
cast(null as date) as valid_from, -- created_at is when the schedule was first ever created, so we'll fill this value later
cast({{ dbt.current_timestamp() }} as date) as valid_until,
Contributor: I know you mentioned this was intentionally moved to the current timestamp, whereas it was looking forward a year before. However, there are now cases where the current schedule is only valid until the current week's starting Sunday. This means we could be causing tickets in the current week to be excluded from downstream calculations. For example, this is the most recent record I see for the Central Timezone schedule on 10/9: [image]

Contributor: Let me know if this is intentional, or if we should consider looking forward a bit for applicable schedules.

Contributor Author: Good callout. Added a look forward of a week.

False as is_historical
from schedule

union all

select
schedule_id,
schedule_id_index,
cast(null as {{ dbt.type_timestamp() }}) as created_at,
start_time,
end_time,
time_zone,
schedule_name,
cast(valid_from as date) as valid_from,
cast(valid_until as date) as valid_until,
True as is_historical
from schedule_history_timezones

-- Set the schedule_valid_from for current schedules based on the most recent historical row.
-- This allows the current schedule to pick up where the historical schedule left off.
), fill_current_schedule as (
select
schedule_id,
schedule_id_index,
start_time,
end_time,
time_zone,
schedule_name,
coalesce(case
when schedule_id_index = 0
-- get max valid_until from historical rows in the same schedule
then max(case when schedule_id_index > 0 then valid_until end)
over (partition by schedule_id)
else valid_from
end,
cast(created_at as date))
as schedule_valid_from,
valid_until as schedule_valid_until
from union_schedule_histories

-- Detect adjacent time periods by lagging the schedule_valid_until value
-- to identify effectively unchanged schedules.
), lag_valid_until as (
select
fill_current_schedule.*,
lag(schedule_valid_until) over (partition by schedule_id, start_time, end_time
order by schedule_valid_from, schedule_valid_until) as previous_valid_until
from fill_current_schedule

-- Identify distinct schedule groupings based on schedule_id, start_time, and end_time.
-- Consolidate only adjacent schedules; if a schedule changes and later reverts to its original time,
-- we want to maintain the intermediate schedule change.
), find_actual_changes as (
select
schedule_id,
schedule_id_index,
start_time,
end_time,
time_zone,
schedule_name,
schedule_valid_from,
schedule_valid_until,

-- The group_id increments only when there is a gap between the previous schedule's
-- valid_until and the current schedule's valid_from, signaling the schedules are not adjacent.
-- Adjacent schedules with the same start_time and end_time are grouped together,
-- while non-adjacent schedules are treated as separate groups.
sum(case when previous_valid_until = schedule_valid_from then 0 else 1 end) -- find if this row is adjacent to the previous row
over (partition by schedule_id, start_time, end_time
order by schedule_valid_from
rows between unbounded preceding and current row)
as group_id
from lag_valid_until

-- Consolidate records into continuous periods by finding the minimum
-- valid_from and maximum valid_until for each group.
), consolidate_changes as (
select
schedule_id,
start_time,
end_time,
time_zone,
schedule_name,
group_id,
min(schedule_id_index) as schedule_id_index, --helps with tracking downstream.
min(schedule_valid_from) as schedule_valid_from,
max(schedule_valid_until) as schedule_valid_until
from find_actual_changes
{{ dbt_utils.group_by(6) }}

-- For each schedule_id, reset the earliest schedule_valid_from date to 1970-01-01 for full schedule coverage.
), reset_schedule_start as (
select
schedule_id,
schedule_id_index,
time_zone,
schedule_name,
start_time,
end_time,
case
when schedule_valid_from = min(schedule_valid_from) over (partition by schedule_id) then '1970-01-01'
else schedule_valid_from
end as schedule_valid_from,
schedule_valid_until
from consolidate_changes

-- Adjust the schedule times to UTC by applying the timezone offset. Join all possible
-- time_zone matches for each schedule. The erroneous timezones will be filtered next.
), schedule_timezones as (
select
reset_schedule_start.schedule_id,
reset_schedule_start.schedule_id_index,
reset_schedule_start.time_zone,
reset_schedule_start.schedule_name,
coalesce(split_timezones.offset_minutes, 0) as offset_minutes,
reset_schedule_start.start_time - coalesce(split_timezones.offset_minutes, 0) as start_time_utc,
reset_schedule_start.end_time - coalesce(split_timezones.offset_minutes, 0) as end_time_utc,
cast(reset_schedule_start.schedule_valid_from as {{ dbt.type_timestamp() }}) as schedule_valid_from,
cast(reset_schedule_start.schedule_valid_until as {{ dbt.type_timestamp() }}) as schedule_valid_until,
-- we'll use these to determine which schedule version to associate tickets with.
cast({{ dbt.date_trunc('day', 'split_timezones.valid_from') }} as {{ dbt.type_timestamp() }}) as timezone_valid_from,
cast({{ dbt.date_trunc('day', 'split_timezones.valid_until') }} as {{ dbt.type_timestamp() }}) as timezone_valid_until
from reset_schedule_start
left join split_timezones
on split_timezones.time_zone = reset_schedule_start.time_zone

-- Assemble the final schedule-timezone relationship by determining the correct
-- schedule_valid_from and schedule_valid_until based on overlapping periods
-- between the schedule and timezone.
), final_schedule as (
select
schedule_id,
schedule_id_index,
time_zone,
schedule_name,
offset_minutes,
start_time_utc,
end_time_utc,
timezone_valid_from,
timezone_valid_until,
-- Be very careful if changing the order of these case whens--it does matter!
case
-- timezone that a schedule start falls within
when schedule_valid_from >= timezone_valid_from and schedule_valid_from < timezone_valid_until
then schedule_valid_from
-- timezone that a schedule end falls within
when schedule_valid_until >= timezone_valid_from and schedule_valid_until < timezone_valid_until
then timezone_valid_from
-- timezones that fall completely within the bounds of the schedule
when timezone_valid_from >= schedule_valid_from and timezone_valid_until < schedule_valid_until
then timezone_valid_from
end as schedule_valid_from,
case
-- timezone that a schedule end falls within
when schedule_valid_until >= timezone_valid_from and schedule_valid_until < timezone_valid_until
then schedule_valid_until
-- timezone that a schedule start falls within
when schedule_valid_from >= timezone_valid_from and schedule_valid_from < timezone_valid_until
then timezone_valid_until
-- timezones that fall completely within the bounds of the schedule
when timezone_valid_from >= schedule_valid_from and timezone_valid_until < schedule_valid_until
then timezone_valid_until
end as schedule_valid_until

from schedule_timezones

-- Filter records based on whether the schedule periods overlap with timezone periods. Capture
-- when a schedule start or end falls within a time zone, and also capture timezones that exist
-- entirely within the bounds of a schedule.
-- timezone that a schedule start falls within
where (schedule_valid_from >= timezone_valid_from and schedule_valid_from < timezone_valid_until)
-- timezone that a schedule end falls within
or (schedule_valid_until >= timezone_valid_from and schedule_valid_until < timezone_valid_until)
-- timezones that fall completely within the bounds of the schedule
or (timezone_valid_from >= schedule_valid_from and timezone_valid_until < schedule_valid_until)

{% else %} -- when not using schedule histories
), final_schedule as (
select
schedule.schedule_id,
0 as schedule_id_index,
lower(schedule.time_zone) as time_zone,
schedule.schedule_name,
coalesce(split_timezones.offset_minutes, 0) as offset_minutes,
schedule.start_time - coalesce(split_timezones.offset_minutes, 0) as start_time_utc,
schedule.end_time - coalesce(split_timezones.offset_minutes, 0) as end_time_utc,
cast({{ dbt.date_trunc('day', 'split_timezones.valid_from') }} as {{ dbt.type_timestamp() }}) as schedule_valid_from,
cast({{ dbt.date_trunc('day', 'split_timezones.valid_until') }} as {{ dbt.type_timestamp() }}) as schedule_valid_until,
cast({{ dbt.date_trunc('day', 'split_timezones.valid_from') }} as {{ dbt.type_timestamp() }}) as timezone_valid_from,
cast({{ dbt.date_trunc('day', 'split_timezones.valid_until') }} as {{ dbt.type_timestamp() }}) as timezone_valid_until
from schedule
left join split_timezones
on split_timezones.time_zone = lower(schedule.time_zone)
{% endif %}

), final as (
select
schedule_id,
schedule_id_index,
time_zone,
schedule_name,
offset_minutes,
start_time_utc,
end_time_utc,
schedule_valid_from,
schedule_valid_until,
-- use dbt_date.week_start to ensure we truncate to Sunday
cast({{ dbt_date.week_start('schedule_valid_from','UTC') }} as {{ dbt.type_timestamp() }}) as schedule_starting_sunday,
cast({{ dbt_date.week_start('schedule_valid_until','UTC') }} as {{ dbt.type_timestamp() }}) as schedule_ending_sunday,
        -- Check if the start of the schedule came from a schedule or timezone change, for tracking downstream.
case when schedule_valid_from = timezone_valid_from
then 'timezone'
else 'schedule'
end as change_type
from final_schedule
)

select *
from final
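The paired case-when blocks and the overlap filter in `final_schedule` together amount to intersecting two half-open date intervals, [schedule_valid_from, schedule_valid_until) and [timezone_valid_from, timezone_valid_until). A compact Python sketch of that intersection (illustrative names, not part of the package):

```python
from datetime import date

def intersect_windows(schedule_window, timezone_window):
    """Intersect two half-open [valid_from, valid_until) windows.

    Equivalent to the case-when pairs above: the result starts at the later
    valid_from and ends at the earlier valid_until; no overlap returns None.
    """
    start = max(schedule_window[0], timezone_window[0])
    end = min(schedule_window[1], timezone_window[1])
    return (start, end) if start < end else None

# A schedule valid for all of 2024 against one DST window yields just the DST slice.
print(intersect_windows(
    (date(2024, 1, 1), date(2025, 1, 1)),
    (date(2024, 3, 10), date(2024, 11, 3)),
))
```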
@@ -15,7 +15,7 @@ with ticket_status_history as (
valid_ending_at,
{{ dbt.datediff(
'valid_starting_at',
"coalesce(valid_ending_at, " ~ dbt.current_timestamp_backcompat() ~ ")",
"coalesce(valid_ending_at, " ~ dbt.current_timestamp() ~ ")",
'minute') }} as status_duration_calendar_minutes,
value as status,
-- MIGHT BE ABLE TO DELETE ROWS BELOW
2 changes: 1 addition & 1 deletion models/intermediate/int_zendesk__ticket_schedules.sql
@@ -76,7 +76,7 @@ with ticket as (
schedule_id,
schedule_created_at,
coalesce(lead(schedule_created_at) over (partition by ticket_id order by schedule_created_at)
, {{ fivetran_utils.timestamp_add("hour", 1000, "" ~ dbt.current_timestamp_backcompat() ~ "") }} ) as schedule_invalidated_at
, {{ fivetran_utils.timestamp_add("hour", 1000, "" ~ dbt.current_timestamp() ~ "") }} ) as schedule_invalidated_at
from schedule_events

)
@@ -182,10 +182,9 @@ with agent_work_time_filtered_statuses as (
{{ fivetran_utils.timestamp_add(
"minute",
"cast(((7*24*60) * week_number) + breach_minutes_from_week as " ~ dbt.type_int() ~ " )",
"" ~ dbt.date_trunc('week', 'valid_starting_at') ~ "",
"cast(" ~ dbt_date.week_start('valid_starting_at','UTC') ~ " as " ~ dbt.type_timestamp() ~ " )"
) }} as sla_breach_at
from intercepted_periods_agent_filtered

)

select *
@@ -17,7 +17,7 @@ with agent_work_time_sla as (
greatest(ticket_historical_status.valid_starting_at, agent_work_time_sla.sla_applied_at) as valid_starting_at,
coalesce(
ticket_historical_status.valid_ending_at,
{{ fivetran_utils.timestamp_add('day', 30, "" ~ dbt.current_timestamp_backcompat() ~ "") }} ) as valid_ending_at, --assumes current status continues into the future. This is necessary to predict future SLA breaches (not just past).
{{ fivetran_utils.timestamp_add('day', 30, "" ~ dbt.current_timestamp() ~ "") }} ) as valid_ending_at, --assumes current status continues into the future. This is necessary to predict future SLA breaches (not just past).
ticket_historical_status.status as ticket_status,
agent_work_time_sla.sla_applied_at,
agent_work_time_sla.target,
2 changes: 1 addition & 1 deletion models/sla_policy/int_zendesk__sla_policy_applied.sql
@@ -47,7 +47,7 @@ with ticket_field_history as (
left join sla_policy_name
on sla_policy_name.ticket_id = sla_policy_applied.ticket_id
and sla_policy_applied.valid_starting_at >= sla_policy_name.valid_starting_at
and sla_policy_applied.valid_starting_at < coalesce(sla_policy_name.valid_ending_at, {{ dbt.current_timestamp_backcompat() }})
and sla_policy_applied.valid_starting_at < coalesce(sla_policy_name.valid_ending_at, {{ dbt.current_timestamp() }})
where sla_policy_applied.latest_sla = 1
)

@@ -173,7 +173,7 @@ with ticket_schedules as (
select
*,
schedule_end_time + remaining_minutes as breached_at_minutes,
{{ dbt.date_trunc('week', 'sla_applied_at') }} as starting_point,
{{ dbt_date.week_start('sla_applied_at','UTC') }} as starting_point,
{{ fivetran_utils.timestamp_add(
"minute",
"cast(((7*24*60) * week_number) + (schedule_end_time + remaining_minutes) as " ~ dbt.type_int() ~ " )",
@@ -182,7 +182,7 @@ with requester_wait_time_filtered_statuses as (
{{ fivetran_utils.timestamp_add(
"minute",
"cast(((7*24*60) * week_number) + breach_minutes_from_week as " ~ dbt.type_int() ~ " )",
"" ~ dbt.date_trunc('week', 'valid_starting_at') ~ "",
"cast(" ~ dbt_date.week_start('valid_starting_at','UTC') ~ " as " ~ dbt.type_timestamp() ~ " )"
) }} as sla_breach_at
from intercepted_periods_agent_filtered

@@ -17,7 +17,7 @@ with requester_wait_time_sla as (
greatest(ticket_historical_status.valid_starting_at, requester_wait_time_sla.sla_applied_at) as valid_starting_at,
coalesce(
ticket_historical_status.valid_ending_at,
{{ fivetran_utils.timestamp_add('day', 30, "" ~ dbt.current_timestamp_backcompat() ~ "") }} ) as valid_ending_at, --assumes current status continues into the future. This is necessary to predict future SLA breaches (not just past).
{{ fivetran_utils.timestamp_add('day', 30, "" ~ dbt.current_timestamp() ~ "") }} ) as valid_ending_at, --assumes current status continues into the future. This is necessary to predict future SLA breaches (not just past).
ticket_historical_status.status as ticket_status,
requester_wait_time_sla.sla_applied_at,
requester_wait_time_sla.target,
@@ -21,7 +21,7 @@ with calendar as (
select
*,
-- closed tickets cannot be re-opened or updated, and solved tickets are automatically closed after a pre-defined number of days configured in your Zendesk settings
cast( {{ dbt.date_trunc('day', "case when status != 'closed' then " ~ dbt.current_timestamp_backcompat() ~ " else updated_at end") }} as date) as open_until
cast( {{ dbt.date_trunc('day', "case when status != 'closed' then " ~ dbt.current_timestamp() ~ " else updated_at end") }} as date) as open_until
from {{ var('ticket') }}

), joined as (
41 changes: 19 additions & 22 deletions models/utils/int_zendesk__calendar_spine.sql
@@ -1,42 +1,39 @@
-    -- depends_on: {{ source('zendesk', 'ticket') }}
+    -- depends_on: {{ var('ticket') }}
     with spine as (

-        {% if execute %}
-        {% set current_ts = dbt.current_timestamp_backcompat() %}
-        {% set first_date_query %}
-            select min( created_at ) as min_date from {{ source('zendesk', 'ticket') }}
-            -- by default take all the data
-            where cast(created_at as date) >= {{ dbt.dateadd('year', - var('ticket_field_history_timeframe_years', 50), current_ts ) }}
-        {% endset %}
-
-        {% set first_date = run_query(first_date_query).columns[0][0]|string %}
-
-        {% if target.type == 'postgres' %}
-        {% set first_date_adjust = "cast('" ~ first_date[0:10] ~ "' as date)" %}
-        {% else %}
-        {% set first_date_adjust = "'" ~ first_date[0:10] ~ "'" %}
-        {% endif %}
-
-        {% else %} {% set first_date_adjust = "2016-01-01" %}
+        {% if execute and flags.WHICH in ('run', 'build') %}
+
+        {%- set first_date_query %}
+            select
+                coalesce(
+                    min(cast(created_at as date)),
+                    cast({{ dbt.dateadd("month", -1, "current_date") }} as date)
+                ) as min_date
+            from {{ var('ticket') }}
+            -- by default take all the data
+            where cast(created_at as date) >= {{ dbt.dateadd('year',
+                - var('ticket_field_history_timeframe_years', 50), "current_date") }}
+        {% endset -%}
+
+        {%- set first_date = dbt_utils.get_single_value(first_date_query) %}
+
+        {% else %}
+        {%- set first_date = '2016-01-01' %}
         {% endif %}

     {{
         dbt_utils.date_spine(
             datepart = "day",
-            start_date = first_date_adjust,
+            start_date = "cast('" ~ first_date ~ "' as date)",
             end_date = dbt.dateadd("week", 1, "current_date")
         )
     }}

     ), recast as (

-        select cast(date_day as date) as date_day
+        select
+            cast(date_day as date) as date_day
         from spine

     )

     select *
97 changes: 97 additions & 0 deletions models/utils/int_zendesk__timezone_daylight.sql
@@ -0,0 +1,97 @@
{{ config(enabled=var('using_schedules', True)) }}

with timezone as (

select *
from {{ var('time_zone') }}

), daylight_time as (

select *
from {{ var('daylight_time') }}

), timezone_with_dt as (

select
timezone.*,
daylight_time.daylight_start_utc,
daylight_time.daylight_end_utc,
daylight_time.daylight_offset_minutes

from timezone
left join daylight_time
on timezone.time_zone = daylight_time.time_zone

), order_timezone_dt as (

select
*,
-- will be null for timezones without any daylight savings records (and the first entry)
    -- we will coalesce the first entry's date with 1970-01-01 (the epoch start) downstream
lag(daylight_end_utc, 1) over (partition by time_zone order by daylight_end_utc asc) as last_daylight_end_utc,
-- will be null for timezones without any daylight savings records (and the last entry)
-- we will coalesce the last entry date with the current date
lead(daylight_start_utc, 1) over (partition by time_zone order by daylight_start_utc asc) as next_daylight_start_utc

from timezone_with_dt

), split_timezones as (

-- standard (includes timezones without DT)
-- starts: when the last Daylight Savings ended
-- ends: when the next Daylight Savings starts
select
time_zone,
standard_offset_minutes as offset_minutes,

-- last_daylight_end_utc is null for the first record of the time_zone's daylight time, or if the TZ doesn't use DT
coalesce(last_daylight_end_utc, cast('1970-01-01' as date)) as valid_from,

-- daylight_start_utc is null for timezones that don't use DT
coalesce(daylight_start_utc, cast( {{ dbt.dateadd('year', 1, dbt.current_timestamp()) }} as date)) as valid_until

from order_timezone_dt

union all

-- DT (excludes timezones without it)
-- starts: when this Daylight Savings started
-- ends: when this Daylight Savings ends
select
time_zone,
-- Pacific Time is -8h during standard time and -7h during DT
standard_offset_minutes + daylight_offset_minutes as offset_minutes,
daylight_start_utc as valid_from,
daylight_end_utc as valid_until

from order_timezone_dt
where daylight_offset_minutes is not null

union all

select
time_zone,
standard_offset_minutes as offset_minutes,

-- Get the latest daylight_end_utc time and set that as the valid_from
max(daylight_end_utc) as valid_from,

    -- If the latest daylight_end_utc is before today's date, DST has ended for this time zone, so we set the valid_until in the future.
cast( {{ dbt.dateadd('year', 1, dbt.current_timestamp()) }} as date) as valid_until

from order_timezone_dt
group by 1, 2
    -- We only want to apply this logic to time_zones that observed daylight saving time at some point but no longer do. For example, Hong Kong ended DST in 1979.
having cast(max(daylight_end_utc) as date) < cast({{ dbt.current_timestamp() }} as date)

), final as (
select
lower(time_zone) as time_zone,
offset_minutes,
cast(valid_from as {{ dbt.type_timestamp() }}) as valid_from,
cast(valid_until as {{ dbt.type_timestamp() }}) as valid_until
from split_timezones
)

select *
from final
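The three unioned branches above carve each time zone's history into contiguous offset windows: standard time between DST periods, the DST periods themselves, and a trailing standard window for zones that abandoned DST. A simplified Python sketch of that expansion, assuming the DST periods are already paired per zone (names and the far-future sentinel are illustrative):

```python
from datetime import date

def dst_windows(standard_offset, dst_periods, far_future=date(2099, 1, 1)):
    """Expand (daylight_start, daylight_end, daylight_offset) periods into
    contiguous (offset_minutes, valid_from, valid_until) windows, mirroring
    the split_timezones CTE above; 1970-01-01 anchors the first window."""
    windows = []
    prev_end = date(1970, 1, 1)
    for start, end, dst_offset in sorted(dst_periods):
        windows.append((standard_offset, prev_end, start))          # standard time
        windows.append((standard_offset + dst_offset, start, end))  # daylight time
        prev_end = end
    windows.append((standard_offset, prev_end, far_future))         # trailing standard time
    return windows

# Pacific Time is -480 minutes standard, -420 during daylight time.
print(dst_windows(-480, [(date(2024, 3, 10), date(2024, 11, 3), 60)]))
```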
4 changes: 2 additions & 2 deletions models/zendesk__sla_policies.sql
@@ -123,11 +123,11 @@ select
in_business_hours,
sla_breach_at,
case when sla_elapsed_time is null
then ({{ dbt.datediff("sla_applied_at", dbt.current_timestamp_backcompat(), 'second') }} / 60) --This will create an entry for active sla's
then ({{ dbt.datediff("sla_applied_at", dbt.current_timestamp(), 'second') }} / 60) --This will create an entry for active sla's
else sla_elapsed_time
end as sla_elapsed_time,
sla_breach_at > current_timestamp as is_active_sla,
case when (sla_breach_at > {{ dbt.current_timestamp_backcompat() }})
case when (sla_breach_at > {{ dbt.current_timestamp() }})
then null
else is_sla_breached
end as is_sla_breach
8 changes: 4 additions & 4 deletions models/zendesk__ticket_metrics.sql
@@ -104,16 +104,16 @@ select
coalesce(ticket_comments.count_agent_replies, 0) as total_agent_replies,

case when ticket_enriched.is_requester_active = true and ticket_enriched.requester_last_login_at is not null
then ({{ dbt.datediff("ticket_enriched.requester_last_login_at", dbt.current_timestamp_backcompat(), 'second') }} /60)
then ({{ dbt.datediff("ticket_enriched.requester_last_login_at", dbt.current_timestamp(), 'second') }} /60)
end as requester_last_login_age_minutes,
case when ticket_enriched.is_assignee_active = true and ticket_enriched.assignee_last_login_at is not null
then ({{ dbt.datediff("ticket_enriched.assignee_last_login_at", dbt.current_timestamp_backcompat(), 'second') }} /60)
then ({{ dbt.datediff("ticket_enriched.assignee_last_login_at", dbt.current_timestamp(), 'second') }} /60)
end as assignee_last_login_age_minutes,
case when lower(ticket_enriched.status) not in ('solved','closed')
then ({{ dbt.datediff("ticket_enriched.created_at", dbt.current_timestamp_backcompat(), 'second') }} /60)
then ({{ dbt.datediff("ticket_enriched.created_at", dbt.current_timestamp(), 'second') }} /60)
end as unsolved_ticket_age_minutes,
case when lower(ticket_enriched.status) not in ('solved','closed')
then ({{ dbt.datediff("ticket_enriched.updated_at", dbt.current_timestamp_backcompat(), 'second') }} /60)
then ({{ dbt.datediff("ticket_enriched.updated_at", dbt.current_timestamp(), 'second') }} /60)
end as unsolved_ticket_age_since_update_minutes,
case when lower(ticket_enriched.status) in ('solved','closed') and ticket_comments.is_one_touch_resolution
then true
7 changes: 5 additions & 2 deletions packages.yml
@@ -1,5 +1,8 @@
packages:
- package: fivetran/zendesk_source
version: [">=0.12.0", "<0.13.0"]
# - package: fivetran/zendesk_source
# version: [">=0.12.0", "<0.13.0"]
- git: https://github.com/fivetran/dbt_zendesk_source.git
revision: feature/historical-schedules
warn-unpinned: false
- package: calogica/dbt_date
version: [">=0.9.0", "<1.0.0"]