# v0.17.0 dbt_zendesk
fivetran-data-model-bot released this on 04 Sep 20:57 · 29 commits to main since this release
## New model (#161)
- Addition of the `zendesk__document` model, designed to structure Zendesk textual data for vectorization and integration into NLP workflows. The model outputs a table with:
  - `document_id`: Corresponding to the `ticket_id`
  - `chunk_index`: For text segmentation
  - `chunk`: The text chunk itself
  - `chunk_tokens_approximate`: Approximate token count for each segment
- This model is currently disabled by default. You may enable it by setting the `zendesk__unstructured_enabled` variable to `true` in your `dbt_project.yml` (see the configuration sketch after this list).
- This model was developed with chunk sizes limited to approximately 5000 tokens for use with OpenAI; however, you can change this limit by setting the `zendesk_max_tokens` variable in your `dbt_project.yml`.
- See the README section "Enabling the unstructured document model for NLP" for more information.
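As a rough configuration sketch (the variable names come from this package; the values shown are only examples), enabling the model and adjusting the token limit in `dbt_project.yml` might look like:

```yml
# dbt_project.yml -- example values only
vars:
  zendesk__unstructured_enabled: true   # enables the zendesk__document model (disabled by default)
  zendesk_max_tokens: 5000              # approximate token limit per chunk; tune for your embedding model
```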
## Breaking Changes (Full refresh required after upgrading)
- Incremental models running on BigQuery have had the `partition_by` logic adjusted to include a granularity of a month. This change only impacts BigQuery warehouses and was applied to avoid the common `too many partitions` error some users have experienced when partitioning by day. Adjusting the partition to a month granularity decreases the number of partitions created and allows for more performant querying and incremental loads. This change was applied to the following models (#165); an illustrative sketch of the month-granularity configuration follows this list:
  - `int_zendesk__field_calendar_spine`
  - `int_zendesk__field_history_pivot`
  - `zendesk__ticket_field_history`
- In the dbt_zendesk_source v0.12.0 release, the field `_fivetran_deleted` was added to the following models for use in the `zendesk__document` model (#161):
  - `stg_zendesk__ticket`
  - `stg_zendesk__ticket_comment`
  - `stg_zendesk__user`
  - If you have already added `_fivetran_deleted` as a passthrough column via the `zendesk__ticket_passthrough_columns` or `zendesk__user_passthrough_columns` variable, you will need to remove this field from the variable or alias it to avoid duplicate column errors (an example aliasing sketch also follows this list).
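For illustration only, here is a sketch of what a month-granularity `partition_by` looks like in dbt on BigQuery. The package applies this change internally, so no action is required beyond the full refresh; the field name below is an example, not the exact column used by every model.

```yml
# Illustrative sketch of a month-granularity BigQuery partition config in dbt.
# The package sets this within the affected models; you do not need to add it yourself.
models:
  zendesk:
    +partition_by:
      field: "date_day"      # example partitioning column
      data_type: "date"
      granularity: "month"   # month-level partitions instead of day-level
```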
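If you prefer to alias the column rather than remove it, a hedged sketch of what that could look like in `dbt_project.yml` is below. The name/alias mapping follows the general Fivetran passthrough-column convention; confirm the exact format against the dbt_zendesk README.

```yml
# Hedged sketch: aliasing _fivetran_deleted in a passthrough variable to avoid
# duplicate column errors. Confirm the mapping format in the package README.
vars:
  zendesk__ticket_passthrough_columns:
    - name: "_fivetran_deleted"
      alias: "ticket_fivetran_deleted"   # hypothetical alias; any non-conflicting name works
```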
## Bug Fixes
- Fixed an issue in the `zendesk__sla_policies` model where tickets that were opened and solved outside of scheduled hours were not being reported, specifically for the metrics `requester_wait_time` and `agent_work_time`.
- Fixed an issue in the `zendesk__ticket_metrics` model where certain tickets had miscalculated metrics.
  - Resolved by adjusting the join logic in the models `int_zendesk__ticket_work_time_business`, `int_zendesk__ticket_first_resolution_time_business`, and `int_zendesk__ticket_full_resolution_time_business`. (#167)
## Under the hood
- Added integrity validations:
  - Modified the `consistency_sla_policy_count` validation test to group by `ticket_id` for more accurate testing. (#165)
- Updated casting in joins from timestamps to dates so that the whole day is considered. This produces more accurate results. (#164, #156, #167)
- Reduced the weeks looking ahead from 208 to 52 to improve performance, as tracking ticket SLAs beyond one year was unnecessary. (#156, #167)
- Updated seed files to reflect a real-world ticket field history update scenario. (#165)
Full Changelog: v0.16.0...v0.17.0