[DRC-1007] [Bug] Issue with Environment Variable in Recce row_count_diff Check #566

Open
MarcosShalionDE opened this issue Jan 2, 2025 · 2 comments
Labels
bug Something isn't working linear Created by Linear-GitHub Sync triage Triage required

MarcosShalionDE commented Jan 2, 2025

Current Behavior

Hi team,
I am encountering an issue when using Recce to evaluate a PR. Specifically, I am seeing the following error:

"Env var required but not provided."

From my understanding, this happens because DBT includes the Jinja expressions in their raw form in the manifest.json file when the nodes are generated. While DBT correctly evaluates the variables at runtime, some configurations are recorded in the manifest as literal expressions, rather than being evaluated beforehand.

For example, in my dbt_project.yml I define the following configuration:

dbt_models:
  staging:
    +materialized: view
    +database: "{{env_var('DBT_ENVIRONMENT_NAME','DEV')}}_STAGING"

This results in the following configuration being written in the manifest.json:

"unrendered_config": {
    "copy_grants": true,
    "persist_docs": {...},
    "pre-hook": [...],
    "post-hook": [...],
    "materialized": "view",
    "database": "{{env_var('DBT_ENVIRONMENT_NAME','DEV')}}_STAGING"
}
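To see which nodes carry an unevaluated env_var expression, the manifest can be scanned directly. The following is a minimal sketch, assuming the manifest keeps its nodes under a top-level "nodes" key with a per-node "unrendered_config" mapping, as in the excerpt above; the function name find_unrendered_env_vars is hypothetical and not part of dbt or Recce.

```python
import re

# Matches an env_var(...) call inside a Jinja expression and captures
# the variable name (the first quoted argument).
ENV_VAR_CALL = re.compile(r"env_var\(\s*['\"]([^'\"]+)['\"]")


def find_unrendered_env_vars(manifest: dict) -> dict:
    """Map each node id to the env var names found in its unrendered_config."""
    found = {}
    for node_id, node in manifest.get("nodes", {}).items():
        names = set()
        for value in node.get("unrendered_config", {}).values():
            # Only top-level string values are inspected in this sketch.
            if isinstance(value, str):
                names.update(ENV_VAR_CALL.findall(value))
        if names:
            found[node_id] = sorted(names)
    return found
```

It could be called with the result of json.load on target/manifest.json. Nested values such as hooks are not inspected here, so an empty result does not rule out other references.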

Although the variable DBT_ENVIRONMENT_NAME is correctly defined in the environment, it seems that Recce cannot access or interpret it properly. This issue arises regardless of whether Recce is run locally or within a GitHub Action. The error suggests that the variable is required but not being passed correctly during the evaluation process. This may be due to DBT storing the Jinja expression as-is in the manifest.json file, leaving it unevaluated for Recce's runtime checks.
Running it locally: [image]
Running it in GitHub Actions: [image]

Questions

  1. Is there a way to pass the environment variable to Recce so that it can evaluate it?
  2. Can DBT be configured to avoid including unprocessed Jinja expressions in the manifest.json?
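One way to rule out the variable simply not reaching the process is to launch the command from a small wrapper that checks the environment first. This is only a sketch under assumptions: run_with_required_env is a hypothetical helper, the command is whatever Recce or dbt invocation is in use, and nothing here establishes how Recce itself reads its environment.

```python
import os
import subprocess


def run_with_required_env(command, required, extra_env=None):
    """Run `command` with the current environment plus `extra_env`.

    Raises RuntimeError before launching if any name in `required`
    is unset or empty in the resulting environment.
    """
    env = dict(os.environ)
    env.update(extra_env or {})
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # The child process receives exactly the environment built above.
    return subprocess.run(command, env=env, capture_output=True, text=True, check=False)
```

If the wrapper reports the variable as present and the error persists, the cause is more likely in how the expression is rendered than in how the variable is passed.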

Any guidance or suggestions would be greatly appreciated!

Expected Behavior

Recce should correctly evaluate the environment variable DBT_ENVIRONMENT_NAME during the row_count_diff check, whether it is run locally or within a GitHub Action. This would allow the Jinja expressions in the DBT configuration to be resolved and prevent the "Env var required but not provided" error.

Steps To Reproduce

  1. Define the DBT_ENVIRONMENT_NAME variable in the environment.
  2. Configure dbt_project.yml to use the variable in a Jinja expression (e.g., +database: "{{env_var('DBT_ENVIRONMENT_NAME', 'DEV')}}_STAGING").
  3. Run the Recce row_count_diff check locally or within a GitHub Action.
  4. Observe the error: "Env var required but not provided."
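For reference while reproducing, here is a simplified stand-in for the documented behavior of env_var: return the value if set, fall back to the default if one is given, otherwise raise. It is not dbt's implementation; the class name and the environ parameter are assumptions made for illustration, and only the error wording is taken from the report above.

```python
import os


class EnvVarMissingError(Exception):
    """Raised when a required variable is unset and no default was given."""


def env_var(name, default=None, environ=None):
    """Simplified model of env_var lookup (illustrative, not dbt's code)."""
    environ = os.environ if environ is None else environ
    if name in environ:
        return environ[name]
    if default is not None:
        return default
    raise EnvVarMissingError(f"Env var required but not provided: '{name}'")
```

Under this model, a call that supplies a default such as 'DEV' would not raise even when the variable is unset, so it may be worth checking whether some other env_var call without a default is the one being evaluated.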

Environment

  • recce: 0.46.0
  • OS: macOS 14.5
  • Python: 3.9.19
  • Data Warehouse: snowflake 8.47.1
  • dbt: 1.8.0

DRC-1007

@MarcosShalionDE MarcosShalionDE added bug Something isn't working triage Triage required labels Jan 2, 2025
@even-wei even-wei added the linear Created by Linear-GitHub Sync label Jan 3, 2025
@even-wei even-wei changed the title [Bug] Issue with Environment Variable in Recce row_count_diff Check [DRC-1007] [Bug] Issue with Environment Variable in Recce row_count_diff Check Jan 3, 2025
@even-wei even-wei added this to the v.55 milestone Jan 3, 2025
@wcchang1115
Collaborator

wcchang1115 commented Jan 6, 2025

Hello MarcosShalionDE,

Thank you for providing such a detailed description of the problem and the steps to reproduce it.
However, I can't reproduce the same situation with jaffle shop in our Snowflake environment.

Here's the snippet of my configuration:

  • dbt_project.yml
models:
  jaffle_shop:
    staging:
      +materialized: view
      +database: "{{env_var('DBT_ENVIRONMENT_NAME')}}_JAFFLE_SHOP"
    marts:
      +materialized: table
      +database: "{{env_var('DBT_ENVIRONMENT_NAME')}}_JAFFLE_SHOP"

and it resulted in

  • manifest.json
"unrendered_config": {
  "materialized": "view",
  "database": "{{env_var('DBT_ENVIRONMENT_NAME')}}_JAFFLE_SHOP"
},
"relation_name": "DEV_JAFFLE_SHOP.DEV_ANDY.stg_products",

Then we used the SQL select count(*) from DEV_JAFFLE_SHOP.DEV_ANDY.stg_products to do the check.
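The check described above can be approximated outside Recce to inspect what relation_name a node resolves to. This is a rough sketch, assuming the manifest has already been loaded into a dict; row_count_sql and the node id used in the usage note are hypothetical, and this is not Recce's actual implementation.

```python
def row_count_sql(manifest: dict, node_id: str) -> str:
    """Build the row-count query from a node's rendered relation_name."""
    node = manifest["nodes"][node_id]
    relation = node.get("relation_name")
    if not relation:
        raise ValueError(f"{node_id} has no relation_name")
    if "{{" in relation:
        # An unrendered Jinja expression here would mean the database
        # name was never evaluated when the manifest was written.
        raise ValueError(f"{node_id} has an unrendered relation_name: {relation}")
    return f"select count(*) from {relation}"
```

For a node id such as model.jaffle_shop.stg_products, the returned string should match the query quoted above if relation_name was rendered correctly.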

Could you please check whether you get the correct relation_name when you run Recce in your local environment?
Thank you!

@even-wei even-wei modified the milestones: v.55, v.56, v.57 Jan 7, 2025
@MarcosShalionDE
Author

Hello [wcchang1115],

Thank you for the detailed follow-up and for sharing your configuration and process. I reviewed the issue again with my manager, and we discovered something that might be relevant to the problem.

In our setup, the DBT_ENVIRONMENT_NAME environment variable is not only used in the dbt_project.yml file but is also referenced within some models and macros. We suspect that this additional usage might be affecting how Recce processes or resolves the variable during runtime.

Example Macro

Here's an example of a macro that references the DBT_ENVIRONMENT_NAME variable:

{% macro get_environment_prefix() %}
  {{ env_var('DBT_ENVIRONMENT_NAME', 'DEV') }}
{% endmacro %}

Example Model

Here's an example of a model that directly uses DBT_ENVIRONMENT_NAME in its logic:
File: models/example_model.sql

SELECT
    id,
    name,
    category,
    '{{ env_var("DBT_ENVIRONMENT_NAME", "DEV") }}' AS environment_name
FROM
    {{ source('source_schema', 'source_table') }}
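To list every place the variable is referenced across models and macros, a small scan of the project files may help narrow things down. The sketch below is illustrative only: scan_env_var_calls is a hypothetical helper, the file suffixes are an assumption, and the regular expression handles only simple quoted first arguments.

```python
import re
from pathlib import Path

# Captures the variable name and, if present, the comma that
# introduces a default value.
ENV_VAR_CALL = re.compile(r"env_var\(\s*['\"]([^'\"]+)['\"]\s*(,)?")


def scan_env_var_calls(project_dir, suffixes=(".sql", ".yml", ".yaml")):
    """Return (relative_path, line_number, var_name, has_default) tuples."""
    results = []
    for path in sorted(Path(project_dir).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, 1):
            for match in ENV_VAR_CALL.finditer(line):
                rel = str(path.relative_to(project_dir))
                results.append((rel, lineno, match.group(1), match.group(2) is not None))
    return results
```

Entries where has_default is False are the calls that would fail outright when the variable is missing. The scan walks the whole directory, so generated folders such as target are included unless a narrower path is passed.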

Could you help confirm this by repeating your tests but using a model or macro that references DBT_ENVIRONMENT_NAME? This might help us determine whether the issue is specific to its usage in dbt_project.yml or if it impacts other areas where the variable is used.

Your insights and guidance would be greatly appreciated as we continue troubleshooting!

Best regards,
Marcos Shalion
