Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(model): fixes DashboardContainsDashboard relationship in DashboardInfo aspect #12433

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

sgomezvillamor
Copy link
Contributor

@sgomezvillamor sgomezvillamor commented Jan 22, 2025

This fixes:

  • what is most likely a copy-paste issue in the DashboardContainsDashboard relationship model for DashboardInfo aspect model
  • a wrong validation in the DashboardPatchBuilder#add_chart_edge
  • adds the missing DashboardPatchBuilder#add_dashboard

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

Copy link

codecov bot commented Jan 22, 2025

Copy link
Collaborator

@mayurinehate mayurinehate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable fix.
Hoping that re-ingestion of this aspect will automatically update graph index. Wondering if you've tested this.

@sgomezvillamor
Copy link
Contributor Author

Hoping that re-ingestion of this aspect will automatically update graph index. Wondering if you've tested this.

Not tested and not sure how to test

Copy link
Collaborator

@hsheth2 hsheth2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a note to the updating-datahub.md doc to call out that users of Sigma and PowerBI Apps ingestion may need to run a targeted reindex if they need updated metadata on these edges

I doubt there's many cases of that, but would be good to mention anyways

Copy link
Collaborator

@hsheth2 hsheth2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Might be worth getting a quick look from @RyanHolstien

docs/how/updating-datahub.md Outdated Show resolved Hide resolved
Copy link
Collaborator

@RyanHolstien RyanHolstien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think typically we add an upgrade job that runs once to reindex these relationships when we update them, would also like to see some Java side unit tests

@@ -52,6 +55,22 @@ public DashboardInfoPatchBuilder removeDatasetEdge(@Nonnull DatasetUrn urn) {
return this;
}

// Simplified with just Urn
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add tests for these

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found these tests for DashboardInfoPatchBuilder

I will add it there, even if they are all @Ignore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in commit 49e0dc2

@@ -74,6 +75,12 @@ public JsonNode transformFields(JsonNode baseNode) {
DATASET_EDGES_FIELD_NAME,
Collections.singletonList(DESTINATION_URN_FIELD_NAME));

transformedNode =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there isn't already a test for this let's create one, or if there is add in mapping this field

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had already added a test for DashboardInfoTemplate in https://github.com/datahub-project/datahub/pull/12433/files#diff-b6808c495535fbf32144cbc4e6f16eb762b55c377c63ef929758d21760c28437

The test operates on applyPatch method, which goes through both transformFields and rebaseFields methods, as can be seen in

default T applyPatch(RecordTemplate recordTemplate, JsonPatch jsonPatch)
throws JsonProcessingException {
TemplateUtil.validatePatch(jsonPatch);
JsonNode transformed = populateTopLevelKeys(preprocessTemplate(recordTemplate), jsonPatch);
try {
// Hack in a more efficient patcher. Even with the serialization overhead 140% faster
JsonObject patched =
jsonPatch.apply(
Json.createReader(new StringReader(OBJECT_MAPPER.writeValueAsString(transformed)))
.readObject());
JsonNode postProcessed = rebaseFields(OBJECT_MAPPER.readTree(patched.toString()));
return RecordUtils.toRecordTemplate(getTemplateType(), postProcessed.toString());
} catch (JsonProcessingException e) {
throw new RuntimeException(
String.format(
"Error performing JSON PATCH on aspect %s. Patch: %s Target: %s",
recordTemplate.schema().getName(), jsonPatch, transformed.toString()),
e);
}
}
/**
* Returns a json representation of the template, modified for template based operations to be
* compatible with patch semantics.
*
* @param recordTemplate template to be transformed into json
* @return a {@link JsonNode} representation of the template
* @throws JsonProcessingException if there is an issue converting the input to JSON
*/
default JsonNode preprocessTemplate(RecordTemplate recordTemplate)
throws JsonProcessingException {
T subtype = getSubtype(recordTemplate);
JsonNode baseNode = OBJECT_MAPPER.readTree(RecordUtils.toJsonString(subtype));
return transformFields(baseNode);
}

@sgomezvillamor
Copy link
Contributor Author

I think typically we add an upgrade job that runs once to reindex these relationships when we update them, would also like to see some Java side unit tests

@RyanHolstien I have added the requested reindex job for the upgrade in commit b582b47

Considering we are fixing the DashboardContainsDashboard relationship in DashboardInfo aspect https://github.com/datahub-project/datahub/pull/12433/files#diff-6ae54ee7e3babcd8ac0b99538e8de6725f9fd82c0a119e517c20b2a23575e303 , can you confirm that DashboardInfo aspect reindexation will fix the relationship? or, how could I test and check this?

Copy link
Collaborator

@david-leifker david-leifker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants