Commit
Merge branch 'master' into no-rows-updated-exception
david-leifker authored Jan 31, 2025
2 parents 56e481c + 1e0f993 commit f5545e0
Showing 46 changed files with 829 additions and 142 deletions.
10 changes: 5 additions & 5 deletions .github/workflows/docker-unified.yml
@@ -1039,12 +1039,12 @@ jobs:
cypress_matrix=$(printf "{\"test_strategy\":\"cypress\",\"batch\":\"0\",\"batch_count\":\"$cypress_batch_count\"}"; for ((i=1;i<cypress_batch_count;i++)); do printf ",{\"test_strategy\":\"cypress\", \"batch_count\":\"$cypress_batch_count\",\"batch\":\"%d\"}" $i; done)
includes=''
-          if [[ "${{ needs.setup.outputs.frontend_only }}" == 'true' ]]; then
-            includes=$cypress_matrix
-          elif [ "${{ needs.setup.outputs.ingestion_only }}" == 'true' ]; then
-            includes=$python_matrix
-          elif [[ "${{ needs.setup.outputs.backend_change }}" == 'true' || "${{ needs.setup.outputs.smoke_test_change }}" == 'true' ]]; then
+          if [[ "${{ needs.setup.outputs.backend_change }}" == 'true' || "${{ needs.setup.outputs.smoke_test_change }}" == 'true' || "${{ needs.setup.outputs.publish }}" == 'true' ]]; then
            includes="$python_matrix,$cypress_matrix"
+          elif [[ "${{ needs.setup.outputs.frontend_only }}" == 'true' ]]; then
+            includes="$cypress_matrix"
+          elif [[ "${{ needs.setup.outputs.ingestion_only }}" == 'true' ]]; then
+            includes="$python_matrix"
fi
echo "matrix={\"include\":[$includes] }" >> "$GITHUB_OUTPUT"
2 changes: 1 addition & 1 deletion docker/datahub-frontend/Dockerfile
@@ -15,7 +15,7 @@ RUN if [ "${ALPINE_REPO_URL}" != "http://dl-cdn.alpinelinux.org/alpine" ] ; then

# Upgrade Alpine and base packages
# PFP-260: Upgrade Sqlite to >=3.28.0-r0 to fix https://security.snyk.io/vuln/SNYK-ALPINE39-SQLITE-449762
-ENV JMX_VERSION=0.18.0
+ENV JMX_VERSION=0.20.0
RUN apk --no-cache --update-cache --available upgrade \
&& apk --no-cache add curl sqlite libc6-compat snappy \
&& apk --no-cache add openjdk17-jre-headless --repository=${ALPINE_REPO_URL}/edge/community \
2 changes: 1 addition & 1 deletion docker/datahub-gms/Dockerfile
@@ -25,7 +25,7 @@ RUN go install github.com/jwilder/dockerize@$DOCKERIZE_VERSION

FROM alpine:3.21 AS base

-ENV JMX_VERSION=0.18.0
+ENV JMX_VERSION=0.20.0

# Re-declaring args from above to make them available in this stage (will inherit default values)
ARG ALPINE_REPO_URL
2 changes: 1 addition & 1 deletion docker/datahub-mae-consumer/Dockerfile
@@ -34,7 +34,7 @@ ARG MAVEN_CENTRAL_REPO_URL
RUN if [ "${ALPINE_REPO_URL}" != "http://dl-cdn.alpinelinux.org/alpine" ] ; then sed -i "s#http.*://dl-cdn.alpinelinux.org/alpine#${ALPINE_REPO_URL}#g" /etc/apk/repositories ; fi

# Upgrade Alpine and base packages
-ENV JMX_VERSION=0.18.0
+ENV JMX_VERSION=0.20.0
# PFP-260: Upgrade Sqlite to >=3.28.0-r0 to fix https://security.snyk.io/vuln/SNYK-ALPINE39-SQLITE-449762
RUN apk --no-cache --update-cache --available upgrade \
&& apk --no-cache add curl bash coreutils sqlite libc6-compat snappy \
2 changes: 1 addition & 1 deletion docker/datahub-mce-consumer/Dockerfile
@@ -34,7 +34,7 @@ ARG MAVEN_CENTRAL_REPO_URL
RUN if [ "${ALPINE_REPO_URL}" != "http://dl-cdn.alpinelinux.org/alpine" ] ; then sed -i "s#http.*://dl-cdn.alpinelinux.org/alpine#${ALPINE_REPO_URL}#g" /etc/apk/repositories ; fi

# Upgrade Alpine and base packages
-ENV JMX_VERSION=0.18.0
+ENV JMX_VERSION=0.20.0
# PFP-260: Upgrade Sqlite to >=3.28.0-r0 to fix https://security.snyk.io/vuln/SNYK-ALPINE39-SQLITE-449762
RUN apk --no-cache --update-cache --available upgrade \
&& apk --no-cache add curl bash sqlite libc6-compat snappy \
2 changes: 1 addition & 1 deletion docker/datahub-upgrade/Dockerfile
@@ -33,7 +33,7 @@ ARG MAVEN_CENTRAL_REPO_URL
# Optionally set corporate mirror for apk
RUN if [ "${ALPINE_REPO_URL}" != "http://dl-cdn.alpinelinux.org/alpine" ] ; then sed -i "s#http.*://dl-cdn.alpinelinux.org/alpine#${ALPINE_REPO_URL}#g" /etc/apk/repositories ; fi

-ENV JMX_VERSION=0.18.0
+ENV JMX_VERSION=0.20.0

# Upgrade Alpine and base packages
# PFP-260: Upgrade Sqlite to >=3.28.0-r0 to fix https://security.snyk.io/vuln/SNYK-ALPINE39-SQLITE-449762
179 changes: 179 additions & 0 deletions docs/actions/events/entity-change-event.md
@@ -417,3 +417,182 @@ This event is emitted when a new entity has been hard-deleted on DataHub.
}
}
```

## Action Request Events (Proposals)

Action Request events represent proposals for changes to entities that may require approval before being applied. These events have entityType "actionRequest" and use the `LIFECYCLE` category with `CREATE` operation.

### Domain Association Request Event

This event is emitted when a domain association is proposed for an entity on DataHub.

#### Sample Event
```json
{
  "entityType": "actionRequest",
  "entityUrn": "urn:li:actionRequest:abc-123",
  "category": "LIFECYCLE",
  "operation": "CREATE",
  "auditStamp": {
    "actor": "urn:li:corpuser:jdoe",
    "time": 1234567890
  },
  "version": 0,
  "parameters": {
    "domains": "[\"urn:li:domain:marketing\"]",
    "actionRequestType": "DOMAIN_ASSOCIATION",
    "resourceUrn": "urn:li:dataset:(urn:li:dataPlatform:snowflake,example.table,PROD)",
    "resourceType": "dataset"
  }
}
```
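Note that values inside `parameters` (such as `domains` above) are themselves JSON-encoded strings, so a consumer must decode them a second time. A minimal sketch of that double decoding, using the sample event above (the helper name is illustrative, not part of the DataHub API):

```python
import json

def parse_action_request(event: dict) -> dict:
    """Return the event's parameters with JSON-encoded string values decoded."""
    params = dict(event.get("parameters", {}))
    # These keys carry JSON serialized inside a string and need a second decode.
    for key in ("domains", "owners", "structuredProperties"):
        if key in params:
            params[key] = json.loads(params[key])
    return params

sample = {
    "entityType": "actionRequest",
    "category": "LIFECYCLE",
    "operation": "CREATE",
    "parameters": {
        "domains": "[\"urn:li:domain:marketing\"]",
        "actionRequestType": "DOMAIN_ASSOCIATION",
        "resourceUrn": "urn:li:dataset:(urn:li:dataPlatform:snowflake,example.table,PROD)",
        "resourceType": "dataset",
    },
}

parsed = parse_action_request(sample)
print(parsed["domains"])  # ['urn:li:domain:marketing']
```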

### Owner Association Request Event

This event is emitted when an owner association is proposed for an entity on DataHub.

#### Sample Event
```json
{
  "entityType": "actionRequest",
  "entityUrn": "urn:li:actionRequest:def-456",
  "category": "LIFECYCLE",
  "operation": "CREATE",
  "auditStamp": {
    "actor": "urn:li:corpuser:jdoe",
    "time": 1234567890
  },
  "version": 0,
  "parameters": {
    "owners": "[{\"type\":\"TECHNICAL_OWNER\",\"typeUrn\":\"urn:li:ownershipType:technical_owner\",\"ownerUrn\":\"urn:li:corpuser:jdoe\"}]",
    "actionRequestType": "OWNER_ASSOCIATION",
    "resourceUrn": "urn:li:dataset:(urn:li:dataPlatform:snowflake,example.table,PROD)",
    "resourceType": "dataset"
  }
}
```

### Tag Association Request Event

This event is emitted when a tag association is proposed for an entity on DataHub.

#### Sample Event
```json
{
  "entityType": "actionRequest",
  "entityUrn": "urn:li:actionRequest:ghi-789",
  "category": "LIFECYCLE",
  "operation": "CREATE",
  "auditStamp": {
    "actor": "urn:li:corpuser:jdoe",
    "time": 1234567890
  },
  "version": 0,
  "parameters": {
    "actionRequestType": "TAG_ASSOCIATION",
    "resourceUrn": "urn:li:dataset:(urn:li:dataPlatform:snowflake,example.table,PROD)",
    "tagUrn": "urn:li:tag:pii",
    "resourceType": "dataset"
  }
}
```

### Create Glossary Term Request Event

This event is emitted when a new glossary term creation is proposed on DataHub.

#### Sample Event
```json
{
  "entityType": "actionRequest",
  "entityUrn": "urn:li:actionRequest:jkl-101",
  "category": "LIFECYCLE",
  "operation": "CREATE",
  "auditStamp": {
    "actor": "urn:li:corpuser:jdoe",
    "time": 1234567890
  },
  "version": 0,
  "parameters": {
    "parentNodeUrn": "urn:li:glossaryNode:123",
    "glossaryEntityName": "ExampleTerm",
    "actionRequestType": "CREATE_GLOSSARY_TERM",
    "resourceType": "glossaryTerm"
  }
}
```

### Term Association Request Event

This event is emitted when a glossary term association is proposed for an entity on DataHub.

#### Sample Event
```json
{
  "entityType": "actionRequest",
  "entityUrn": "urn:li:actionRequest:mno-102",
  "category": "LIFECYCLE",
  "operation": "CREATE",
  "auditStamp": {
    "actor": "urn:li:corpuser:jdoe",
    "time": 1234567890
  },
  "version": 0,
  "parameters": {
    "glossaryTermUrn": "urn:li:glossaryTerm:123",
    "actionRequestType": "TERM_ASSOCIATION",
    "resourceUrn": "urn:li:dataset:(urn:li:dataPlatform:snowflake,example.table,PROD)",
    "resourceType": "dataset"
  }
}
```

### Update Description Request Event

This event is emitted when an update to an entity's description is proposed on DataHub.

#### Sample Event
```json
{
  "entityType": "actionRequest",
  "entityUrn": "urn:li:actionRequest:pqr-103",
  "category": "LIFECYCLE",
  "operation": "CREATE",
  "auditStamp": {
    "actor": "urn:li:corpuser:jdoe",
    "time": 1234567890
  },
  "version": 0,
  "parameters": {
    "description": "Example description for a dataset.",
    "actionRequestType": "UPDATE_DESCRIPTION",
    "resourceUrn": "urn:li:dataset:(urn:li:dataPlatform:snowflake,example.table,PROD)",
    "resourceType": "dataset"
  }
}
```

### Structured Property Association Request Event

This event is emitted when a structured property association is proposed for an entity on DataHub.

#### Sample Event
```json
{
  "entityType": "actionRequest",
  "entityUrn": "urn:li:actionRequest:stu-104",
  "category": "LIFECYCLE",
  "operation": "CREATE",
  "auditStamp": {
    "actor": "urn:li:corpuser:jdoe",
    "time": 1234567890
  },
  "version": 0,
  "parameters": {
    "structuredProperties": "[{\"propertyUrn\":\"urn:li:structuredProperty:123\",\"values\":[\"value1\",\"value2\"]}]",
    "actionRequestType": "STRUCTURED_PROPERTY_ASSOCIATION",
    "resourceUrn": "urn:li:dataset:(urn:li:dataPlatform:snowflake,example.table,PROD)",
    "resourceType": "dataset"
  }
}
```
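Because every proposal event shares the same category (`LIFECYCLE`) and operation (`CREATE`), consumers typically branch on the `actionRequestType` parameter. A hypothetical dispatcher sketch covering the seven request types documented above (the function and summary strings are illustrative, not part of the DataHub API):

```python
def describe_proposal(event: dict) -> str:
    """Map an actionRequest event to a short human-readable summary."""
    params = event.get("parameters", {})
    kind = params.get("actionRequestType", "UNKNOWN")
    # CREATE_GLOSSARY_TERM events carry glossaryEntityName instead of resourceUrn.
    target = params.get("resourceUrn") or params.get("glossaryEntityName", "?")
    summaries = {
        "DOMAIN_ASSOCIATION": f"domain proposal for {target}",
        "OWNER_ASSOCIATION": f"ownership proposal for {target}",
        "TAG_ASSOCIATION": f"tag proposal for {target}",
        "CREATE_GLOSSARY_TERM": f"new glossary term proposal: {target}",
        "TERM_ASSOCIATION": f"glossary term proposal for {target}",
        "UPDATE_DESCRIPTION": f"description update proposal for {target}",
        "STRUCTURED_PROPERTY_ASSOCIATION": f"structured property proposal for {target}",
    }
    return summaries.get(kind, f"unhandled proposal type {kind}")

event = {
    "entityType": "actionRequest",
    "category": "LIFECYCLE",
    "operation": "CREATE",
    "parameters": {
        "actionRequestType": "TAG_ASSOCIATION",
        "resourceUrn": "urn:li:dataset:(urn:li:dataPlatform:snowflake,example.table,PROD)",
    },
}
print(describe_proposal(event))
```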
@@ -102,9 +102,9 @@ In order to update the executor, ie. to deploy a new container version, you'll n

### Deploying on Kubernetes

-The Helm chart [datahub-executor-worker](https://github.com/acryldata/datahub-executor-helm/tree/main/charts/datahub-executor-worker) can be used to deploy on a Kubernetes cluster. These instructions also apply for deploying to Amazon Elastic Kubernetes Service (EKS) or Google Kubernetes Engine (GKE).
+The Helm chart [datahub-executor-worker](https://executor-helm.acryl.io/index.yaml) can be used to deploy on a Kubernetes cluster. These instructions also apply for deploying to Amazon Elastic Kubernetes Service (EKS) or Google Kubernetes Engine (GKE).

-1. **Download Chart**: Download the [latest release](https://github.com/acryldata/datahub-executor-helm/releases) of the chart
+1. **Download Chart**: Download the [latest release](https://executor-helm.acryl.io/index.yaml) of the chart
2. **Unpack the release archive**:
```
tar zxvf v0.0.4.tar.gz --strip-components=2
```
1 change: 0 additions & 1 deletion metadata-ingestion/docs/sources/mode/mode.md

This file was deleted.

21 changes: 21 additions & 0 deletions metadata-ingestion/docs/sources/mode/mode_pre.md
@@ -0,0 +1,21 @@
### Authentication

See Mode's [Authentication documentation](https://mode.com/developer/api-reference/authentication/) on how to generate an API `token` and `password`.

Mode does not support true "service accounts", so you must use a user account for authentication.
Depending on your requirements, you may want to create a dedicated user account for usage with DataHub ingestion.

### Permissions

DataHub ingestion requires the user to have the following permissions:

- Have at least the "Member" role.
- For each Connection, have at least "View" access.

To check Connection permissions, navigate to "Workspace Settings" → "Manage Connections". For each connection in the list, click on the connection → "Permissions". If the default workspace access is "View" or "Query", you're all set for that connection. If it's "Restricted", you'll need to individually grant your ingestion user View access.

- For each Space, have at least "View" access.

To check Collection permissions, navigate to the "My Collections" page as an Admin user. For each collection with Workspace Access set to "Restricted" access, the ingestion user must be manually granted the "Viewer" access in the "Manage Access" dialog. Collections with "All Members can View/Edit" do not need to be manually granted access.

Note that if the ingestion user has "Admin" access, then it will automatically have "View" access for all connections and collections.
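Mode's API authenticates with HTTP Basic auth, using the API token as the username and the API password as the secret. A quick way to sanity-check credentials before configuring ingestion (a sketch; the `/api/account` endpoint comes from Mode's public API documentation, so verify it against your workspace):

```python
import base64

def mode_auth_header(token: str, password: str) -> dict:
    """Build the HTTP Basic auth header Mode expects (token as the username)."""
    creds = base64.b64encode(f"{token}:{password}".encode()).decode()
    return {"Authorization": f"Basic {creds}"}

# To verify the credentials for real (requires network access):
# import urllib.request
# req = urllib.request.Request(
#     "https://app.mode.com/api/account",
#     headers=mode_auth_header("my-token", "my-password"),
# )
# urllib.request.urlopen(req)  # an HTTP 401 here means bad credentials
```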
2 changes: 1 addition & 1 deletion metadata-ingestion/examples/ai/dh_ai_client.py
@@ -32,7 +32,7 @@ class DatahubAIClient:

def __init__(
self,
-        token: str,
+        token: Optional[str] = None,
server_url: str = "http://localhost:8080",
platform: str = "mlflow",
) -> None:
2 changes: 1 addition & 1 deletion metadata-ingestion/examples/ai/dh_ai_client_sample.py
@@ -8,7 +8,7 @@
if __name__ == "__main__":
# Example usage
parser = argparse.ArgumentParser()
-    parser.add_argument("--token", required=True, help="DataHub access token")
+    parser.add_argument("--token", required=False, help="DataHub access token")
parser.add_argument(
"--server_url",
required=False,