-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add DDL statements to drop branches and tags #23614
base: master
Are you sure you want to change the base?
Add DDL statements to drop branches and tags #23614
Conversation
Codenotify: Notifying subscribers in CODENOTIFY files for diff 003d86a...260b490.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! (docs)
Pull branch, local docs build, looks good. Thanks!
3fe4d56
to
ee54cd3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some high level feedback. Most of the code looks good. I will take a closer look on 2nd pass.
Also, I wanted to bring a thought I had about parser extensions. Since many connectors will not support this branching and tagging, I was thinking that maybe we ought to consider designing a SQL syntax plugin extension interface. Spark allows custom syntax extensions through implementing some set of interfaces or bringing your own parser. The upstream iceberg project now maintains their own spark SQL syntax extensions.
I'm not proposing we need that for this PR, but maybe it's something we should start thinking about if connectors start adding more radically different features that would be best left to some syntax extensions/optional plugins, especially for things outside the SQL specification.
} | ||
|
||
@Override | ||
public void checkCanDropTag(ConnectorTransactionHandle transaction, ConnectorIdentity identity, AccessControlContext context, SchemaTableName tableName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder about the granularity of these methods. In other implementation (e.g. spark?) at what granularity do they enforce the ability to do CRUD operations on tags and branches?
I'm thinking about a few cases
- A group or user(s) can only access a certain set of branches or tags
- A group or user(s) can only create branches starting from a specific branch
- A group or user(s) can create tags
I know we're only implementing DROP
but I want to understand the whole story for access control around branches and tags. Would we ever need to pass the branch/tag to these methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the granularity of access control methods in Spark for CRUD operations on Iceberg tags and branches is limited by Spark's integration with external systems (such as file systems, catalogs, and security frameworks like Apache Ranger).
For example, Ranger policies can define access controls at the table level, which could be extended to manage specific branches or tag-based access.
And like for cloud-based catalogs like AWS Glue, you can control access to Iceberg metadata (branches and tags) via IAM policies that grant or restrict specific operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the info. Would the parameters passed here as context have enough information for us to act at a similar granularity? I don't see anything in the method parameters that contains the branch name which I assume we would need to perform access control at a similar level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to discuss around this, Systems like Ranger can define access controls at the table level, column level. So in this case I think access of drop branch & tags could be table based. As per I can think of branch & tag level policies then has to be maintained on engine side if we introduce branch name / tag name in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tdcmeehan What do you think about access control for branches and tags? Would be based on the parent table itself or based on the tags/branches?
My thinking was that since no policies are enforced based on branch/tags via security frameworks, should this honor the same access policies as table? Or if we don't even need access control exposed for dropTag & dropBranch?
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergAbstractMetadata.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergAbstractMetadata.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergAbstractMetadata.java
Outdated
Show resolved
Hide resolved
presto-spi/src/main/java/com/facebook/presto/spi/connector/ConnectorMetadata.java
Outdated
Show resolved
Hide resolved
presto-spi/src/main/java/com/facebook/presto/spi/connector/ConnectorMetadata.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/test/java/com/facebook/presto/iceberg/IcebergDistributedTestBase.java
Show resolved
Hide resolved
presto-iceberg/src/test/java/com/facebook/presto/iceberg/IcebergDistributedTestBase.java
Show resolved
Hide resolved
8f4d3fa
to
1e47c05
Compare
@ZacBlanco Can you please take another pass? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole change overall looks good to me. Some little nits, and one problem for discussing about the behavior of if exists
on branch
and tag
.
presto-parser/src/main/java/com/facebook/presto/sql/tree/DropBranch.java
Outdated
Show resolved
Hide resolved
presto-parser/src/main/java/com/facebook/presto/sql/tree/DropBranch.java
Outdated
Show resolved
Hide resolved
presto-parser/src/main/java/com/facebook/presto/sql/tree/DropBranch.java
Outdated
Show resolved
Hide resolved
presto-parser/src/main/java/com/facebook/presto/sql/tree/DropTag.java
Outdated
Show resolved
Hide resolved
presto-parser/src/main/java/com/facebook/presto/sql/tree/DropTag.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/execution/DropTagTask.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergAbstractMetadata.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergAbstractMetadata.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/test/java/com/facebook/presto/iceberg/IcebergDistributedTestBase.java
Outdated
Show resolved
Hide resolved
presto-iceberg/src/test/java/com/facebook/presto/iceberg/IcebergDistributedTestBase.java
Outdated
Show resolved
Hide resolved
0abecd5
to
5af0a0d
Compare
@hantangwangd Thanks for you review. I have addressed your comments, please review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix. LGTM!
Hi @tdcmeehan, This is now waiting for the final committer review. Please review whenever you have time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR, minor comments, and rest looks good.
|
||
Optional<MaterializedViewDefinition> optionalMaterializedView = metadata.getMetadataResolver(session).getMaterializedView(tableName); | ||
if (optionalMaterializedView.isPresent()) { | ||
if (!statement.isTableExists()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we throw an error in cases when if exists
is present in the query and it is a materialized view? If drop branch is just not supported for MVs, ideally error should be thrown irrespective of exists check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternate way of handling this can be introducing an enum Type in DropBranch and have only TABLE as the supported type.
return immediateFuture(null); | ||
} | ||
|
||
ConnectorId connectorId = metadata.getCatalogHandle(session, tableName.getCatalogName()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the variable connectorId is not getting used and can be removed. Also I have refactored and extracted this piece of code in MetadataUtil class in this commit - dec2952.
So we can probably do a similar change here as well so that all the Task classes use the same methods?i
return immediateFuture(null); | ||
} | ||
|
||
ConnectorId connectorId = metadata.getCatalogHandle(session, tableName.getCatalogName()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: same as above.
|
||
Optional<MaterializedViewDefinition> optionalMaterializedView = metadata.getMetadataResolver(session).getMaterializedView(tableName); | ||
if (optionalMaterializedView.isPresent()) { | ||
if (!statement.isTableExists()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: same as above. What about the case when if exists
is present and it is MV?
260b490
5af0a0d
to
260b490
Compare
Description
Add DDL statements to drop branches and tags
Motivation and Context
Resolves #22028
Impact
Resolves #22028
SQL support for dropping a branch from a table :
SQL support for dropping a tag from a table :
Test Plan
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.