-
Looks good to me. This would simplify service lister implementations. I can see a streaming API fetching metadata. I am interested in how you will solve asynchronous operations!
-
I just ran into this with the object_store integration stat'ing a ton of leftover files in a deltalake log, as Version is not a property on ADLS, so +1!
-
Thank you @erickguan and @alexwilcoxson-rel for the feedback. I'm preparing an RFC for this change. Proposed: #5313
-
`Metakey` is designed for users to query metadata from storage services. However, it is so complex that users struggle to use it correctly, and developers find it challenging to implement properly.

When I designed `Metakey`, I expected users to use it like the following:

Users can select the `meta` they want, and OpenDAL will determine whether to fetch more as needed.

However, this mechanism has been misused in many ways.

Extra stat call
Users aren't aware that setting `Metakey::Full` results in an additional stat call for each entry, which can make list performance up to 1000 times slower.

Unexpected metakey used
Many users use `Metakey` in this way:

Overlap with `version`
Lister now includes a new argument called `version(bool)`, which determines whether to list versioned objects. This somewhat overlaps with `Metakey::Version`, potentially causing confusion.

Although I still love `Metakey`, it is complex and easily misused in the wrong API, so it should not exist in OpenDAL.

I suggest removing this concept from OpenDAL, with the following changes:
`MetadataCapability`, which is a struct like:

Services can return `MetadataCapability` to indicate how much metadata will be included in the returned data. Users can use this capability to decide whether to perform an extra stat.

`stat` during `list`
List will no longer perform `stat` operations. Users who want this functionality can easily use the `Stream` API to compose them instead.

After those changes, OpenDAL's `Lister` will be much simpler and more predictable.

What do you think? I will propose an RFC if there are no objections.
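
For illustration only, here is a minimal std-only sketch of how the capability check and the explicit stat composition described above might fit together. Everything in it is an assumption rather than OpenDAL's API: the field names, the `Wanted`, `Metadata`, and `Entry` types, and the `needs_stat` and `list_with_metadata` helpers are hypothetical, and a plain iterator stands in for the `Stream` API.

```rust
/// Hypothetical stand-in for the proposed `MetadataCapability`:
/// each flag says whether `list` already returns that field.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct MetadataCapability {
    pub content_length: bool,
    pub last_modified: bool,
    pub version: bool,
}

/// Hypothetical description of the fields a caller needs.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct Wanted {
    pub content_length: bool,
    pub last_modified: bool,
    pub version: bool,
}

/// Simplified metadata; a field is `None` when the service did not return it.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Metadata {
    pub content_length: Option<u64>,
    pub last_modified: Option<u64>, // seconds since the epoch, for illustration
    pub version: Option<String>,
}

/// Simplified list entry.
#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub path: String,
    pub metadata: Metadata,
}

impl MetadataCapability {
    /// Returns true if `list` alone does not provide every wanted field.
    pub fn needs_stat(&self, wanted: Wanted) -> bool {
        (wanted.content_length && !self.content_length)
            || (wanted.last_modified && !self.last_modified)
            || (wanted.version && !self.version)
    }
}

/// Composes an explicit stat onto listed entries, but only when the
/// capability says the list results are missing a wanted field.
pub fn list_with_metadata<I, F>(
    entries: I,
    cap: MetadataCapability,
    wanted: Wanted,
    mut stat: F,
) -> Vec<Entry>
where
    I: IntoIterator<Item = Entry>,
    F: FnMut(&str) -> Metadata,
{
    let needs_stat = cap.needs_stat(wanted);
    entries
        .into_iter()
        .map(|mut entry| {
            if needs_stat {
                // The extra request is now visible at the call site.
                entry.metadata = stat(&entry.path);
            }
            entry
        })
        .collect()
}
```

The point of the sketch is that the cost becomes explicit: the caller checks the capability once, and a stat per entry only happens when the caller opts in.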