-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add the StorageIterator
type and implement for StorageVec
#6803
Conversation
StorageIterator
type and implement StorageIterator
for StorageVec
StorageIterator
type and implement for StorageVec
@@ -40,3 +40,52 @@ pub trait Iterator { | |||
/// ``` | |||
fn next(ref mut self) -> Option<Self::Item>; | |||
} | |||
|
|||
pub trait StorageIterator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Introducing the StorageIterator
brings serious concerns. I understand the necessity, because, currently, if a trait function/method is pure, as the Iterator::next
is, then the implementation must also be pure, which storage access cannot satisfy.
We've discussed at the last offsite the side effects, like #[storage(read)]
, and how to properly bring them into the language and in particular into the type system. So far we do not have a conclusion, and IMO we need one before proceeding with the StorageIterator
.
Essentially, the #[storage]
side effect, as currently defined, introduces a strong coloring. If a trait can be implemented both with #[storage]
and without, we need to have two traits defined, as in the example of the Iterator
and the StorageIterator
.
This causes all kind of issues down the road. In the concrete example of the Iterator
, every common library that consumes Iterator
will need to be duplicated for StorageIterator
. Once we get closures, we will provide methods like map
, filter
, etc. But even now, without closures, we can provide methods like sum
on the Iterator
. All of those implementations will need to be duplicated to support StorageIterator
.
Also the for
loop. The fact that it works with the StorageIterator
is IMO more of a luck and the internal compiler implementation then the design decision. Essentially, unlike Rust, which insist on implementing Iterator
, Sway compiler searches for a next
method on the type. I've played a bit with providing next
methods on a type and ended up in ICEs when compiling for
loops. But that's a separate topic. My point is, if we decide to follow Rust here, which I think we should, the StorageIterator
will not work in loops anymore, or will need to be supported explicitely.
The question is how to proceed here. Since having iter()
on StorageVec
is more of a convenience, an easy workaround is to postpone it until we decide on the approach.
Another possibility is to redefine the rules around #[storage]
. Means the way we treat side effects in the type system. E.g., even if a trait function/method is not marked as #[storage]
an implementation can be. The downside is that a generic function that consumes trait method can be written as pure assuming pure implementation of a trait, but still produce an error that it needs to be marked as #[storage]
because of a concrete trait implementation. Note that in the end, after the monomorphisation, we always know if the used trait implementation is pure or not.
This is just a first idea. The general approach requires more thinking. E.g., what about ABIs, etc. As said, we tackled this topic at the offsite, but didn't come to a conclusion.
My fear is, if we just continue with the coloring and duplication of traits, we are opting for an approach that by design leads to difficulties. Think of it in general. A trait with two pure methods. One implementation wants to have one of them non-pure. Another implementation of another type wants to have the other one non-pure. Etc.
@IGI-111 What is your thought on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is one more reason for abandoning the iter()
on StorageVec
altogether, at least for now until we do not decide on the way we treat side effects in the language.
The assumption is, devs using iter()
will want to iterate over the whole vector. Once we get the Storage v3.0 we will have storage specific traits like DeepReadStorage
that will provide the most optimized way of accessing the whole stored content. We can even provide traits for optimized read of a continuous part of the content, etc.
My point is, once we have that, the most performant and convenient way to iterate over all the elements will be:
for elem in storage.some_storage_vec.deep_read().iter() { ... }
where the last iter()
is just a regular Iterator::iter()
over a slice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I don't disagree having two iterator traits is less than ideal, from a user perspective there will be no difference. There would be no required/breaking changes as the interface stays the same when fixed to the ideal solution later on. I would personally rather ship to users and deliver their needs/requests and come to a better solution once storage V3 is here.
This causes all kind of issues down the road. In the concrete example of the Iterator, every common library that consumes Iterator will need to be duplicated for StorageIterator.
As of right now, Iterator
is only implemented for a single type, Vec, so there is not a lot of work involved in getting these to be on par. The trait is also only used for heap types which all have an associated storage type already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There would be no required/breaking changes as the interface stays the same when fixed to the ideal solution later on.
This is IMO a good counterargument. My concerns are about the general design issue that will cause future problems for sure. We don't know how far that future is, but you are right, there should be no breaking change, or least not a serious one, if we introduce the parallel, colored, implementation now. This is under assumption that devs will not start implementing StorageIterator
on their own storage types.
However, I would like to propose a solution that mitigates the issue of the parallel trait as well as the performance issues mentioned below. Please see the implementation in #6821 and let me know what you think. If you agree with that proposal I can add more tests, do the cleanup and finalize the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm yes I like the solution in #6821 better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps one more thought on #6821, even if we do check for current length, this only potentially hides a logic issue if someone iterates and changes the collection at the same time. IMO, we should stress that in the docs, that changing a collection while iterating is a logic error.
If you agree we can close this PR in favor of #6821 which I will then finalize.
type Item = StorageKey<V>; | ||
#[storage(read)] | ||
fn next(ref mut self) -> Option<Self::Item> { | ||
if self.index >= self.values.len() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seriously increases the gas cost of an iteration, compared to a classical approach with the while
loop and an index.
In the classical approach, the len()
will be called, only once, before entering the loop. In this case, it is called for every iteration step.
I have strong concerns in providing a convenient way to iterate all elements, that can potentially double the gas cost of the iteration.
A simple solution can be to have a len
field in the StorageVecIter
and initialize it to the vector length in the iter()
. This is of course fragile if the vector gets changed during the iteration. But changing the collection while iterating over it is a logical error anyhow. Unfortunately, unlike Rust, Sway cannot forbid it on the language level.
} | ||
|
||
self.index += 1; | ||
let key = self.values.get(self.index - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To add to the performance considerations, every call to get()
will again check the length. Which means every next()
has two additional boundary checks.
The proposal to mitigate the boundary checks can be seen in #6821.
Closing in favor of #6821 |
Description
An internal contributor requested we add
Iter
to theStorageVec
type.Checklist
Breaking*
orNew Feature
labels where relevant.