Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the StorageIterator type and implement for StorageVec #6803

Closed
wants to merge 5 commits into from

Conversation

bitzoic
Copy link
Member

@bitzoic bitzoic commented Jan 2, 2025

Description

An internal contributor requested we add Iter to the StorageVec type.

Checklist

  • I have linked to any relevant issues.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation where relevant (API docs, the reference, and the Sway book).
  • I have added tests that prove my fix is effective or that my feature works.
  • I have added (or requested a maintainer to add) the necessary Breaking* or New Feature labels where relevant.
  • I have done my best to ensure that my PR adheres to the Fuel Labs Code Review Standards.
  • I have requested a review from the relevant team or maintainers.

@bitzoic bitzoic added the lib: std Standard library label Jan 2, 2025
@bitzoic bitzoic self-assigned this Jan 2, 2025
@bitzoic bitzoic requested review from a team as code owners January 2, 2025 10:40
@bitzoic bitzoic linked an issue Jan 2, 2025 that may be closed by this pull request
@bitzoic bitzoic changed the title Add the StorageIterator type and implement StorageIterator for StorageVec Add the StorageIterator type and implement for StorageVec Jan 2, 2025
@@ -40,3 +40,52 @@ pub trait Iterator {
/// ```
fn next(ref mut self) -> Option<Self::Item>;
}

pub trait StorageIterator {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introducing the StorageIterator brings serious concerns. I understand the necessity, because, currently, if a trait function/method is pure, as the Iterator::next is, then the implementation must also be pure, which storage access cannot satisfy.

We've discussed at the last offsite the side effects, like #[storage(read)], and how to properly bring them into the language and in particular into the type system. So far we do not have a conclusion, and IMO we need one before proceeding with the StorageIterator.

Essentially, the #[storage] side effect, as currently defined, introduces a strong coloring. If a trait can be implemented both with #[storage] and without, we need to have two traits defined, as in the example of the Iterator and the StorageIterator.

This causes all kind of issues down the road. In the concrete example of the Iterator, every common library that consumes Iterator will need to be duplicated for StorageIterator. Once we get closures, we will provide methods like map, filter, etc. But even now, without closures, we can provide methods like sum on the Iterator. All of those implementations will need to be duplicated to support StorageIterator.

Also the for loop. The fact that it works with the StorageIterator is IMO more of a luck and the internal compiler implementation then the design decision. Essentially, unlike Rust, which insist on implementing Iterator, Sway compiler searches for a next method on the type. I've played a bit with providing next methods on a type and ended up in ICEs when compiling for loops. But that's a separate topic. My point is, if we decide to follow Rust here, which I think we should, the StorageIterator will not work in loops anymore, or will need to be supported explicitely.

The question is how to proceed here. Since having iter() on StorageVec is more of a convenience, an easy workaround is to postpone it until we decide on the approach.

Another possibility is to redefine the rules around #[storage]. Means the way we treat side effects in the type system. E.g., even if a trait function/method is not marked as #[storage] an implementation can be. The downside is that a generic function that consumes trait method can be written as pure assuming pure implementation of a trait, but still produce an error that it needs to be marked as #[storage] because of a concrete trait implementation. Note that in the end, after the monomorphisation, we always know if the used trait implementation is pure or not.

This is just a first idea. The general approach requires more thinking. E.g., what about ABIs, etc. As said, we tackled this topic at the offsite, but didn't come to a conclusion.

My fear is, if we just continue with the coloring and duplication of traits, we are opting for an approach that by design leads to difficulties. Think of it in general. A trait with two pure methods. One implementation wants to have one of them non-pure. Another implementation of another type wants to have the other one non-pure. Etc.

@IGI-111 What is your thought on this?

Copy link
Member

@ironcev ironcev Jan 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is one more reason for abandoning the iter() on StorageVec altogether, at least for now until we do not decide on the way we treat side effects in the language.

The assumption is, devs using iter() will want to iterate over the whole vector. Once we get the Storage v3.0 we will have storage specific traits like DeepReadStorage that will provide the most optimized way of accessing the whole stored content. We can even provide traits for optimized read of a continuous part of the content, etc.

My point is, once we have that, the most performant and convenient way to iterate over all the elements will be:

for elem in storage.some_storage_vec.deep_read().iter() { ... }

where the last iter() is just a regular Iterator::iter() over a slice.

Copy link
Member Author

@bitzoic bitzoic Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I don't disagree having two iterator traits is less than ideal, from a user perspective there will be no difference. There would be no required/breaking changes as the interface stays the same when fixed to the ideal solution later on. I would personally rather ship to users and deliver their needs/requests and come to a better solution once storage V3 is here.

This causes all kind of issues down the road. In the concrete example of the Iterator, every common library that consumes Iterator will need to be duplicated for StorageIterator.

As of right now, Iterator is only implemented for a single type, Vec, so there is not a lot of work involved in getting these to be on par. The trait is also only used for heap types which all have an associated storage type already.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There would be no required/breaking changes as the interface stays the same when fixed to the ideal solution later on.

This is IMO a good counterargument. My concerns are about the general design issue that will cause future problems for sure. We don't know how far that future is, but you are right, there should be no breaking change, or least not a serious one, if we introduce the parallel, colored, implementation now. This is under assumption that devs will not start implementing StorageIterator on their own storage types.

However, I would like to propose a solution that mitigates the issue of the parallel trait as well as the performance issues mentioned below. Please see the implementation in #6821 and let me know what you think. If you agree with that proposal I can add more tests, do the cleanup and finalize the PR.

Copy link
Member Author

@bitzoic bitzoic Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm yes I like the solution in #6821 better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps one more thought on #6821, even if we do check for current length, this only potentially hides a logic issue if someone iterates and changes the collection at the same time. IMO, we should stress that in the docs, that changing a collection while iterating is a logic error.

If you agree we can close this PR in favor of #6821 which I will then finalize.

type Item = StorageKey<V>;
#[storage(read)]
fn next(ref mut self) -> Option<Self::Item> {
if self.index >= self.values.len() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seriously increases the gas cost of an iteration, compared to a classical approach with the while loop and an index.

In the classical approach, the len() will be called, only once, before entering the loop. In this case, it is called for every iteration step.

I have strong concerns in providing a convenient way to iterate all elements, that can potentially double the gas cost of the iteration.

A simple solution can be to have a len field in the StorageVecIter and initialize it to the vector length in the iter(). This is of course fragile if the vector gets changed during the iteration. But changing the collection while iterating over it is a logical error anyhow. Unfortunately, unlike Rust, Sway cannot forbid it on the language level.

sway-lib-std/src/storage/storage_vec.sw Show resolved Hide resolved
@ironcev ironcev mentioned this pull request Jan 8, 2025
8 tasks
}

self.index += 1;
let key = self.values.get(self.index - 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add to the performance considerations, every call to get() will again check the length. Which means every next() has two additional boundary checks.

The proposal to mitigate the boundary checks can be seen in #6821.

@bitzoic
Copy link
Member Author

bitzoic commented Jan 9, 2025

Closing in favor of #6821

@bitzoic bitzoic closed this Jan 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lib: std Standard library
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Iter for StorageVec
2 participants