Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Add type annotations to base spec #1676

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

edgarrmondragon
Copy link
Contributor

@edgarrmondragon edgarrmondragon commented Sep 18, 2024

@edgarrmondragon edgarrmondragon changed the title Add type annotations to base spec WIP: Add type annotations to base spec Sep 18, 2024
@martindurant
Copy link
Member

Thanks for giving this a start. Perhaps you already appreciate that this process will not be easy for the package, and spec might have been the hardest place to start (since all these methods are overridden in various duck ways in the backend implementations).

@scholtalbers scholtalbers mentioned this pull request Nov 7, 2024
@mon
Copy link

mon commented Nov 27, 2024

Preface: Just a random user.

This is a really good start (I love me some types), though I feel like it could do with some minor improvements

  • a few object annotations that could probably be narrowed to a Protocol
  • Inconsistent typing for inferred kwargs (explicitly specified recursive: bool = False vs implied on_error="omit")
  • open could do with an overload to differentiate binary and text streams

...however, I feel like it could just devolve into a bunch of minor nitpicks, whereas this already provides lots of value as-is.

What's the strategy towards a merge here? Grab a bunch of popular downstream consumers and ensure these don't cause conflicts with any overloaded duck typing happening there?

@martindurant
Copy link
Member

@mon : as with my initial comments, I am happy to see typing leak slowly into fsspec, but the duck-type design of the (old!) code means that it will be hard to have it everywhere and certainly very hard to get mypy happy throughout the codebase, including downstream implementations.

It gets complicated fast, because of issues like people demanding support for Path instances where originally only strings were allowed. The end-user API is based on convenience, but library developers using fsspec want strict definitions - requirements that are hard to simultaneously meet.

What's the strategy towards a merge here? Grab a bunch of popular downstream consumers and ensure these don't cause conflicts with any overloaded duck typing happening there?

It shouldn't be possible to break code with types alone, right?
However, I need people more experiences with type-ifying to review the code. It is not for my convenience, but for others, so it will not be high on my priority list.

@mon
Copy link

mon commented Dec 6, 2024

It shouldn't be possible to break code with types alone, right?

At runtime, there will be no effect, unless you type annotate forward/circular references. But these are found as soon as the module is imported, so shouldn't ever be a problem for you, only the person adding the types (and easily fixed by gating the import behind an if TYPE_CHECKING branch).

My worry was there might just be some downstream projects that, for example, extend the valid argument types for some functions, and making them specifically typed (instead of the default Unknown) will cause type errors from tools like Pylance. Personally I don't think that's an issue, and imperfect types is better than none.

@@ -381,7 +405,7 @@ def _ls_from_cache(self, path):
except KeyError:
pass

def walk(self, path, maxdepth=None, topdown=True, on_error="omit", **kwargs):
def walk(self, path: str, maxdepth=None, topdown=True, on_error="omit", **kwargs):
Copy link
Contributor

@Skylion007 Skylion007 Jan 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: on_error has only a few valid values, right? We should make it a Union of Type Literals (should probably consider setting that as a TypeVar).

return f"<File-like object {type(self.fs).__name__}, {self.path}>"

__repr__ = __str__

def __enter__(self):
def __enter__(self) -> "AbstractBufferedFile":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May want to consider typing_extensions.Self, if it's not already a project dependency, we should add as it's one of the top 10 most popular pypi downloads anyway.

@Skylion007
Copy link
Contributor

Nit: we may want to enable the relevent type checking rules in RUFF at some point.

@@ -606,7 +637,7 @@ def glob(self, path, maxdepth=None, **kwargs):
depth_double_stars = path[idx_double_stars:].count("/") + 1
depth = depth - depth_double_stars + maxdepth
else:
depth = None
depth = None # type: ignore[assignment]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type depth above as an Optional and you won't need this type ignore.

@Skylion007
Copy link
Contributor

@mon : as with my initial comments, I am happy to see typing leak slowly into fsspec, but the duck-type design of the (old!) code means that it will be hard to have it everywhere and certainly very hard to get mypy happy throughout the codebase, including downstream implementations.

It gets complicated fast, because of issues like people demanding support for Path instances where originally only strings were allowed. The end-user API is based on convenience, but library developers using fsspec want strict definitions - requirements that are hard to simultaneously meet.

What's the strategy towards a merge here? Grab a bunch of popular downstream consumers and ensure these don't cause conflicts with any overloaded duck typing happening there?

It shouldn't be possible to break code with types alone, right? However, I need people more experiences with type-ifying to review the code. It is not for my convenience, but for others, so it will not be high on my priority list.

In Python 3.9 or newer, it should not.

@Skylion007
Copy link
Contributor

@martindurant I have a lot of typing experience and am happy to look at this PR if you would like.

@martindurant
Copy link
Member

@Skylion007 yes please, much appreciated!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants