Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: split generic workflow from OS specific workflow #996

Closed
wants to merge 25 commits into from

Conversation

mayeut
Copy link
Member

@mayeut mayeut commented Jan 22, 2022

This PR started as an attempt to close #560 in the backlog.
As mentioned in the issue, there's probably little benefit in doing just this so I went further than that just to get some ideas where we could go. This is just a draft to see if any of those ideas are worth looking into, I don't expect this PR to be merged ever.

The end game idea was to split generic stuff from OS specific stuff which could help for #78, #676, #966, #971
To help reviewing/checking if anything is of interest:

  • Virtual Environments are now handled as context managers for either build or tests
  • There's now a PlatformBackend abstract class which should be implemented for all that's related to call/shell/file operations. This allows to have Windows/macOS to implement a "native/direct" one, linux the DockerContainer one.
  • There's now a BuilderBackend class which handles the default build workflow thus avoiding code duplication for generic stuff. It can be customized to handle OS specific workarounds, specific implementations.

As a whole:
Pros:

  • Better workflow consistency between all OS (and we could enforce the same constraints on linux as on windows/macOS easily for even more consistency) with less code duplication.
  • Allows easier extendability through abstract class (one might even suggest a plug-in system as an end goal)

Cons:

  • Higher complexity
  • Harder to see the order of specific patch/workarounds/specifics at OS level

Copy link
Contributor

@henryiii henryiii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a quick first pass. In general, I like this direction as long as it's easy to unit test (mocking and such should be easy?). I'll need to do a second pass looking not at the diff, but at the full code, to see if it is still reasonable to track what is happening as a reader. Also to see why there's a backend and a platform backend.

I'm not sure if @joerick will like it, given https://cibuildwheel.readthedocs.io/en/stable/contributing/#design-goals, but since it's this far, I think a "final version" should be compared with the current version (probably not through a diff) to see if it looks better, to see if it looks easy to make large specific changes without breaking the other platforms, etc. (PS: to be clear, I'm not claiming to like it yet either, just positive about the direction but will need further review later.)

from .virtualenv import VirtualEnv, VirtualEnvBase


class BuilderBackend:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be a lot simpler as a dataclass.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will need some time to think about what it would look like going this way.

return wheels


def test_one(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Starting with test_ makes this look like a pytest test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to do_test_wheel

from cibuildwheel.util import call, ensure_virtualenv, shell


class PlatformBackend:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be an ABC? (either through the ABCMeta metaclass or inheriting from abc.ABC)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@henryiii
Copy link
Contributor

PS: Also, regardless of the final comparison, there are some good changes here we should pull out, like more Final usage.

@henryiii
Copy link
Contributor

By the way, I've thought about this too, and what I likely would have attempted would have been to implement smaller methods and a Protocol that would verify that a backend had the required components. Protocols are generally nicer than ABC's, though it's easier to share code with an ABC (which is not always great for reading the code, which is part of why Protocols are nice...).

@joerick
Copy link
Contributor

joerick commented Jan 27, 2022

just a quick note - i'm very curious to review this but I won't have time for a few days! Apologies

@joerick
Copy link
Contributor

joerick commented Feb 4, 2022

Thanks for this @mayeut, it's interesting! Reflections on a quick 30-min skim through...

  • Reading the code itself (outside of the diff), I get a bit lost. For example, it's not very clear to me where the actual building takes place. The existing codebase, you can go to the build function of the platform you're interested in and read it from top-to-bottom to understand what happens. I like that - especially in the context of building things, a 'build script' style of code is more suitable I think. In this PR, we lose this straight-forwardness. Of course, the existing codebase pays for that in some code duplication, but I think readability is important enough to warrant this.
  • I'm surprised how 'big' BuilderBackend is. It's got a lot of functionality in there, so perhaps suffers from the age-old problems of inheritance - where execution flow gets pretty confusing. Perhaps there would be a different version of this which tried to make the platform-specific bits as a Protocol, with as small a surface-area as possible. This might be nicer. But I'm not sure how small that surface area could get...
  • Maybe it's possible to retain the build-script style, but with state-less (or minimally stated) utilities for the platform-specific bits via a Protocol? I've thought about this a little... possibly yes, but I guess we'd have to have loads of if platform == 'windows': and if platform == 'macos': for the various workarounds. I'm not sure if this would be more or less readable... potentially it could be more of a mess than the existing code.
  • I know that Lines of Code is not a great benchmark for a number of reasons, but I am surprised that this increases the count, given the duplication that was eliminated. I suppose we pay for it in genericness.

@mayeut
Copy link
Member Author

mayeut commented Feb 5, 2022

@joerick, @henryiii, thanks for taking the time to review this.

I'll try to address the comments but not sure when I'll be able to address all of them.
In the meantime, if there are bits that you think should be extracted from this on their own, please let me know (as @henryiii did for Final usage).

@mayeut
Copy link
Member Author

mayeut commented Feb 5, 2022

I rebased on main. Only the last 3 commits are addressing some of the concerns raised during review.

@mayeut
Copy link
Member Author

mayeut commented Feb 5, 2022

The existing codebase, you can go to the build function of the platform you're interested in and read it from top-to-bottom to understand what happens. I like that - especially in the context of building things, a 'build script' style of code is more suitable I think. In this PR, we lose this straight-forwardness. Of course, the existing codebase pays for that in some code duplication, but I think readability is important enough to warrant this.

@joerick, I probably went too far in BuilderBackend: as I was doing _setup_build_env, I thought it might be good for other steps to be functions as well. I removed functions for those other steps and expanded them in BuilderBackend::run. It still does not answer your concern but is probably better now than before in this regard.

@joerick
Copy link
Contributor

joerick commented Apr 29, 2022

There are some good ideas here, I think it was worth attempting for sure, but ultimately I don't think that the extra layers of abstraction give us enough upside in maintainability to outweigh the downsides of indirectness.

I'm cleaning up PRs, so I'll close this one for now. Ofc feel free to reopen if you'd like to continue discussion.

@joerick joerick closed this Apr 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Factor out run_on_each
3 participants