Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[experimental] adds pants_oxidized_experimental target, creating a standalone binary distribution for Pants #16484

Merged
merged 52 commits into from
Aug 29, 2022

Conversation

chrisjrn
Copy link
Contributor

@chrisjrn chrisjrn commented Aug 11, 2022

This introduces a pyoxidizer_binary target for Pants itself, allowing Pants to be built into a binary executable for distribution.

Currently the distribution is dynamically linked alongside the native engine, and a few other dependencies; I have not yet investigated static linking. When we investigate distribution of the oxidized binary, we'll need to see if we can make the pants bootstrap script use these binaries and still have them execute with the various libraries dynamically linked.

Addresses #7369


Updates since #16414:

  • Support for plugin resolution and loading
  • Support for loading in-repo custom backend packages
  • Re-instates the read_resource API that broke the release process (Prep for 2.14.0.dev5 #16434). This will need to be fixed before merging.

Christopher Neugebauer added 30 commits July 29, 2022 10:22
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
[ci skip-rust]

(cherry picked from commit 4c6c064)

[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
…nts distribution

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
…nts distribution

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
…RSION` resource is reliably available.

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Conflicts:
#	src/python/pants/BUILD
#	src/python/pants/backend/python/dependency_inference/BUILD
#	src/python/pants/backend/python/dependency_inference/scripts/BUILD
#	src/python/pants/version.py

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
….argv`.

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
Christopher Neugebauer added 10 commits August 11, 2022 14:24
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]

(cherry picked from commit 80dd94d)
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

(cherry picked from commit 438ade8)
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]
# Conflicts:
#	src/python/pants/backend/python/dependency_inference/scripts/BUILD
#	src/python/pants/util/resources.py
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@@ -1,6 +1,6 @@
NOTES:

Run `./pants --no-pantsd --no-remote-cache-write package --pyoxidizer-args="--target-triple=aarch64-apple-darwin" --pyoxidizer-interpreter-constraints="['CPython==3.9.*']" src/python/pants/bin:pants_oxidized_experimental`
Run `./pants package --pyoxidizer-interpreter-constraints="['CPython==3.9.*']" src/python/pants/bin:pants_oxidized_experimental`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are the ICs only because you're on Apple Silicon, or it's to force the target to use 3.9? Should we set this in pants.toml?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, the pants scripts will compile native_engine against the Python SDK that is specified by Pants' global interpreter constrains. PyOxidizer will eventually package the Pants Python code along with a Python distribution.

At the moment, our interpreter constraints don't naturally align. If this were for anything other than packaging our own product, it might be worth fixing properly, but I suspect this will end up just being part of a release script.

Christopher Neugebauer added 2 commits August 26, 2022 10:28
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@chrisjrn chrisjrn requested review from stuhood and jsirois August 26, 2022 17:30
@chrisjrn chrisjrn marked this pull request as ready for review August 26, 2022 17:31
@chrisjrn
Copy link
Contributor Author

Assigning @jsirois for his opinion on the utter shenanigans occurring in ox.py

@@ -11,12 +11,14 @@
def make_exe():
dist = default_python_distribution()
policy = dist.make_python_packaging_policy()
policy.extension_module_filter = "no-copyleft"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very important if static linking gets enabled -- static linking is the trigger clause that makes the copyleft features of the GPL get switched on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a reasonable default for this flag, given the potential landmine. @sureshjoshi: are you ok with:

  1. changing the default
  2. exposing a flag on pyoxidizer_binary to override it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, no problem with that.

One heads up is that there is basically infinite configurability with this file, which is why I added this: https://www.pantsbuild.org/docs/reference-pyoxidizer_binary#codetemplatecode

Idea being, if the default "good enough" template wasn't enough, no one was perma-blocked - they could just create a custom .bzlt file and they're off to the races. I intentionally tried to use the PyOx defaults as much as possible for this default template... Principle of least surprise for incoming users.

So, I guess the question(s) is/are: what should the default template include, and is this use case is worth adding a specific flag or is it more of a "use the template" solution.

I won't even pretend to have the answer to that question. Nor do I feel strongly about it. 🤷🏽

Aside: Link to docs about this feature: https://pyoxidizer.readthedocs.io/en/stable/pyoxidizer_config_type_python_packaging_policy.html#starlark_pyoxidizer.PythonPackagingPolicy.extension_module_filter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sureshjoshi -- recognising that there is indeed a facility to override the entire template, I figured it was better to make the default template a safer default for people to use.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coolest of the beans. As mentioned, no strong opinions on that here.

I would suggest as we're using a non-default setting, it may be worth a note in the help



def pex_main() -> bool:
"""Detect whether some external process is trying to invoke this binary as Pex.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC this is because Pants tries to resolve plugins using sys.executable right here:

@rule
async def resolve_plugins(
request: PluginsRequest, global_options: GlobalOptions
) -> ResolvedPluginDistributions:
"""This rule resolves plugins using a VenvPex, and exposes the absolute paths of their dists.
NB: This relies on the fact that PEX constructs venvs in a stable location (within the
`named_caches` directory), but consequently needs to disable the process cache: see the
ProcessCacheScope reference in the body.
"""
requirements = PexRequirements(
req_strings=sorted(global_options.plugins),
constraints_strings=(str(constraint) for constraint in request.constraints),
)
if not requirements:
return ResolvedPluginDistributions()
python: PythonExecutable | None = None
if not request.interpreter_constraints:
python = cast(
PythonExecutable,
PythonExecutable.fingerprinted(
sys.executable, ".".join(map(str, sys.version_info[:3])).encode("utf8")
),
)

The ICs are not present via this code path:

class PluginResolver:
"""Encapsulates the state of plugin loading for the given WorkingSet.
Plugin loading is inherently stateful, and so this class captures the state of the WorkingSet at
creation time, even though it will be mutated by each call to `PluginResolver.resolve`. This
makes the inputs to each `resolve(..)` call idempotent, even if the output is not.
"""
def __init__(
self,
scheduler: BootstrapScheduler,
interpreter_constraints: Optional[InterpreterConstraints] = None,
working_set: Optional[WorkingSet] = None,
) -> None:
self._scheduler = scheduler
self._working_set = working_set or global_working_set
self._request = PluginsRequest(
interpreter_constraints, tuple(dist.as_requirement() for dist in self._working_set)
)

self._plugin_resolver = PluginResolver(self._bootstrap_scheduler)

If all this is true, could you just have Pants set an env var or use some other direct signal to tell you here what's up? That would avoid divining tea leaves like this.

Copy link
Contributor

@jsirois jsirois Aug 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just don't use sys.executable ever. Run Pex via a real Python until PyOxidizer supports being one of those. That's the IC path, so all the code is there to do this IIUC.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could set an env var for the initial invocation, but the issue at hand is the pile of invocations that are made by Pex in the plugin bootstrapping process (i.e. to run venv and pip itself), so we'd need to ensure that that specific environment variable is propagated. And we'd still need to look at argv to figure out which bootstrapping function to run.

It's worth understanding why we use sys.executable in the pre-pyoxidizer case: it's to make sure that plugins are configured with the same Python version and platform as being used to run Pants. We could set ICs and platform constraints by inspecting various things under sys, but in an environment where a Python interpreter is present, realistically, it should just end up finding the value of sys.executable, just with a giant pile more indirection.

Remember that the goal of this is to support running Pants without the need for a Python interpreter to be present. If we didn't use sys.executable and relied on the Python support elsewhere in Pants to load Pex:

  • we'd need to have a real Python present
  • it would need to be the same version as provided with PyOxidizer,
  • and it'd only be necessary for the purpose of downloading the plugins.
  • The external interpreter would never actually run the downloaded packages -- those would be loaded inside Pants

Basically, it would entirely defeat the purpose of not needing a Python interpreter

A better solution would be to not fetch plugins by pip at all, and then as long as we can unzip a wheel file into a reliable location, we wouldn't need to invoke Pex and venv at all. I'd be willing to talk about this use case for a future version of Pants; I think it might make a bit of sense.

Copy link
Contributor

@jsirois jsirois Aug 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, plugin Python needs to match our provided Python in the form of the pyoxidezed one. Gotcha.

This also could've been achieved by just having the wrapper script download a statically linked Python, which I believe the py03 family of projects provides. Then all the problems are solved - we have a real python interpreter to use, the user need not have one installed, etc. The loss is slightly more complexity in our pants wrapper script / setup, but IIUC that script still lives in this world anyhow.

A better solution would be to not fetch plugins by pip at all, and then as long as we can unzip a wheel file into a reliable location, we wouldn't need to invoke Pex and venv at all. I'd be willing to talk about this use case for a future version of Pants; I think it might make a bit of sense.

That sounds impossible. So would you just require any plugin has to provide some sort of manifest of its transitive dependency wheels, or provides all those wheels too or just outlaw 3rdparty deps for plugins? The latter seems the only workable thing in that world; otherwise Pants needs to reinvent conflict resolution between 3rdparty wheel deps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could switch to depending on Pip though and using runpy to execute it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the idea I'm proposing doesn't use pyoxidizer at all. It ships Pants as-is. The only difference is the wrapper script doesn't look for Python on the local system, it downloads a pre-compiled one with no Pants in it at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, the PyO3-distributed interpreter has the issue I just described

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, wow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The custom interpreter is measurably faster, so it's good that we have it for the most part. It's just that it takes a bit more of a scorched-earth approach to __file__ than is absolutely necessary. 🤷

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW It actually looks like the pyoxy interpreter handles __file__ just fine and this is just using it stock and not customizing it with a yaml config to turn off the pyoxidized importer for the stdlib stuff:

jsirois@Gill-Windows:/mnt/c/Users/John Sirois/Downloads/pyoxy-0.2.0-x86_64-unknown-linux-gnu-cpython3.10$ cat package/module.py
import sys


print(__file__, file=sys.stderr)

try:
    print(sys.__file__, file=sys.stderr)
except AttributeError:
    print(f"{sys.meta_path}", file=sys.stderr)

jsirois@Gill-Windows:/mnt/c/Users/John Sirois/Downloads/pyoxy-0.2.0-x86_64-unknown-linux-gnu-cpython3.10$ ./pyoxy run-python -- -m package.module
/mnt/c/Users/John Sirois/Downloads/pyoxy-0.2.0-x86_64-unknown-linux-gnu-cpython3.10/package/module.py
[<oxidized_importer.OxidizedFinder object at 0x7f3e425a7be0>, <class '_frozen_importlib_external.PathFinder'>]

I'm realizing I can at least use this over in Pex to support Par I think.

Copy link
Contributor

@jsirois jsirois left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, all the Pex re-direct stuff looks like its probably fine? Tests pass and all, but I will say, even as a current Pex maintainer its hard to know for sure! It would definitely less scary to be using a real Python to resolve plugins if possible even if that makes things status quo brittle in that case for now (i.e.: the pants Python binary discovery rules still can fail in the ways they already fail if we were to go that route).

One thing that would help though if sticking with this tea-leaves route is to comment each case - namely what Pants rules cause that case to happen. FOr example, when does Pants trigger a -c, etc.

Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks a lot.

@@ -11,12 +11,14 @@
def make_exe():
dist = default_python_distribution()
policy = dist.make_python_packaging_policy()
policy.extension_module_filter = "no-copyleft"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a reasonable default for this flag, given the potential landmine. @sureshjoshi: are you ok with:

  1. changing the default
  2. exposing a flag on pyoxidizer_binary to override it?

src/python/pants/bin/pants_loader.py Show resolved Hide resolved

Run `./pants package --pyoxidizer-interpreter-constraints="['CPython==3.9.*']" src/python/pants/bin:pants_oxidized_experimental`

The binary will be `dist/src.python.pants.bin/pants_oxidized_experimental/aarch64-apple-darwin/debug/install/pants_oxidized_experimental` -- this will not work on the pants repo itself (yet?)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it doesn't embed some of the backends?

You can sometimes get around that by disabling the backends, and skipping config verification:

--backend-packages='-["internal_plugins.test_lockfile_fixtures", "pants.backend.explorer"]' --no-verify-config

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were things that didn't load early on in the process (that we may have since fixed); these notes haven't been hugely updated since I got things working initially. I'd still rather have a scratch file living in the repo rather than putting it in official docs (for now, anyway)

src/python/pants/init/plugin_resolver.py Outdated Show resolved Hide resolved
if len(sys.argv) < 2:
return False

if sys.argv[1] == "./pex":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "Detect whether some external process is trying to invoke this binary as Pex." bit seems like it might be a bit misleading. It's actually trying to use "this binary as Python", right? Because argv[0] is expected to be the python interpreter, and argv[1] is the argument to the interpreter.

Probably just a question of naming, but.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are all invocations that are Pants attempting to launch pex, or that are effected by pex attempting to bootstrap our plugins. The only machinery that is present is the machinery that supports pex (and then again, only for this extremely specific workflow), and I don't particularly want to give the impression that we're making a generic Python interpreter here.

Open to better names, but calling it "as Python" seems equally misleading

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I changed the first line of the comment to strongly imply that we only support Pex use cases)

src/python/pants/ox.py Outdated Show resolved Hide resolved
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@chrisjrn chrisjrn merged commit cf75d1a into pantsbuild:main Aug 29, 2022
@chrisjrn chrisjrn deleted the chrisjrn/run-my-pants branch August 29, 2022 17:49
cczona pushed a commit to cczona/pants that referenced this pull request Sep 1, 2022
…standalone binary distribution for Pants (pantsbuild#16484)

This introduces a `pyoxidizer_binary` target for Pants itself, allowing Pants to be built into a binary executable for distribution.

Currently the distribution is dynamically linked alongside the native engine, and a few other dependencies; I have not yet investigated static linking. When we investigate distribution of the oxidized binary, we'll need to see if we can make the pants bootstrap script use these binaries and still have them execute with the various libraries dynamically linked.

Addresses pantsbuild#7369
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants