Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add trusted publisher release workfiow #13048

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

sbidoul
Copy link
Member

@sbidoul sbidoul commented Oct 27, 2024

closes #12708

@ichard26
Copy link
Member

ichard26 commented Nov 5, 2024

It'd be neat to also have the workflow cut a GitHub Release automatically: #12169. Of course, this can be done later.

@sbidoul
Copy link
Member Author

sbidoul commented Nov 6, 2024

@ichard26 I'd prefer to handle GitHub releases as a separate work stream indeed.

.github/workflows/release.yml Outdated Show resolved Hide resolved
pypi:
name: upload release to PyPI
runs-on: ubuntu-latest
environment: release
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we going to include any deployment protection rules for the release environment? I realize that means requiring a second person to briefly review any release deployments, but that seems prudent given the rise of supply-chain attacks. For example, if I were to be a maintainer, I don't think I'd need the ability to cut a release completely solo.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would we ask a second reviewer to verify before approving the release?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The steps in the release process that this replaces are all ones that happen locally. So there's nothing visible that a second reviewer could check. It's still perfectly fine for someone doing a release to stop just before pushing the tag and asking for a review of the commit that will become the release if they feel the need to do so.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was more so suggesting that we require a second pair of eyes for any release as a defence-in-depth mechanism against account takeover and other supply-chain attacks. Realistically, we would be compromised in other ways anyway and it isn't worth the hassle.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, surprisingly I can't mark my own review comment as resolved? Either way, I have no further comments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disclaimer: I've spent an hour writing this. I tried to present a logically consistent argument, but if it starts to run off course, you know why.

I'm somewhat uncomfortable relying on PyPA owner status for admin access to pip. I'd much rather we had a clear and well-defined set of security rules and levels within the project.

Fair enough. I brought them up as even if we introduced stricter protection rules and dropped the base pip committer permission, they would still retain full administrator permissions. Thus, they would logically be the starting point for deciding who would be in the "project owner" group. We definitely don't have to tie project ownership to PyPA ownership.

And yeah, GitHub's permission model for organisation repositories is not exactly flexible. I chose Maintain simply as it's the highest access role w/o administrator (i.e. bypass everything) permissions.

I'm not convinced we need the distinction between committers and release managers - until now we've managed just fine with all committers also being potential release managers, and I don't think we're big enough as a team to make it worth separating the roles.

Agreed.

I'm also wary of a distinct "project owner" status, as it goes somewhat contrary to the "team effort" picture I have of pip, to have privileged "owners".

I don't think having granular committer/owner roles is antithetical to the "team effort" approach to pip. When I used to maintain Black, I only retained committer access to the project, but I still felt fully able to contribute.1 Actually, I used to be effectively the lead maintainer when I had so much free time. Sure, technically I couldn't do as much as the "project owners", but 99% of what I did only required committer access anyway. More importantly, we still treated each other's opinions and suggestions from a level playing field. Yes, there was a technical imbalance, but socially, no.

I suppose this is a difference of perspective, but I consider the division of committer access as a reasonable compromise between ease of maintenance and security. If everyone has administrator access, then, yes, when we need to update a sensitive setting (say to land #13107), it's easier as we don't have to bother a specific person (or group of people). OTOH, it's rare when we actually need do something that requires administrator access (editing protection rules, GHA settings, webhooks, etc). In the vast majority of time where we don't need admin access, those accounts represent a larger than necessary security risk (if they were to be compromised somehow). The small additional friction to do X thing is IMO outweighed by the security benefits.

So I guess what I'm saying is that I don't see the point in splitting (2), (3) and (4), except conceptually. Certainly when I'm asked if someone should be a committer, I assume that includes the potential to be a RM and an admin on the project.

I trust everyone else on the pip committer team to be a RM and administrator on the project as well. If a pip committer needed to be an administrator for a legitimate reason, I have no objections to extending that permission to them. I also trust everyone to take proper measures to secure their accounts. Adding "security levels" doesn't change that for me. The problem is that mistakes happen. Access tokens get mistakenly leaked, credentials get phished or bruteforced, and a variety of other creative attacks occur. We should be prepared for that.

I'm happy to require reviews if people think that's necessary. I feel that it might introduce yet more delays into what is already a frustratingly1 slow process

I'm actually not entirely in favour of requiring reviews for similar reasons. I do think it's worth it to drop the base permission and require reviews for releases as they're infrequent events.

Of course, if we didn't have to worry about security then none of this would need to be discussed, but for better or worse, we don't live in such a world anymore :( Also, I haven't been a RM before (and I don't have any plans to be one soon, for lack of time). If you, one of our regular RMs, find these suggestions to be too onerous, then I'm fine with maintaining the status quo.

Footnotes

  1. There was a time where only Łukasz had admin permissions, which was annoying as we occasionally needed him to do something when he wasn't available, but that was addressed once an active maintainer (not me) was given admin permissions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this is a difference of perspective, but I consider the division of committer access as a reasonable compromise between ease of maintenance and security.

I don't disagree with that. I think part of my "hidden agenda" here is that if we're going to have distinct project owners, then I'd prefer not to be one, simply because I don't want to be a bottleneck on actions that require an admin1. I also don't want to be characterised as "owning" pip, because I'm very conscious of our shortcomings as a project, and I could do without feeling more responsible for that. But I do want to remain a PyPA owner, as I feel that I have a useful role in that context. That's all very muddled, and not really actionable, but not having an "owner vs committer" distinction brushes the problem under the carpet, which works for me 🙂

If you, one of our regular RMs, find these suggestions to be too onerous, then I'm fine with maintaining the status quo.

I think I might. My usual release workflow is that I set aside a Saturday morning for doing the release. I'll manage all the outstanding PRs and milestones in the week or two before the release, then on the day I'll go through the release process and I'm done. I keep track of the release in my head, which is fine as it's a single piece of work with no interruptions. Adding a review to that process would introduce a delay where I'd need someone else to be available, and approve the release - there's no guarantee that would happen in the sort of timescale (an hour or so max) that I'm working to.

So I think I'd need to see the Release Process section of the docs split into two parts in order to incorporate a review. Do part 1, request a review, then do part 2. And I wouldn't be able to guarantee when I could assign time for part 2 in advance, because I don't know when I'll get an approval. So the git repo needs to be (at least in some sense) frozen between parts 1 and 2, which could be days apart.

Am I overthinking this? It feels like requiring a review during the release process necessitates splitting the process into two parts like I describe above, which is bad for scheduling. But maybe I've misunderstood how getting a review would work in practice?

Of course, if we didn't have to worry about security then none of this would need to be discussed, but for better or worse, we don't live in such a world anymore :(

Agreed. But conversely, admin around security is not what I want to spend my volunteer open source time on, so keeping things streamlined is important to me.

Footnotes

  1. And I know that not having an extra admin is more of a bottleneck than me being one but not being available sometimes, but for me, I don't want to feel responsible for keeping on top of "things that need admin access".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I've created a release environment for this as well.

@pradyunsg
Note: You don't actually have to pre-create it unless you're going to configure it. They are auto-created.
Though, in another comment, I tried explaining why it shouldn't be called after a process due to the semantics. And I'll note that somebody should make sure to duplicate that name on PyPI.

@ichard26 hint: it is possible to disallow self-approvals in the environment protections, FYI. Also, it's possible to add up to 6 entries into the required reviewers list there. Not only users, but teams — you can let more people approve if you use teams.

Although, I tend to enable required reviews even on projects where I'm the only committer. This allows me to have more control over the process, and this pauses the workflow run just before starting the job. So when the build job produces the wheels, I could even download them locally, and inspect if I wanted to.
This is another reason for splitting the job into separate security scopes with lower permissions for the build one.

@pfmoore similarly, to address the “delay” concern — setting up required reviews with just 1 reviewer required and self-reviews allowed would let you have just enough control over the last action which is immutable (the actual PyPI upload) w/o contributing to any delay meaningfully. This is what I tend to configure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to come back to my initial question: What would we ask a second reviewer to verify before approving the release?

When we setup a review process, I think it is important to explain what is expected from the reviewer. When reviewing or merging a PR, this is implicit but I think generally accepted: the person who approves or merges communicates that they have looked at, and agree with the change, unless explicitly mentioned otherwise.

Currently, as a RM preparing a release, I do not re-review nor look at everything that was merged in the last quarter to assure there is no malicious code that was merged on main. In effect I assume that the review process was effective and guarded against malicious intents.

If we introduce an approval step in the release process, what would the reviewer need to do in that step? Downloading the built wheel and sdist and inspecting them does certainly not looks like something that would be very practical to me. This is a genuine question.

So if we want to guard against compromise of a maintainer GitHub account, I'd think a second approval on release is good, but only effective if we also protect main and strictly require a second review (no self review) before every merge.

As a side note, I also think it would somewhat complicate the release process which I usually do when I have time on a weekend, and not currently coordinating with availability of another maintainer. But I'll adapt if we reach the conclusion that this is important, of course.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. My RM process is very similar. And in particular, the assumption that what is merged on main is correct, valid and ready for release. If we were to require the RM to (in effect) validate all code that went into the release I'm pretty certain I wouldn't have the time to be a RM.

@sbidoul
Copy link
Member Author

sbidoul commented Nov 19, 2024

@pfmoore @pradyunsg as regular release managers, what do you think about letting a GitHub action doing the publishing to PyPI using trusted publishers?

@pfmoore
Copy link
Member

pfmoore commented Nov 19, 2024

+1 from me.

@sbidoul sbidoul added this to the 25.0 milestone Nov 19, 2024
@ichard26 ichard26 mentioned this pull request Dec 7, 2024
Copy link
Member

@webknjaz webknjaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👋 hey, so with my pypi-publish maintainer hat on, I figured I'd point out a few places that are considered discouraged/dangerous. Plus, there are a few suggestions that are not strictly security-related. JFYI.

.github/workflows/release.yml Outdated Show resolved Hide resolved
.github/workflows/release.yml Outdated Show resolved Hide resolved
pypi:
name: upload release to PyPI
runs-on: ubuntu-latest
environment: release
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I've created a release environment for this as well.

@pradyunsg
Note: You don't actually have to pre-create it unless you're going to configure it. They are auto-created.
Though, in another comment, I tried explaining why it shouldn't be called after a process due to the semantics. And I'll note that somebody should make sure to duplicate that name on PyPI.

@ichard26 hint: it is possible to disallow self-approvals in the environment protections, FYI. Also, it's possible to add up to 6 entries into the required reviewers list there. Not only users, but teams — you can let more people approve if you use teams.

Although, I tend to enable required reviews even on projects where I'm the only committer. This allows me to have more control over the process, and this pauses the workflow run just before starting the job. So when the build job produces the wheels, I could even download them locally, and inspect if I wanted to.
This is another reason for splitting the job into separate security scopes with lower permissions for the build one.

@pfmoore similarly, to address the “delay” concern — setting up required reviews with just 1 reviewer required and self-reviews allowed would let you have just enough control over the last action which is immutable (the actual PyPI upload) w/o contributing to any delay meaningfully. This is what I tend to configure.

@sbidoul
Copy link
Member Author

sbidoul commented Dec 10, 2024

@webknjaz thanks! I have applied your recommendations.

@sbidoul sbidoul requested a review from webknjaz December 10, 2024 10:47
# Used to authenticate to PyPI via OIDC.
id-token: write
steps:
- uses: actions/checkout@v4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pin all the action steps to commit SHAs instead of git tags to avoid a source of immutability. You can use frizbee to do this for you if you'd like.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's also https://github.com/davidism/gha-update. And Dependabot knows to update the hashes too (also bumping the human-readable tag in a comment on the same line).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pinned the actions using frizbee.

Copy link
Member

@webknjaz webknjaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sbidoul I looked closer into other bits of the patch and noticed a few more things that I'd rather change/keep before merging.

.github/workflows/ci.yml Outdated Show resolved Hide resolved
.github/workflows/release.yml Outdated Show resolved Hide resolved
.github/workflows/release.yml Outdated Show resolved Hide resolved
on:
push:
tags:
- "*"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unnecessary, it's the same by default:

Suggested change
- "*"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I remove this line, vscode complains. I could put an empty sequence but I'm not sure it is easier to read.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that an empty sequence has the same semantics. I think null or ~ might be equivalent, though.

noxfile.py Outdated
@@ -315,94 +314,3 @@ def prepare_release(session: nox.Session) -> None:
next_dev_version = release.get_next_development_version(version)
release.update_version_file(next_dev_version, VERSION_FILE)
release.commit_file(session, VERSION_FILE, message="Bump for development")


@nox.session(name="build-release")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we keep this env so nox is still in the center of the processes and the manual ways remain in the repo? Plus, consider implementing pinning of the build env on this level per my other comments.

@sbidoul sbidoul mentioned this pull request Dec 11, 2024
@sbidoul sbidoul force-pushed the trusted-publisher-sbi branch 5 times, most recently from 604154c to cd82ee7 Compare January 12, 2025 15:05
@sbidoul
Copy link
Member Author

sbidoul commented Jan 12, 2025

Here is an updated version.

I pinned build dependencies, and use them in an dedicated build environment, with build --no-isolation.

I pinned release GitHub actions (using frizbee).

I chose to not use the setup-python action, since we have python in the GitHub runner, and it is therefore one less thing to audit.

In a followup I plan to update the nox build action to use a similar build process with the pinned build deps. I chose however to not use nox in the release process to avoid having to pin nox dependencies because there are too many of them and I feel that would make auditing the build environment harder.

Is this reasoning of limiting the number of dependencies used in the build process in order to facilitate audit valid? One questioning I have is about the auditability of the GitHub runner used for the build. The job logs gives a link to the runner image release. But what if the jobs log is lost? Is that recorded somewhere else?

@notatallshaw
Copy link
Member

notatallshaw commented Jan 12, 2025

Is this reasoning of limiting the number of dependencies used in the build process in order to facilitate audit valid? One questioning I have is about the auditability of the GitHub runner used for the build. The job logs gives a link to the runner image release. But what if the jobs log is lost? Is that recorded somewhere else?

I can't speak for any "official" audit processes, but when I look at validating artifacts myself I'm looking a minimum of:

  • Are there version controlled build steps?
  • Are there sufficiently pinned dependencies?
  • When I run the build steps with the pinned dependencies do they produce binary identical artifacts? (wheel and sdist in this case)

From that point of view I'm supportive of minimizing dependencies, it's less chance of things going wrong.

Beyond that, is pinning down the non-Python package dependencies an explicit goal of this PR? If so I would recommend using a pinned Python docker image to run the build processes with Python always called using isolated mode.

@sbidoul
Copy link
Member Author

sbidoul commented Jan 12, 2025

Beyond that, is pinning down the non-Python package dependencies an explicit goal of this PR?

Not a goal of mine, at least.

If so I would recommend using a pinned Python docker image to run the build processes with Python always called using isolated mode.

But that would add one more moving piece to the game.

So I think I'm happy with this PR as it is. My question about the auditability of the GitHub runner is more curiosity than anything I want or think we should address.

# This file is autogenerated by pip-compile with Python 3.12
# by the following command:
#
# pip-compile --allow-unsafe --generate-hashes build-requirements.in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note: these options can be set in a config file that pip-tools support. Consider doing so in a follow-up if more of pip-tools will eventually end up being used.

@webknjaz
Copy link
Member

I think the CI logs live for about 3 months before garbage collection. The artifacts, I think, also live for the same amount of time. In CI/CD workflows where the process is shared between release and testing, I tend to conditionally set the retention time to 90 days, which is usually max.

To make the dists reproducible, you have to set the $SOURCE_DATE_EPOCH env var which most build backends (including setuptools) would recognize. People typically use the timestamp that Git's HEAD is pointing to at the time of building: https://github.com/ansible/awx-plugins/blob/c8cff62/.github/workflows/ci-cd.yml#L543C15-L543C59.

This would have to be duplicated in noxfile.py, though, to ensure that building locally (which might be needed in cases of emergency or for verification purposes) is the same. This is kinda the main reason I wanted it to be reused in the workflow. If you don't want to have this in noxfile.py directly, it might make sense to put it into a Python script that both would call. Might be an idea for a follow-up.

Using the Python Docker image would indeed improve reproducibility at the cost of delegating reviewing that image to somebody else (provided that it's pinned using SHA).

That said, I don't see any serious blockers here. If you're happy with the PR, I'd say — merge it and think about other things in a follow-up.

@sbidoul sbidoul requested a review from sethmlarson January 14, 2025 14:31
name: python-package-distributions
path: dist/
- name: Publish distribution 📦 to PyPI
uses: pypa/gh-action-pypi-publish@67339c736fd9354cd4f8cb0b744f2b82a74b5c70 # release/v1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realized that I missed the comment here...

release/v1 is a branch, i.e. a moving pointer to commits. If you want to implement pinning, do that through regular tags. I'm pretty sure dependabot will get confused here, when attempting to bump, it will likely be unable to deduce the previous version pinned or update the comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never used dependabot for pinned GitHub actions, so I don't know.

But which tag, then ? As I understand other actions have a "stable" tag that moves with each release. For instance, at the moment, action/checkout tags v4 and v4.2.2 both point to the same commit. Should there be some similar mechanism for pypa/gh-action-pypi-publish ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, I really don't like having commit IDs like this - it feels annoyingly unmaintainable. I guess, if it's the recommended approach for securely using code from github URLs, then we have to do it, but I can't lie, I think it sucks.

If there's another, sufficiently secure, way of doing this that uses proper symbolic names, I'd far rather we used that. But if not, I'll live with this (I'll just ignore it and let someone else deal with keeping it up to date).

Copy link
Member Author

@sbidoul sbidoul Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was following the recommendation from @sethmlarson, who may confirm it's the best approach.

I personally don't find the hashes in the GitHub action worse than those in build-requirements.txt.

But, the worry I have is with reviewing a PR that would update those hashes. I would not know how to check efficiently that such a PR is not malicious, short of running the update locally myself and comparing, which at that point would be more work than doing the update myself. So my hope is that we can trust dependabot to do the update PRs (both for the GitHub action and build-requirements.txt), but if that does not work then I'm not sure I will be comfortable with the hashes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hashes in requirement files supplement the version number. A mismatch is an error. In this case, the version is a comment, and there's no check that the commit ID and the version match. Dependabot might well cover this, but I'm not comfortable relying solely on an external tool for security here.

Copy link
Member

@ichard26 ichard26 Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While that's a neat feature, unless the update is small, we are not equipped to review large diffs of JS projects, especially if they check in their built JS code. I think what @sethmlarson was implying that GitHub itself would check that any SHAs are linked to the versions they purport to be for edits to GHA workflows. That's a pipe dream (and maybe infeasible for other reasons, too).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I also dislike having to stick hashes everywhere (from the UX/DX perspective). While I use the branch in all the examples in the README and the PyPUG guide, I think I left a few hints there that using SHAs is more secure in general.

Dependabot does infer the tags from SHAs it's auto-bumping. In that sense, it's already on GH not to mess it up. I'm not super happy with Dependabot on a few things (like it'd assume SemVer / tag semantics and would be rather noisy for my taste). But in general, this is what people use. Having zizmor could be a good middle ground for detecting mismatches between the SHAs and the tags.

On one had, you could also decide to trust that I won't break the moving pointer branch and just keep that. But this would not be setting a great example. Plus, it's still a good idea to be in control of reproducibility regardless. But with that, you'd go for bumping the tags already. And since the tool that does the bumping can update both tags (in comments) and hashes, it's probably fine to delegate that to it.

Note that Dependabot also tries to extract the commit list and change logs of things it's offering to update. They are usually collapsed but are available to read before merging its PRs. I personally set the updates to be as rare as possible, which is especially noticeable when maintaining multiple projects.

One possible future solution to this might be a maintained reusable workflow. I have something in my mind already, but unfortunately, trusted publishing does not yet support these. But in general, provided that there'd be reusable workflows with everything pinned, they could reduce the number of places where those pins would need to be updated/checked. That's a separate discussion topic, though.

Copy link
Contributor

@potiuk potiuk Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For checking the validity of what's been done, either CodeQL or Zizmor are the way to go, even if its done locally during development of the action.

Oh absolutely - we use CodeQL in CI and Zizmor in pre-commits and that combo works great.

Is this are you expecting? if not yeah absolutely we can contribute to dependaboat what is required :)

Yep. I think the easier you make to take a look at the action change/diff the better

While that's a neat feature, unless the update is small, we are not equipped to review large diffs of JS projects,

I think (and this is what we did) is not a complete review, more like "eyeball check" if there is nothing suspicious. If you blindly bump hash commit without any kind of verification of what's coming in it (and no tooling does it really now) - hash commit give you false sense of security because you are just slowing the rate of updates. Simply not reviewing what's coming at least to get a sense of "yeah, looks legit" is as insecure as having a moving tag.

Without actually reviewing or "knowing" what the the new tag has (when for example you contributed your own code) - using hash commits is mostly a "cargo cult".

Copy link
Member Author

@sbidoul sbidoul Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks everyone.

My takeaway is that the PR is good as is, and we'll see how dependabot behaves.

Given the difficulty of actually verifying the hashes, we are going to trust dependabot to not be a bad actor, and assume it proposes hashes that actually correspond to the branch/tag it writes in comment.

If we merge the dependabot PRs at the beginning of the quarterly pip release cycle, it leaves 2-3 months for a possible compromise in a dependency to be detected before it actually causes harm in a pip release.

Compared to the process before this PR, the reproducibility of the build is improved.

Regarding security, I see it as essentially moving the trust from the integrity of the maintainer machine to the integrity of the GitHub runner, dependabot and 4 GitHub actions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding security, I see it as essentially moving the trust from the integrity of the maintainer machine to the integrity of the GitHub runner, dependabot and 4 GitHub actions.

Agreed.

To summarise my position on this (and to support the "we won't review diffs from action updates" position), I take the view that we build our release workflow around "release X of action Y". We trust the authors of action Y to have delivered the functionality they claim in release X, and to have ensured that what they publish as release X actually is release X. As such, we care about the release number, and translating that into a commit ID (either to reference in our workflow, or to check the ID that automated tool like Dependabot used) is error-prone busywork.

But to be clear, if our position is now that we trust the runner, the actions (and their authors) and dependabot, then that's fine with me, as it means we can validly accept dependabot commits (because we trust dependabot) without worrying about the commit IDs, and just basing the decision on what the new version release notes say.

],
check=True,
)
subprocess.run(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I suggest invoking check_call() instead?

Copy link
Member Author

@sbidoul sbidoul Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that's what I used to use, but I kind of understood check_call and check_output where considered old style, so, I was trying to be a good boy and obey the Python documentation :)

image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh... I think it's talking about run() vs. Popen() there. The former is newer and higher level, while the latter is low-level and more difficult to work with (it's still used under the hood, though). I'd say Popen() is old-style in this case.

check_call() and check_output() are just convenience shortcuts for two typical uses of run(), AFAIU.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, over time I came to like using run over check_call and check_output. I find it equally easy to read and easier to teach and remember.

news/13048.process.rst Outdated Show resolved Hide resolved
@ichard26 ichard26 self-requested a review January 15, 2025 02:37
@sbidoul sbidoul force-pushed the trusted-publisher-sbi branch from b4e31a7 to 9ed1d30 Compare January 15, 2025 18:08
Copy link
Member

@ichard26 ichard26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignoring the conversations about security and access management (which I'm happy to defer to a separate issue for another time), this looks pretty good.

Two general comments:

I don't like polluting the root directory with a bunch more files. However, I can't think of a better place to put them as they are pretty essential for any non-contributor consumers...

Also, The source distributions are NOT reproducible. The gzip header includes a timestamp and setuptools doesn't use SOURCE_DATE_EPOCH to force a specific time (pypa/setuptools#2133). The wheels seem to be reproducible, although I haven't tested it across platforms yet.

Thank you @sbidoul for driving this forward!

Comment on lines +10 to +33
def get_git_head_timestamp() -> str:
return subprocess.run(
[
"git",
"log",
"-1",
"--pretty=format:%ct",
],
text=True,
stdout=subprocess.PIPE,
).stdout.strip()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that I literally used to maintain Black, but I never liked how Black splits long lists into sprawling messes like this. I would be in favour of using # fmt: off/on (or # fmt: skip) here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would make for at least one a very long line, and I chose to have the calls presented in a uniform way across the file. I personally find the fmt comments would make things worse. A matter of taste, I guess.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, this file is pretty ugly IMO. If it were a nox session, it'd be so simple. However, nox does introduce its own non-insignificant attack surface, so I guess this is the price we pay.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we don't want to have to pin nox and all it's dependencies. This is invoked with nox -s build-release, though.

build-project.py Outdated
],
check=True,
)
build_python = Path(build_env) / "bin" / "python"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work on Windows. I know this is primarily meant for GHA, but it'd be good for our release flow to work on everyone's machine when needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I updated the script for portability. Interestingly the venv module does not seem to expose the paths it computed so I had to resort to a little hack, subclassing EnvBuilder.

@ichard26
Copy link
Member

I've also set up a new PyPI GitHub Environment to match the new deployment name in the workflow. Deployments still need a review, but self-reviews are allowed. The review step is just meant to be a "confirm" button.

@sbidoul sbidoul force-pushed the trusted-publisher-sbi branch from 9ed1d30 to e0b2f25 Compare January 19, 2025 10:25
@sbidoul sbidoul force-pushed the trusted-publisher-sbi branch from e0b2f25 to f06bd2d Compare January 19, 2025 10:26
@sbidoul sbidoul force-pushed the trusted-publisher-sbi branch from a06d4b7 to 1801d83 Compare January 19, 2025 11:48
@sbidoul
Copy link
Member Author

sbidoul commented Jan 19, 2025

I've also set up a new PyPI GitHub Environment to match the new deployment name in the workflow. Deployments still need a review, but self-reviews are allowed. The review step is just meant to be a "confirm" button.

Ok. I'll see how that works when preparing the release and update the release process docs then.

@sbidoul
Copy link
Member Author

sbidoul commented Jan 19, 2025

I don't like polluting the root directory with a bunch more files. However, I can't think of a better place to put them as they are pretty essential for any non-contributor consumers...

I also feel the whole thing is somewhat clunky but I can't think of anything better either.

If no further simplification is possible and this PR indeed represents the state of the art for portable, reproducible, and secure build and publishing, this may hint at future work streams for the packaging community.

@sbidoul
Copy link
Member Author

sbidoul commented Jan 19, 2025

I configured PyPI like this

image

@@ -0,0 +1,3 @@
Started releasing to PyPI from a GitHub Actions CI/CD workflow that implements trusted publishing and bundles :pep:`740` digital attestations.

In addition to being signed, the released distribution packages are now reproducible through the commit timestamp.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should've flagged this directly. This is currently inaccurate due to the setuptools bug I mentioned. I think it may be better to simply drop this until we confirm with setuptools that they're reproducible-ready.

@webknjaz
Copy link
Member

I've also set up a new PyPI GitHub Environment to match the new deployment name in the workflow. Deployments still need a review, but self-reviews are allowed. The review step is just meant to be a "confirm" button.

@ichard26 lowercase, right?

@ichard26
Copy link
Member

Nope. Is the environment key case-sensitive?

@webknjaz
Copy link
Member

The gzip header includes a timestamp and setuptools doesn't use SOURCE_DATE_EPOCH to force a specific time (pypa/setuptools#2133).

For history, we have this in Ansible post-processing sdists to work around the problem: https://github.com/ansible/ansible/blob/03d6209/packaging/release.py#L867-L899.

@webknjaz
Copy link
Member

Nope. Is the environment key case-sensitive?

@ichard26 I never checked but always assumed that it is. If it's case-sensitive, then running this workflow will just auto-create a new one called pypi without any restrictions pre-configured.

Also, PyPI gets the environment name as a part of the OIDC context. I think it would also be case-sensitive. Although, my understanding is that it's configured correctly on PyPI and GHA, so I expect this bit to be fine.

@ichard26
Copy link
Member

Enter a name for the environment, then click Configure environment. Environment names are not case sensitive. An environment name may not exceed 255 characters and must be unique within the repository.

https://docs.github.com/en/actions/managing-workflow-runs-and-deployments/managing-deployments/managing-environments-for-deployment#creating-an-environment

Seems like it's fine as-is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve the release process to enable trusted publishing
9 participants