Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

collaboration guide #163

Merged
merged 6 commits into from
Aug 24, 2021
Merged

collaboration guide #163

merged 6 commits into from
Aug 24, 2021

Conversation

Byron
Copy link
Member

@Byron Byron commented Aug 16, 2021

@avoidscorn @edward-shen What do you think about it? Is anything missing, should something be changed?

Please feel free to comment here or use the 'suggestion' feature of github reviews.

Apologies if the guide sounds a bit robotic or unfriendly but I feel pressed for time and rushed it a bit. However, I am happy to spend some time to make collaborating on the project more straightforward and would love changes to the guide that make it feel more approachable/polished :D.

Thanks

Copy link
Contributor

@edward-shen edward-shen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the recommendation on other crates you don't own mysteriously breaking? e.g. with the mysterious commit size change.

File an issue and don't try and fix it? Try and fix the issue?

COLLABORATING.md Outdated Show resolved Hide resolved
COLLABORATING.md Outdated Show resolved Hide resolved
COLLABORATING.md Show resolved Hide resolved
@avoidscorn
Copy link
Contributor

Some things I'd like to see in COLLABORATING.md:

  1. I assume we should always use PRs when introducing new dependencies, even if they are feature gated? For my work, it would a bloom filter library, and possibly a library that provides pread.
  2. General guidelines on when to use feature gates. So far I have: optional dependencies, pretty vs. lean, async vs sync, parallel support, and remote repository support. What about unsafe vs. 100% safe?

@Byron
Copy link
Member Author

Byron commented Aug 16, 2021

File an issue and don't try and fix it? Try and fix the issue?

Thanks for bringing this up and make me think about this more. I am suggesting to either fix it immediately if the cause is known or to open a PR if it's unclear what's going on. Differentiating this sometimes isn't easy, but I will try my best to do this from now on.

I assume we should always use PRs when introducing new dependencies, even if they are feature gated? For my work, it would a bloom filter library, and possibly a library that provides pread.

On crates you own, I'd think it's still up to you. I usually try to minimize the dependency by deactivating its default features, if possible, and generally try to keep builds fast. This might be for the DEVELOPMENT.md guide.

General guidelines on when to use feature gates. So far I have: optional dependencies, pretty vs. lean, async vs sync, parallel support, and remote repository support. What about unsafe vs. 100% safe?

I hoped that the collaborating guide could be focussed on how to work together, with details on how to setup crates being placed in DEVELOPMENT.md, which might warrant a separate PR and some cleanup. Quick thoughts though:

General guidelines on when to use feature gates…

Additionally I use it to have 'pure' Rust vs bindings with C and assembly.

What about unsafe vs. 100% safe?

I use unsafe only if there is no other way or if it enables far greater performance. So far I only recall one usage of unsafe, and it's to greatly accelerate pack resolution with a tree-like index. Is there any other reason to use unsafe on the level gitoxide is operating?

@avoidscorn
Copy link
Contributor

I use unsafe only if there is no other way or if it enables far greater performance. So far I only recall one usage of unsafe, and it's to greatly accelerate pack resolution with a tree-like index. Is there any other reason to use unsafe on the level gitoxide is operating?

I have no use case for it. I just saw a dependency on sha-1 with the asm feature enabled, so the possibility of unsafe stuck in my mind.

@Byron
Copy link
Member Author

Byron commented Aug 16, 2021

I have no use case for it. I just saw a dependency on sha-1 with the asm feature enabled, so the possibility of unsafe stuck in my mind.

cargo geiger nicely shows unsafe within the dependency tree as well, and…there is a lot. sha1-asm greatly accelerates fundamental git operations, so it's kind of a requirement for most usage. It is turned off though in the pure-rust builds.

Maybe for everything else you can consult and adapt the development guide.

Could @avoidscorn and @edward-shen approve this PR if it's alright to merge for now? We can also adjust it if it turns out to be lacking somewhere.

Thanks.

@kim
Copy link
Contributor

kim commented Aug 16, 2021 via email

@Byron
Copy link
Member Author

Byron commented Aug 18, 2021

tl;dr: Thanks for your feedback, it's much appreciated. I will make adjustments here soon to address transparency and motivation of changes, the ability to comment on them, as well as being less surprising for downstream consumers. Bear with me though, it's happening in parallel of feature development. gitoxide will have to grow out of being my playground into software you can depend on, despite it being very much WIP.

I'm not sure where this initiative is coming from, but I certainly appreciate it.

I am glad to hear that, it was postponed forever in the name of simplicity, even though in truth I just didn't want to deal with it.

However, with gitoxide now being used in radicle-link it's time to transition into a way of working which reduces surprises downstream.

That might be okay, if maintainers
agree that they don't ever look at the git history anyways without online access
to GitHub -- but the context one can extract from GH collaboration facilities
doesn't really carry this information either, and neither does the CHANGELOG.
I'd would thus appreciate if describing a change in prose would be part of the
contribution conventions.

To my mind it's certainly a good idea to try to describe in the commit body what the motivation is.
The workflow around a Kanban board, cards, issues with checkboxes definitely helps me to keep track of things and for now there seems to be no way to do it without GitHub. I did hear about some tool to handle/compile/manage features using text-only and it might even have been a Rust project. Something like that might be usable on top of GitHub facilities to add some transparency in offline scenarios. A pointer at this tool would be appreciated, of course I couldn't find it anymore even though I have probably starred it.

Another observation is that code review is not at all part of the development
process. Exactly because the API surface extends to the lowest levels, breakage
for downstream consumers needs to be considered at virtually every step. For
example, f15f1e8 renamed a trait method -- yes
it was tagged as a breaking change semver-wise, but it will still cause churn
for consumers. And very unnecessarily so: the code is not improved in any way by
this change, on the contrary: turning a method name into a sentence because that
seems subjectively clearer very strongly smells like this abstraction needs some
more fundamental redesign. This could have been caught by asking others for
feedback before committing the change to main.

I agree, breaking changes in plumbing should not be done as light-heartedly as that usually was the case, so I am happy to make PRs for those 'strongly encouraged'. Generally I prefer more semver compatible releases with few breaking changes than few releases with a lot of them, especially in plumbing crates. This makes the inevitable upgrade process easier. In any case, these breaking changes should add value and should be discussed. I will write down my thoughts in something like 'stability.md' and post it as separate PR for discussion. Generally I want to keep process around commits and changes minimal and thus not enforce code reviews, but provide some time to comment on breaking changes affecting downstream.

So I'd strongly suggest to make PR'ing changes mandatory, even for maintainers.

My biggest gripe with this is that I don't want them to be too long lived, so I will experiment with creating PRs instead of the 'small issues that are created from a checkbox in an issue' more, which should make the changes more easily digestible.

Lastly, and somewhat relatedly to the above, I think it's a misunderstanding to
interpret trunk-based development as pushing out every edit (even mistakes) as
they are made. The goal of this philosophy is to minimise breakage by not
allowing the result of a merge commit to escape CI. It's not about bypassing
code review, or discouraging refactoring because every single commit must be
self-contained (as opposed to patch series). Please also consider that it comes
from environments where all downstream consumers are tracked, and thus breaking
changes need to be addressed by the author immediately.

Maybe my love for trunk-based also stems from the lack of process and the simplicity of it - thus far my experience with it is very positive and I could go a month without main being temporarily broken. The main benefit for PRs I see are 'a grouping of commits by topic' and the added transparency with the option to comment and suggest changes. This is also why I want to try a workflow that involves them often, which may as well lead to a PR based workflow. PRs or not, what's merged should not break CI (or the software in case CI thinks it's good 😅), so I am probably missing the point about all downstream customers being tracked.

I am painfully aware of the fact that GH PRs are utterly unsuitable for a
workflow which inherently requires (interactive) rebase. Perhaps worth
considering would be to experiment with bors / homu, which can automate away a
great deal of maintenance chore when many concurrent changes are in-flight.

If fast-forward isn't possible when merging a PR, living with merge-commits seems alright to me, so I'd avoid rebasing branches. With the current amount of concurrent changes that should be quite alright. Using a bot is definitely something that should happen once the PR maintenance burden is too high.

@kim
Copy link
Contributor

kim commented Aug 18, 2021 via email

@Byron
Copy link
Member Author

Byron commented Aug 19, 2021

Organising one's work is a rather ephemeral thing -- nobody will ask you in 5
years what you delivered during Aug 2021. It may however happen that someone is
looking at code that's been written back then and ask themselves why it's been
written like that. It will help them to be able to extract additional context
from the data persisted in the version control system, or at least by following
a link.

I understand that 'keeping track of what happened' isn't necessarily valuable as most people will want to know why something happened or the motivation behind that, the problem to be solved. The latter may be a bit implicit right now yet I think that raw task tracking can be combined an exploration of the problem to solve. There I want to find a lean process that isn't full-blown RFCs yet while yielding more useful information than now may be the case.

I don't agree that this is true in general: every change that breaks my build as
a downstream user consumes my time, and it's not less annoying to spend smaller
amounts of time more frequently, than a larger amount less frequently.

To elaborate, I think it's valuable to have a finer granularity under the assumptions that breaking changes are usually additive (as opposed to undoing a change done earlier). This would allow those who want to get the upgrade on the way but want to minimize time spent as well to upgrade to the next breaking release only, of which they may be more. The work needed there is small though, possibly only a single breaking change. Others might choose to go all in, upgrading to the latest version with more breaking changes to deal with. To me this sounds like win-win, and at least with pre-release software I consider this beneficial. With release software I would avoid that and rather collect breaking changes while still providing granular .alpha releases to allow testing, similar to how it was with pre-release software.

Semver 0. releases are a bit of a lie tbh -- many de-facto stable projects never
reach 1. because, I don't know, they don't have to commit? I think a good
approach are time-based releases, which many larger projects do. You then have a
summary of what happened (or didn't), and there's a committment to supporting
API contracts for N releases.

I agree, and having an understanding of what warrants a major bump from 0 to 1 is important - it's something I would like to have in stability.md as well. From there I like the idea of making time-based major releases (if there are breaking changes) and more regular feature releases. There it's interesting to think about the stabilization process of new features.

When they do too less, they tend to become only trees, no forest.

This one I don't understand, but I would like to. Tiny PRs might certainly be unnecessary overhead, and it certainly takes some experience to estimate how much work it will take to finish one of these alluringly simple checkboxes.

As far as a checkbox-oriented workflow is concerned, I think Rust's "Tracking
issue for RFC xyz" seems to hit a sweet spot in utilising the GH-imposed
tooling. There, I can rather easily tell what both the context and the status
is, and I can subscribe to notifications or not (well, that works to some
extent).

I think this is also what I ended up describing. An issue for the motivation of some feature along with a list of features/tasks that need implementation. These are then links to their respective PRs once one was created, with the issue being closed once all PRs/tasks are done.

Code review is about validating that your solution actually solves the problem, and does so in the best possible way given your current knowledge and understanding.

I absolutely agree, it's where I see untapped potential in gitoxide's development. When moving to PRs, which will happen very soon for me, there will be some time to see what's going on for those who are inclined to get involved.

There's a difference between churn, which everyone downstream has to keep up with, and agility as interpreted as getting many small changes in quickly, forward-fixing mistakes as you go. Striking the right balance is obviously not that easy.

Indeed, and I feel that with the workflow changes manifesting here we will have a handle for tuning this balance. I also don't mind at all to create PRs that fix breaking changes that have been deemed necessary after some PR based collaboration took place.

The result of a merge is not covered by CI in the typical GH flow, that's the thing.

Thanks for pointing this out - I am easily tricked into thinking that a green PR will remain green after merging even though this isn't a given at all. I'd be watching this and as error rates go up a bot probably is the solution in order to facilitate testing the would-be merge commits as well and maintaining merge queues.

And thanks a lot for chiming in, I find your thoughts very helpful and believe there will be a lasting positive effect on the project.

@kim
Copy link
Contributor

kim commented Aug 19, 2021 via email

@Byron
Copy link
Member Author

Byron commented Aug 19, 2021

I realise that you often leave research notes in source code comments. That's
actually quite useful, GHC has employed such a thing for many years [0].

A good point. I am putting these into doc comments usually and maybe they make it into the final documentation in some shape or form. One day all the other meta can be done in repo and will just be presented, instead of being owned by GitHub, I am sure!

I mean, yeah, if you can maintain the discipline that the Semver-constraints are never broken, so the user can pick the point in time to bump, all should be fine in theory.

I solemnly pinky swear that these constraints are to be maintained. It shouldn't even be that hard and I put a lot of faith into cargo smart-release to help with that.

After having read a few articles about trunk-based workflows using systems like Phabricator and Gerrit, I have actually mostly given up on topic branches, and am rebasing everything on a single patch queue branch locally. The remaining nuisance is that GitHub PRs still don't like history rewrites very much, but well...

While there isn't too much going on in main using PRs fortunately won't be much different even without rebasing them. I must admit though that it already feels strange to even have two branches/PRs going right now, definitely some cognitive load, but no good thing is free.

Byron added a commit that referenced this pull request Aug 21, 2021
This is an outcome of the discussion in the PR and should help
to minimize surprises while increasing quality.
@Byron
Copy link
Member Author

Byron commented Aug 21, 2021

Please take a look at another set of changes which aim to incorporate suggestions from the conversation in this PR. This also triggered another note to use the project board to signal planned features and generally what's going on.

My hope is that we can wrap this PR up till Tuesday 2021-08-24.
Thanks

@Byron Byron force-pushed the collaboration-guide branch from 4f5d789 to 6281729 Compare August 21, 2021 11:44
@Byron Byron merged commit d850004 into main Aug 24, 2021
Byron added a commit that referenced this pull request Aug 24, 2021
This is an outcome of the discussion in the PR and should help
to minimize surprises while increasing quality.
@Byron
Copy link
Member Author

Byron commented Aug 24, 2021

Thanks for all your input and involvement. Welcome, collaboration guide :).

@Byron Byron deleted the collaboration-guide branch August 24, 2021 06:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants