Gem ownership custodians and a process for dealing with abandoned gems. #33

Closed
wants to merge 3 commits into from

@ioquatix ioquatix commented Sep 9, 2021

@ioquatix ioquatix force-pushed the ownership-custodians branch from 22ecd84 to 49c0e94 September 9, 2021 00:13
@ioquatix ioquatix changed the title from "Introduce RFC for gem ownership custodians and a process for dealing with abandoned gems." to "Gem ownership custodians and a process for dealing with abandoned gems." Sep 9, 2021
@ioquatix
Author

ioquatix commented Sep 9, 2021

As part of this, we might also want to consider the inverse case: blocking ownership transfers that look suspicious.

https://mjtsai.com/blog/2018/11/27/popular-npm-package-compromised/

In some cases, authors of popular packages might transfer ownership in a way which produces poor outcomes for the community. We may want to introduce a gem custodian review of gem ownership transfers which could potentially fall into this category.

@mensfeld
Member

mensfeld commented Sep 9, 2021

A couple of remarks here:

> Unresponsive Owners: Not all gem owners provide their contact details, nor is there any guarantee that their contact details will work. This makes it difficult for informal gem ownership transfer to occur.

Not every owner wants to be contacted. I've run into this and, well, that is their right.

> Uncooperative Owners: Unfortunately, even if contactable, some gem owners are uncooperative. This is understandable as the ownership request might come from someone who is unknown to the original owner, and the various issues surrounding gem ownership (including hijacking important gems, etc.).

A "forceful" transfer of ownership is always a high-risk procedure. It can be a security vulnerability in itself.

> The gem custodian can attempt to communicate with the current owner on behalf of the ownership request.

There should be a way to "opt out" of this type of communication, either per account or per gem.

> A gem is considered "in use" if the gem itself, or any of its direct reverse dependencies (version specific) have more than 10,000 downloads in the past year.

There are semi-private gems that will never reach that threshold. Download count is not the best way to assess usage.

> A gem is considered "maintained" if the author has logged into https://RubyGems.org within the past 12 months.

Also insufficient. I've seen gems that have had no releases for 7 years but still get small updates on GitHub.

> A gem is considered to be "valid" if the homepage and other related metadata is valid (e.g. URLs return relevant pages).

There are literally hundreds of gems with no homepage, with rubygems.org listed as the homepage, or with an invalid GitHub link (typo, account, etc.).

> A gem is considered to be "working" if a user can check out and run the unit tests on a supported, non-EOL Ruby implementation.

This would mean we should encourage people to ship tests in their releases, which in the majority of cases does not make any sense. Also, many gems have external dependencies (DBs, queues, etc.) and won't "just run".

> A gem is considered to be "stable" if it has a published release >= 1.0.0.

Also not a standard.

> A gem is considered "abandoned" if the most recent release of the gem exceeds the "adoption threshold".

Having such a clear definition means I could farm this data, create many RubyGems accounts, and slowly start taking over many packages to build up a network of packages I could then use.

I would love to come up with a process here, but I feel that we need to have a deeper discussion.

I would start by asking: how often does this happen? I recall two cases in at least a year. I do agree we need policies around this as well as a systematic process, but I'm not sure it should be "popularity" based.
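
For readers following along, the RFC definitions quoted in this comment can be sketched as simple predicates. This is only an illustration under stated assumptions: the `GemSnapshot` fields are hypothetical, "12 months" is approximated as 365 days, the RFC's reverse-dependency clause for "in use" is left out, and the adoption threshold (which this thread does not specify) is passed in as a parameter.

```ruby
require "date"
require "rubygems" # provides Gem::Version

# Hypothetical snapshot of the data the quoted definitions refer to.
GemSnapshot = Struct.new(
  :downloads_past_year, # Integer
  :owner_last_login,    # Date
  :latest_version,      # String, e.g. "0.9.1"
  :latest_release_date, # Date
  keyword_init: true
)

IN_USE_DOWNLOADS       = 10_000 # from the quoted RFC text
MAINTAINED_WINDOW_DAYS = 365    # assumption: "12 months" as 365 days

# "In use": more than 10,000 downloads in the past year.
# (The reverse-dependency part of the definition is not modelled here.)
def in_use?(gem)
  gem.downloads_past_year > IN_USE_DOWNLOADS
end

# "Maintained": the owner logged in within the past 12 months.
def maintained?(gem, today)
  (today - gem.owner_last_login) <= MAINTAINED_WINDOW_DAYS
end

# "Stable": a published release >= 1.0.0.
def stable?(gem)
  Gem::Version.new(gem.latest_version) >= Gem::Version.new("1.0.0")
end

# "Abandoned": the age of the most recent release exceeds the
# adoption threshold (supplied by the caller, in days).
def abandoned?(gem, today, adoption_threshold_days)
  (today - gem.latest_release_date) > adoption_threshold_days
end
```

Each check is a single threshold on a public number, which is what makes the definitions easy to automate and, as noted above, easy to game.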

@ioquatix
Author

ioquatix commented Sep 9, 2021

I don’t know the number off the top of my head, but even the “async” gem name came from a private ownership transfer. If it weren’t for someone’s generosity in giving me a name that made sense, I might have just given up. A large number of the gems I created and maintain have their names because of an ownership transfer. I feel that because of the lack of name availability and the lack of a transparent process, many developers could give up due to the friction. That’s just my guess though: how would you measure it? The ownership transfer process is not easy for either party.

I have recently emailed the owners of two gems which have not been updated for many years, to try to reuse the names, but after one week I have not received any reply. I don’t think it’s fair to use a shared resource and refuse to have reasonable communications. Even DNS has requirements about contact details which you have to update on a yearly basis.

I agree with all your points, but that doesn’t mean we don’t need a more transparent process. I think the motivations I provided are real problems. You’d definitely be in a more knowledgeable position to suggest some kind of solution.

Regarding people hosting small gems which are effectively private, is Rubygems, which is a shared community resource, really the right place for it?

I would even argue that taken to its extreme, Rubygems is not sustainable. Over time it seems like it would just get more and more saturated with unmaintained gems. Without any “corrective pressure” the problems that I’ve outlined seem like they will only be worse in the future.

I don’t know what the solution is, but I feel a transparent process for dealing with the situations I’ve outlined would be helpful. Having expectations around communication (e.g. similar to yearly DNS updates/confirmations) might be another. My goal is to empower people who want to take unmaintained namespace and turn it into maintained namespace.

Ultimately I would like gems which are effectively unmaintained and unused to be more easily open to reuse by the community. But maybe this has unacceptable security risks. That being said, maybe we can define a process where the security risk is acceptable and which also allows us to avoid these problems, i.e. the bar for any kind of “ownership transfer” should be proportional to the perceived risk, which is what I’ve tried to capture in my proposal.

> How often does this happen? I recall two cases for at least a year now

I quickly scanned through the list of gems on rubygems.org that I help maintain (144), for a total of 1.6 billion downloads.

Here are some of the ones which had successful private transfers. For every success there were probably one or two failures (no contact, unwilling to transfer, etc.). It's a slow and difficult process.

falcon - private transfer
async - private transfer
db - private transfer
console - private transfer
bake - private transfer
build - private transfer
event - private transfer
docs - private transfer (currently I'm a squatter)
migrate - private transfer
variant - private transfer
live - private transfer
data - private transfer (unusable in Ruby currently)
trace - private transfer
memory - private transfer

I'm not sure if this list is exhaustive since I just eyeballed it.

I would say I run into this issue several times a year, probably once every couple of months. I've deliberately tried to choose namespaces which are less likely to generate conflicts, so in many cases my internal naming convention mitigates the problem.

The names in the list above are very nice names (subjectively objective :). Most of them now have actively maintained projects which are valuable to the community at large. Before my effort to do this, they were all unmaintained. Yes, there is value to me personally, but there are others like me who may not have the time or energy to go through this process, and we are potentially missing out on their passion and enthusiasm. In essence, the bar to entry has been raised quite a bit because of the volume of unmaintained gems using up namespace. The question is then: "What kind of place do we want Rubygems.org to be?" There is obviously not one answer: security and reliability are important, but community building and software contribution are also important. Every time I run into this issue, I think that people in the late 2000s had a much easier time just choosing whatever name they wanted; it wasn't as big a barrier back then as it is now. Maybe I'm technically wrong, but that's how it feels.

Finally, one important point about the above is that multipart-post was transferred to me by the original author. I would like to believe they did their due diligence. But gems with multi-million downloads probably deserve more community scrutiny (i.e. the event-stream problem). Frankly, some of these problems are really hard, but this is where the curator idea comes into play: real humans who are then required to make a real-world decision, taking into account all the messy edge cases and circumstances. I personally think ANY ownership change to a gem with 1 million+ downloads should go through a formal process. I would personally welcome it.

@ioquatix
Author

ioquatix commented Sep 9, 2021

Thanks to @nateberkopec who linked me to https://docs.npmjs.com/cli/v6/using-npm/disputes which outlines how NPM solves this problem. Well, they are certainly more direct in their approach.

@Fryguy

Fryguy commented Sep 21, 2021

It feels like a lot of the points that were made can be resolved by just having a simple namespace solution. Right now, we are all effectively putting gems into a global namespace. With a namespace, you can use the same name in your own namespace, and then there's not necessarily a need to "transfer" unless there is an official namespace. However, in that case of an official namespace, I'd expect that there is usually some sort of team in place that can take over. Also, with a namespace, you solve the "private gems" problem.

Of course that doesn't solve all of the problems, particularly with dispute resolution, but... baby steps.

Now, how to introduce a namespace is the interesting part. I feel like there could be a transition where the current set of gems are moved into a "global" namespace; gem calls to a non-namespaced version automatically go to the global namespace; perhaps new gems can only go into namespaces; maybe have a way for gem maintainers of global gems to move them into a namespace with a redirect from the global namespace. It will probably take years to fully transition, and then perhaps eventually unmaintained global gems can ultimately move to an "abandoned" namespace.

EDIT: Just noticed this idea is also in #31
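
The transition described above (bare names resolve to a "global" namespace, with optional redirects for gems that move) could look roughly like the sketch below. The `namespace/name` syntax, the `resolve` helper, and the example names are assumptions for illustration only; nothing here is an existing RubyGems feature.

```ruby
GLOBAL_NAMESPACE = "global" # name taken from the comment above

# Resolve a requested gem name to a fully qualified "namespace/name".
# `redirects` is a hypothetical table for global gems whose maintainers
# moved them into their own namespace.
def resolve(name, redirects = {})
  # Bare names fall back to the global namespace.
  qualified = name.include?("/") ? name : "#{GLOBAL_NAMESPACE}/#{name}"
  # Follow a redirect if one exists, otherwise keep the name as is.
  redirects.fetch(qualified, qualified)
end

# Hypothetical usage:
#   resolve("rake")                                          # bare name
#   resolve("alice/example")                                 # already namespaced
#   resolve("example", "global/example" => "alice/example")  # redirected
```

Under this scheme existing Gemfiles would keep working unchanged, since every bare name still resolves somewhere.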


- **Unresponsive Owners**: Not all gem owners provide their contact details, nor is there any guarantee that their contact details will work. This makes it difficult for informal gem ownership transfer to occur.

- **Uncooperative Owners**: Unfortunately, even if contactable, some gem owners are uncooperative. This is understandable as the ownership request might come from someone who is unknown to the original owner, and the various issues surrounding gem ownership (including hijacking important gems, etc).
Member

I really don't think "uncooperative owner" should be considered a valid criterion for removal. It has been an unwritten rule that gem names are first come, first served. We really shouldn't delve into territory where we have to weigh one person's opinion against another's about what is considered useful. We should encourage conflict resolution through communication that follows the code of conduct. This is in line with the approach taken by PyPI and npm.

We may be fine with removing/transferring gem names (even if the owner is uncooperative) only if the package is:

- malware
- illegal
- squatting with mostly empty files
- using the registry for non-package related things
- in violation of the Code of Conduct through its name, description, or content

Again, this is derived mostly from the lists used in PEP 541 and npm disputes.

- Gems which do not meet the requirements for being abandoned require the approval of 3 gem custodians.
- Any gem custodian can veto a gem ownership request.

If this process fails because the current owner rejected the transfer, the ownership request owner can re-request a gem custodian after a period of 12 months. The gem custodian should consider this history in their decision making process.
Member

I think, similar to PyPI, we should have separate processes for requests that will continue the existing project and for ones that will remove the existing code/versions. I would suggest adding a locking period of 100 days for the latter. In this period the namespace would be empty (all versions yanked) so that any existing users of the gem are aware of the removal (and replacement).
We already follow the same approach when releasing a namespace after a complete yank (all versions): the namespace is reserved for 100 days.
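
As a small illustration of the 100-day lock described above, the date on which a fully yanked name would become available again can be computed as follows. The helper names are hypothetical and this is not the actual rubygems.org implementation.

```ruby
require "date"

NAMESPACE_LOCK_DAYS = 100 # from the comment above

# Date on which a fully yanked gem name would be released.
def namespace_available_on(yanked_on)
  yanked_on + NAMESPACE_LOCK_DAYS
end

# True while the name is still reserved.
def namespace_locked?(yanked_on, today)
  today < namespace_available_on(yanked_on)
end

# Hypothetical usage: a gem fully yanked on 2022-01-01.
puts namespace_available_on(Date.new(2022, 1, 1)) # prints 2022-04-11
```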


For users who search RubyGems to find relevant projects, they are often presented with many options which have not been updated for many years. This presents a bad impression of the Ruby community as "unmaintained". This problem will only get worse over time if no effort is made to address the underlying reasons.

We propose to introduce gem custodians and a process for approving ownership requests where the original owner is unresponsive or uncooperative. This would be an accountable process to deal with abandoned gem namespace and enable valuable gem namespace, which is effectively abandoned, to be reused in a way which is ultimately positive for the Ruby community.
Member

I am not sure I understand the importance of a new gem custodians group. RubyGems admins should reserve the right to make the final decision.
IMO, the issue with a new group would be ensuring their motives align with RubyGems. Further, given that it will most likely be a volunteer position, they may not have the bandwidth to respond to and resolve requests within acceptable deadlines (which RubyGems support currently offers).


A gem is considered to be "working" if a user can check out and run the unit tests on a supported, non-EOL Ruby implementation.

A gem is considered to be "stable" if it has a published release `>= 1.0.0`.
Member

These are great ideal criteria. Unfortunately, the reality is a bit more nuanced. I would recommend removing the following criteria:

> the homepage and other related metadata is valid

92% of all versions have empty metadata.

> non-EOL Ruby implementation.

More than 80% of the traffic we receive is from Ruby versions which have reached EOL: https://ecosystem.rubytogether.org/

> it has a published release >= 1.0.0.

rubocop didn't have a stable release for over 8 years.

IMHO, we should stick to activity-related metrics like owner responsiveness and recent releases. Downloads can be a good measure if we filter out the noise. Further, as of now, rubygems.org only tracks total downloads (not downloads over a period).
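
Since only cumulative totals are tracked, one way to approximate downloads over a period would be to diff periodic snapshots of the total. A minimal sketch, assuming a hypothetical list of `[date, total_downloads]` snapshots sorted oldest first (this is not an existing rubygems.org API):

```ruby
require "date"

# Estimate downloads between two dates from cumulative snapshots.
# Uses the latest snapshot taken on or before each date; returns nil
# when no snapshot exists on or before one of them.
def downloads_between(snapshots, from, to)
  start  = snapshots.select { |date, _total| date <= from }.last
  finish = snapshots.select { |date, _total| date <= to }.last
  return nil if start.nil? || finish.nil?

  finish[1] - start[1]
end

# Hypothetical data: cumulative totals recorded every six months.
snapshots = [
  [Date.new(2020, 9, 1), 50_000],
  [Date.new(2021, 3, 1), 54_000],
  [Date.new(2021, 9, 1), 61_500]
]

puts downloads_between(snapshots, Date.new(2020, 9, 1), Date.new(2021, 9, 1)) # prints 11500
```

The resolution of the estimate is limited by how often snapshots are taken, and it does nothing to filter the noise mentioned above.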

Member

@sonalkr132 we could also map out ownership and owners' activities as a factor. I already do this for internal data mining. When owners are active, the chance that their packages are not "abandoned" per se is higher. This could act as one of the signals.

@sonalkr132
Member

npm disputes is a good reference. I also want to highlight PEP 541: https://www.python.org/dev/peps/pep-0541

@Fryguy

Fryguy commented Sep 27, 2021

> EDIT: Just noticed this idea is also in #31

Not to derail this thread but

> indirect closed this 6 days ago

😕


There are several cases where the current gem ownership request process can be inadequate.

- **Deceased Owners**: Sadly, from time to time, gem owners will pass on. In this case, it is not expected that there would be any response from the owner. We need to consider both popular and unpopular gems. Appropriate mitigating factors (like multiple authors) may not be in place when the unfortunate situation occurs.

@svoop svoop Jan 20, 2022

Repo features such as successor settings on GitHub might be helpful here. Given a successor is defined, the communication for transfer would be possible in the event of death without the need for a more laborious process by Rubygems (which should still kick in as a fallback).

@bogdanRada

I have a little question about this RFC. I would appreciate it if anyone could help me.

Is this RFC only for public repositories on GitHub? I am assuming yes. Also, does it apply only to gems hosted on rubygems.org? I am assuming yes.
My other question: if I move everything I ever published on rubygems.org to a private gem server (possibly using geminabox) and also move all repositories somewhere else, possibly to Bitbucket or GitLab, does this RFC still apply? I am hoping it will NOT apply in this case. Or perhaps I need to host everything on a private server?

I don't like several things in this RFC, so I am just thinking of an alternative place to host my gems and repositories. Or perhaps there is a way to opt out of this RFC?

Thank you very much.

@simi
Member

simi commented Jan 20, 2022

@bogdanRada from my understanding this RFC applies to all gems hosted at rubygems.org. There is no connection between GitHub and this RFC.

> I don't like several things in this RFC so i am just thinking of an alternative to host my gems and repositories. Or perhaps is there a way to opt-out of this RFC?

Can you be more specific? Which parts don't you like?

@nateberkopec

Can this be closed since https://blog.rubygems.org/2022/01/19/rubygems-adoptions.html has been adopted?

@ioquatix ioquatix closed this Jan 23, 2022
@ioquatix ioquatix deleted the ownership-custodians branch January 23, 2022 20:26
@sonalkr132
Member

I am not sure we should close this just because adoptions was released. The adoptions flow solves the gem transfer/abandoned issue only if the existing owner is responsive.
We do need a policy for what actions to take if the owner is unresponsive. IMHO, we should update this RFC and its title to address unresponsive owners.

@ioquatix
Author

I think a couple of things could be useful to define based on my experience.

  • Owners who are unresponsive.
  • Owners who are squatting: the code no longer works, is not maintained, and shows no sign of life, but the owner refuses to do anything or give up the name.

@svoop

svoop commented Jan 25, 2022

@ioquatix I'd add a third one: "Owners who have passed away". While technically covered by "unresponsive", a sudden death (accident, etc.) doesn't give the owner the opportunity to organize an adoption in time. It's therefore a different enough case to be handled in its own way, e.g. using successor settings on GitHub or on other services which implement a similar feature in the future.

@ioquatix
Author

@svoop it was already covered in the original proposal. :)

@mullermp

mullermp commented Apr 5, 2022

@Fryguy @ioquatix Both of you may be interested in this (mentioned #31), given current discussions here and in other issues. #40

@ioquatix
Author

ioquatix commented Apr 5, 2022

@mullermp thanks. How can I help?


9 participants