Privacy Preserving Extensions #6

Open
OR13 opened this issue Nov 16, 2020 · 41 comments

Comments

@OR13

OR13 commented Nov 16, 2020

@csuwildcat proposed some privacy preserving alterations here: https://github.com/csuwildcat/hashfield

Wondering if there is any appetite to add support for them so we don't see a fork of the spec?

@OR13
Author

OR13 commented Nov 16, 2020

@csuwildcat

@OR13 I have no issue working collaboratively on a spec for this, but there are enough significant modifications/additions that it may warrant distinguishing this construction. I don't think backwards compatibility with creds that use the current RL 2020 scheme is possible.

@OR13
Author

OR13 commented Nov 17, 2020

@csuwildcat might we consider calling your scheme RevocationList2021 and noting that its features were designed to plug privacy issues associated with the earlier scheme?

Breaking changes don't require the creation of a new spec or software library; we have versioning, and we can use it :)

@OR13
Author

OR13 commented Nov 17, 2020

ping @csuwildcat @tplooker

@csuwildcat

The one general thing I dislike about the current spec is the name: it should be called Status List, because literally nothing about it is bound to the bit field indicating only revocation. You can publish different fields for different 'topics', wherein revocation is simply one topic Issuers may choose to express about a credential.

@OR13
Author

OR13 commented Nov 17, 2020

@csuwildcat I had not thought about that... very interesting idea to generalize it to topics.

@csuwildcat

@OR13 @msporny I no longer believe the dynamic repositioning of the indexes provides sufficient value for the complexity it introduces, so assuming we can add the following as options/additions to this existing spec, can we all just work on this one?:

  1. Advise in the spec as to how to use the construction with a larger bitfield from the start to protect against lengthening observation (if your use case needs more protection against it)
  2. Specify how/why one would chaff the unused positions in the field, if you wanted to further prevent any aggregate observations of the flow of status activity in the field.
  3. Can we change the name to Status List, given this could be used to express any status 'topic'?
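
Points 1 and 2 rely on a list that is allocated at full size up front, with indexes handed out at random rather than sequentially. A minimal sketch of that idea follows. It is an illustration only, not spec text: the 16 KB list size, the GZIP plus base64url encoding, the left-most-bit-first ordering, and all helper names (`new_list`, `set_bit`, `get_bit`, `allocate_index`, `encode_list`, `decode_list`) are assumptions made for the example.

```python
import base64
import gzip
import secrets

LIST_SIZE_BITS = 16 * 1024 * 8  # assumed fixed size: 16 KB = 131,072 entries


def new_list(size_bits: int = LIST_SIZE_BITS) -> bytearray:
    """Allocate the whole bitfield up front (all zeros), so it never grows."""
    return bytearray(size_bits // 8)


def set_bit(bits: bytearray, index: int, value: bool) -> None:
    """Set one status bit; index 0 is the left-most bit (an assumption)."""
    byte, offset = divmod(index, 8)
    mask = 0x80 >> offset
    if value:
        bits[byte] |= mask
    else:
        bits[byte] &= ~mask & 0xFF


def get_bit(bits, index: int) -> bool:
    """Read one status bit using the same ordering as set_bit."""
    byte, offset = divmod(index, 8)
    return bool(bits[byte] & (0x80 >> offset))


def allocate_index(used: set, size_bits: int = LIST_SIZE_BITS) -> int:
    """Pick an unused index uniformly at random instead of sequentially."""
    if len(used) >= size_bits:
        raise ValueError("status list is full")
    while True:
        candidate = secrets.randbelow(size_bits)
        if candidate not in used:
            used.add(candidate)
            return candidate


def encode_list(bits) -> str:
    """GZIP-compress and base64url-encode the bitfield for publication."""
    return base64.urlsafe_b64encode(gzip.compress(bytes(bits))).decode("ascii")


def decode_list(encoded: str) -> bytes:
    """Reverse encode_list: base64url-decode, then decompress."""
    return gzip.decompress(base64.urlsafe_b64decode(encoded))
```

Because the published list always has the same length, an observer cannot infer issuance volume from its size, and random allocation avoids leaking issuance order through the index.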

@OR13
Author

OR13 commented Nov 17, 2020

Seems like reasonable suggestions to me.

@OR13
Author

OR13 commented Nov 18, 2020

I think the first 2 are relatively straightforward:

Advise in the spec as to how to use the construction with a larger bitfield from the start to protect against lengthening observation (if your use case needs more protection against it)

Pre-initialize the space? Perhaps a simple example of why this is a problem in the privacy section, plus some mitigation language, would be sufficient.

Specify how/why one would chaff the unused positions in the field, if you wanted to further prevent any aggregate observations of the flow of status activity in the field.

This belongs in the privacy section as well, including comments on trading storage for privacy. I think an algorithm filling the field could easily be provided, in an extension / appendix.
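
A filling algorithm of that kind might look roughly like the sketch below. It is an illustration under assumptions, not an agreed design: the `chaff` helper, the 16 KB list size, the left-most-bit-first ordering, and the choice to overwrite every unused position with a uniformly random bit are all hypothetical. The demo lines at the end print the compressed size before and after chaffing, which shows the storage side of the trade.

```python
import gzip
import secrets

LIST_SIZE_BITS = 16 * 1024 * 8  # assumed list size (131,072 entries)


def chaff(bits: bytearray, allocated: set) -> None:
    """Overwrite every position NOT in `allocated` with a random bit.

    Positions in `allocated` belong to real credentials and are left untouched.
    """
    noise = secrets.token_bytes(len(bits))
    keep = bytearray(len(bits))  # mask of bits owned by real credentials
    for index in allocated:
        byte, offset = divmod(index, 8)
        keep[byte] |= 0x80 >> offset
    for i in range(len(bits)):
        bits[i] = (bits[i] & keep[i]) | (noise[i] & ~keep[i] & 0xFF)


# Demo: three issued credentials, one of them (index 42) revoked.
demo = bytearray(LIST_SIZE_BITS // 8)
issued = {5, 42, 94567}
demo[42 // 8] |= 0x80 >> (42 % 8)
print("compressed bytes before chaffing:", len(gzip.compress(bytes(demo))))
chaff(demo, issued)
print("compressed bytes after chaffing:", len(gzip.compress(bytes(demo))))
```

Random chaff is close to incompressible, so the encoded list grows toward the full uncompressed size; a lower chaff density would be one way to soften that cost.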

Can we change the name to Status List, given this could be used to express any status 'topic'?

This one would require the most work IMO. We would want to see some real use cases for the topics, and they may raise additional privacy concerns. It's essentially just a bunch of writing.

@csuwildcat want to take a stab at a PR to address the first 2?

@csuwildcat

csuwildcat commented Nov 18, 2020 via email

@OR13
Author

OR13 commented Nov 18, 2020

@csuwildcat they could be part of a 'setup phase' which might include planning for the block size, etc... I think we can probably structure the spec to support that.

@dlongley
Contributor

Note: I get a 404 when I try to visit: https://github.com/csuwildcat/hashfield

@dlongley
Contributor

@csuwildcat,

The one general thing I dislike about the current spec is the name: it should be called Status List, because literally nothing about it is bound to the bit field indicating only revocation. You can publish different fields for different 'topics', wherein revocation is simply one topic Issuers may choose to express about a credential.

+1 -- We've already been reusing the bit fields like this internally anyway, so I agree.

@kimdhamilton
Contributor

@csuwildcat can you let us see https://github.com/csuwildcat/hashfield? Is it a private repo? I get a 404.

@csuwildcat

@kimdhamilton I got rid of it after I figured out it would require 10x+ more rounds of hashing than I first thought, which might require 1s of CPU time. That is not worth the nascent benefit if we just add the optional random selection and chaffing processes to this simpler, more straightforward construction.

@msporny
Contributor

msporny commented Nov 21, 2020

@csuwildcat wrote:

Advise in the spec as to how to use the construction with a larger bitfield from the start to protect against lengthening observation (if your use case needs more protection against it)

Agreed, we might just want to introduce a default algorithm and set the initial lengthening ratio, setting the value to 1.

I'll note that it's important how the chaffing values are inserted and managed, because you can leak information on the chaffing values based on when the bits are flipped. There is information leakage there that we'll have to be careful about.

Specify how/why one would chaff the unused positions in the field, if you wanted to further prevent any aggregate observations of the flow of status activity in the field.

Yep, agreed.

Can we change the name to Status List, given this could be used to express any status 'topic'?

Yes, we should probably make that change... it's really just a status list... however, we may want to have types to define the sort of status. For example, "revocation" is one type of status list... but "active" might be another. That is, some credentials could be issued, but their activation may go on and off based on some schedule... think of a VC that can only be used during business hours in the EU... that's a use case that's not supported by the issuance, expiration, or revocation information. Food for thought.

@ntn-x2

ntn-x2 commented Dec 4, 2020

What about tracking by the verifier? If I present the same credential more than once, my credential will probably have the same index in all presentations, meaning that the verifier will know it is the same entity across all the interactions.

@msporny
Contributor

msporny commented Dec 4, 2020

What about tracking by the verifier? If I present the same credential more than once, my credential will probably have the same index in all presentations, meaning that the verifier will know it is the same entity across all the interactions.

Yes, this is a concern. It requires collusion among multiple verifiers, and a better tracking mechanism would be to just use the digital signature (for non-pseudonymous digital signature schemes). The goal with this scheme is to prevent issuer-based tracking. Verifiers can still use any unique identifier to track you if they so desire... in those cases, you need to ensure that the entire presentation and all VCs in that presentation provide enough randomness or herd immunity to prevent that sort of tracking (which is a very difficult problem and one that has, arguably, not been solved yet). There is some work in BBS+ that might be applicable here.

@ntn-x2

ntn-x2 commented Dec 4, 2020

@msporny thanks for your answer! I would like to make the point that this type of tracking does not necessarily need multiple verifiers to collude. For instance, if I am buying snacks from the same vending machine every day, the vending machine knows that it's always me, as it can correlate the same index used for proof-of-non-revocation. Then, of course, collusion among multiple verifiers would be even worse, but correlation should at least not be so easy in the single-verifier case.
As of today, the only truly privacy-preserving solution to credential revocation is cryptographic accumulators, even though they have other downsides, like constantly updating the delta etc. So I was just curious to know why this problem has not been considered in the analysis in this thread, as I see issuer-based and verifier-based attacks as equally bad.
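
The single-verifier case can be made concrete with a small sketch. Everything in it is hypothetical (the `Verifier` class, the shape of the presented credential, the example values); it only shows that the pair (list URL, index) behaves as a stable identifier across visits, even if nothing else in the presentation repeats.

```python
from collections import defaultdict


class Verifier:
    """Hypothetical verifier that logs the status entry of each presentation."""

    def __init__(self) -> None:
        # (list URL, index) -> timestamps of every presentation seen with it
        self.visits = defaultdict(list)

    def check(self, credential: dict, when: str) -> int:
        """Record the presentation and return how many visits are now linked."""
        status = credential["credentialStatus"]
        key = (status["revocationListCredential"], status["revocationListIndex"])
        self.visits[key].append(when)
        return len(self.visits[key])


# Demo: the same credential shown to one vending machine on three days.
machine = Verifier()
snack_credential = {
    "credentialStatus": {
        "revocationListCredential": "https://example.com/credentials/status/3",
        "revocationListIndex": "94567",
    }
}
for day in ("Monday", "Tuesday", "Wednesday"):
    linked = machine.check(snack_credential, day)
print("presentations linked to one holder:", linked)
```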

@msporny
Contributor

msporny commented Dec 4, 2020

So I was just curious to know why this problem has not been considered in the analysis in this thread,

It's not that this problem hasn't been considered before... it's been considered for decades and complex cryptographic and other security schemes have been devised to combat the attack vector you describe. It is possible to get to a situation where you're pseudonymous, but then a verifier asks an individual for payment, or an email address and their privacy is blown out of the water.

This is one of the reasons that the Verifiable Credentials specification doesn't assert a position on "one true revocation/status mechanism"... there are a variety of ways to address the issue and each mechanism has benefits and drawbacks. When looking at a broad set of use cases, there is no consensus wrt. the proper revocation mechanism, which tracking risk is more dangerous than the other, or what solution would work in all situations.

What this specification does is provide ONE simple solution that people concerned about issuer tracking (like governments that have strong privacy regulations) can use and compel their vendors to use. It can't be all things to all use cases. Hopefully a technology will come along that is both simple and applicable to a broader range of use cases, and the VC spec purposefully leaves the door open for that to happen.

@dlongley
Contributor

dlongley commented Dec 4, 2020

@Diiaablo95,

The ability for a user to commit fraud increases considerably if the user gets to decide whether the verifier knows if one of their credentials is being reused.

Checking for reuse needs to be handled by a witness the verifier trusts, even if the verifier doesn't get to know which credential was reused. A verifier should be able to know (and trust) whether a credential was used in a previous interaction yet the user is declaring a different identity. This enables the verifier to decide if that is acceptable for their use case; sometimes it will be, other times not.

Regardless, this spec isn't designed to directly handle that case. It could be used in a layering fashion, however, to address the particular problem you highlight. For example, the aforementioned trusted witness could perform status checks at the same time that they are checking for reuse and then both pieces of information could be forwarded onto the verifier as new credential(s) asserted by the witness.

@kimdhamilton
Contributor

kimdhamilton commented Jan 29, 2021

+1 to @csuwildcat's request, and to avoid making him cry inside.

Questions:

  • In the spec text, do we change from "Revocation List 2020" => "Status List 2021" (why not increment the year?)
  • Are there any concerns about the repo name? I note the "-rl-" in there

I have permissions to update repo-level things if needed.

@csuwildcat

Oh, and I did think of a way to do the extra privacy stuff without any difficult rounds of hashing, in just a single pass that doesn't make the resulting encoded string much larger, if we want to discuss that at some point.

@OR13
Author

OR13 commented Jan 29, 2021

Can we get a clear proposal for what changes need to be made and where they need to be made?

Here is my attempt:

  • remove the word "revocation" from the spec and implementation, but don't break the interface.
  • update the interface in a newer version to take revocation as a parameter so that existing credentials can be validated with the new version.

@csuwildcat

The change set would be:

  1. Change the name so that it's just Status Lists in general
  2. Let the type field reflect the new general name of the spec
  3. Change the revocationListIndex field to be statusListIndex
  4. Change the revocationListCredential field to be statusListCredential
  5. Change the value description of the to reflect a new, more generic credential type: StatusList2020Credential
  6. Add a field named topic, and let the definition state that the value is to be a string that describes the topic of the list (e.g. revocation)
  7. Add the topic field to the StatusList2020Credential, so it is present there too.

I think that's about it, right?
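
To show what the field renames in this change set would mean for an existing `credentialStatus` entry, here is a rough sketch. The `to_status_list_entry` helper is hypothetical, and the `type` value it writes is only a placeholder, since the final type name is not settled in this comment; `topic` defaults to `"revocation"` so that existing entries keep their meaning.

```python
def to_status_list_entry(old: dict, topic: str = "revocation") -> dict:
    """Map a RevocationList2020-style credentialStatus to the proposed shape."""
    renamed = ("revocationListIndex", "revocationListCredential", "type")
    # Carry over untouched fields such as "id".
    new = {key: value for key, value in old.items() if key not in renamed}
    new["type"] = "StatusList2020Status"  # placeholder name, an assumption
    new["statusListIndex"] = old["revocationListIndex"]
    new["statusListCredential"] = old["revocationListCredential"]
    new["topic"] = topic  # proposed new field describing what the list expresses
    return new


legacy = {
    "id": "https://dmv.example.gov/credentials/status/3#94567",
    "type": "RevocationList2020Status",
    "revocationListIndex": "94567",
    "revocationListCredential": "https://example.com/credentials/status/3",
}
print(to_status_list_entry(legacy))
```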

@msporny
Contributor

msporny commented Jan 29, 2021

Alternate, but highly aligned, proposal here: w3c-ccg/vc-api#92 (comment)

I will note that there is a large cohort of organizations implementing this specification now for an interop fest in March. I doubt any of them would be happy with the changes being made right now. We'll want to get their input here: @tplooker @OR13 @peacekeeper @mavarley

@OR13
Author

OR13 commented Jan 29, 2021

I would be happy to pin the version for interop, and still fix the issue.

@msporny
Contributor

msporny commented Jan 29, 2021

I would be happy to pin the version for interop, and still fix the issue.

I suggest we keep this spec as-is, mark it as deprecated (at the top of the spec), and do a new 2021 specification. The current interop cohort would only be expected to implement the old version.

@OR13
Author

OR13 commented Jan 29, 2021

I suggest we keep this spec as-is, mark it as deprecated (at the top of the spec), and do a new 2021 specification. The current interop cohort would only be expected to implement the old version.

that works, @msporny are you ok with forking the spec and implementing the changes proposed for 2021? I am happy to do that work.

@csuwildcat

Yeah, just doing a new one for the new year-revision would work too, right @msporny?

@tplooker

Agree that, in general, the mechanism for status expression that DB have defined here can be generalized beyond just a binary expression of a particular type of status (e.g. revocation). However, if we generalize at that layer, how do we communicate the intent of the credential status?

Take for instance

{
  "@context": [
    "https://www.w3.org/2018/credentials/v1",
    "https://w3id.org/vc-revocation-list-2020/v1"
  ],
  "id": "https://example.com/credentials/23894672394",
  "type": ["VerifiableCredential"],
  "issuer": "did:example:12345",
  "issued": "2020-04-05T14:27:42Z",
  "credentialStatus": {
    "id": "https://dmv.example.gov/credentials/status/3#94567",
    "type": "StatusList2020Status",
    "listIndex": "94567",
    "listCredential": "https://example.com/credentials/status/3"
  },
  "credentialSubject": {
    "id": "did:example:6789",
    "type": "Person"
  },
  "proof": { ... }
}

What does the credential status that I would resolve from this credential represent? I.e., if I get a 1 back for this credential, does that express revoked, non-revoked, active, inactive, or suspended? The semantics of the list must be captured somewhere so that the verifier can understand the status's intent.

I will note that there is a large cohort of organizations implementing this specification now for an interop fest in March. I doubt any of them would be happy with the changes being made right now. We'll want to get their input here: @tplooker @OR13 @peacekeeper @mavarley

Yeah, -1 to a breaking change that will affect existing implementations. I'm not against the generalization being added as a new revised suite.

@msporny
Contributor

msporny commented Jan 29, 2021

that works, @msporny are you ok with forking the spec and implementing the changes proposed for 2021?

Great... done.

https://w3c-ccg.github.io/vc-status-list-2021/

Everyone, please review the PR:

https://github.com/w3c-ccg/vc-status-list-2021/pull/1

@csuwildcat -- it would be really great if DIF could move at the speed that the W3C CCG moves on these things (43 minutes from your request to a fully formed specification). When do you guys think you're going to be able to move at that speed? :P

@tplooker

tplooker commented Jan 29, 2021

Also, if we are generalizing, is it appropriate to consider going beyond just binary status representation? For example, could a credential occupy more than 1 bit in a status list, giving you an enumeration of more than two states for more advanced status expressions?

@tplooker

A concrete example: we have had requests for a tri-state credential status expression, whereby a credential needs to be active, inactive, or suspended. If each credential occupied two bits, you would have 4 expressible states for the credential.

@kimdhamilton
Contributor

@jchartrand, see @tplooker's comment about using this approach beyond binary states, which we were discussing.

@OR13
Author

OR13 commented Jan 29, 2021

I would propose NOT tackling states beyond binary in status-2021, and instead making YET ANOTHER spec for that...

It could be state-list-2021, etc. Status should remain binary.

@msporny
Contributor

msporny commented Jan 29, 2021

I would propose NOT tackling states beyond binary in status-2021

I agree with @OR13 -- we don't want the spec to turn into a swiss army knife.

That said, we might be able to accomplish this with a "bitWidth" entry that defaults to '1', but could be any arbitrary number... 3, 4, 8, etc. The only thing that really changes is the calculation of where you want to look in the bit string. Run length compression might suffer with a bunch of 01001101s, but it would probably just be a function of the bit width... you'd just have to make sure your default would result in long binary strings of 0s, 1s, 01s, or 10s... something that repeats on a regular basis.
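
The lookup calculation with a bit width could be sketched as below. This is a hedged illustration, not a proposed algorithm: the `read_status` name, the `bit_width` parameter, and the left-most-bit-first ordering are assumptions for the example. With `bit_width=1` it reduces to the ordinary single-bit lookup.

```python
def read_status(bits: bytes, index: int, bit_width: int = 1) -> int:
    """Return the integer status for `index`, using `bit_width` bits per entry."""
    if bit_width < 1:
        raise ValueError("bit_width must be at least 1")
    start = index * bit_width  # first bit position belonging to this entry
    if index < 0 or start + bit_width > len(bits) * 8:
        raise IndexError("index outside the status list")
    value = 0
    for position in range(start, start + bit_width):
        byte, offset = divmod(position, 8)
        bit = (bits[byte] >> (7 - offset)) & 1  # index 0 is the left-most bit
        value = (value << 1) | bit
    return value


sample = bytes([0b01001101])
print([read_status(sample, i, bit_width=2) for i in range(4)])
```

Entries may straddle byte boundaries when the width does not divide 8, which is why the loop works on absolute bit positions rather than whole bytes.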

One challenge with non-binary states is that you then have to know what each state means (because the verifier needs to know)... and then you probably have to communicate yet another mapping of bitstring to logical state. Seems tenuous.

Alternatively, you could just use another status list... you want multiple states? Use two different status lists... there's nothing preventing you from doing this:

  "credentialStatus": [{
    "id": "https://dmv.example.gov/credentials/status/3#94567",
    "type": "SuspensionList2020",
    "statusListIndex": "94567",
    "statusListCredential": "https://example.com/credentials/status/3"
  }, {
    "id": "https://dmv.example.gov/credentials/status/4#94567",
    "type": "RevocationList2020",
    "statusListIndex": "94567",
    "statusListCredential": "https://example.com/credentials/status/4"
  }],

@msporny
Contributor

msporny commented Jan 29, 2021

@tplooker wrote:

how do we communicate the intent of the credential status?

Yep, that's a problem... we'll have to expose the type of status list it is in the VC... not a big issue, we just need to create a few new types.

@OR13, sounds like a great use case for a registry -- I hear you love and are expert at creating and maintaining those things. We could have a whole registry for credential status types, and a governance process around it, and a council of elders to weigh in on things like copyright violations and moral dilemmas created by the types of lists the registry maintains. :P

I was joking above... until I realized that we already have a VC Extension Registry, and that the types of status lists we're talking about are going to have to end up there. :((((

@tplooker

Yep, that's a problem... we'll have to expose the type of status list it is in the VC... not a big issue, we just need to create a few new types.

Ok great, we are aligned on this. Just to be clear: if we do extend it so the bit width can be greater than 1, then so long as these semantic expressions extend to define the mapping of what the possible states mean, I think we have a workable solution.

@csuwildcat

that works, @msporny are you ok with forking the spec and implementing the changes proposed for 2021?

Great... done.

https://w3c-ccg.github.io/vc-status-list-2021/

Everyone, please review the PR:

w3c-ccg/vc-status-list-2021#1

@csuwildcat -- it would be really great if DIF could move at the speed that the W3C CCG moves on these things (43 minutes from your request to a fully formed specification). When do you guys think you're going to be able to move at that speed? :P

People literally asked me not to just go do this in an hour months ago when I asked, so careful what you wish for ;)

@peacekeeper
Member

+1 to this. I like both suggested approaches (multiple state lists vs. single list with multiple bits per state).

I guess one difference is that with multiple lists you have to answer the question of what to do if multiple states are set (e.g. "revoked" AND "suspended" are set to 1), whereas with multiple bits per state you can give each combination its own meaning (e.g. 00=active, 01=suspended, 10=revoked, 11=disputed, etc.)
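
The difference between the two approaches can be sketched in a few lines. The state names and bit values follow the example mapping in this comment; the `State` enum, both function names, and the "revocation wins" rule for the two-list case are assumptions for illustration, not anything the thread has agreed on.

```python
from enum import Enum


class State(Enum):
    ACTIVE = 0b00
    SUSPENDED = 0b01
    REVOKED = 0b10
    DISPUTED = 0b11


def state_from_bits(value: int) -> State:
    """Single list, two bits per credential: every combination has a meaning."""
    return State(value)


def state_from_lists(revoked: bool, suspended: bool) -> State:
    """Two binary lists: the both-set case needs an explicit policy."""
    if revoked:
        return State.REVOKED  # assumed policy: revocation takes precedence
    if suspended:
        return State.SUSPENDED
    return State.ACTIVE


print(state_from_bits(0b11))
print(state_from_lists(revoked=True, suspended=True))
```

With two bits, the `11` combination can carry its own meaning; with two lists, the same combination only resolves once the verifier applies a precedence rule.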
