Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(identify): implement signedPeerRecord #5785

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

drHuangMHT
Copy link
Contributor

@drHuangMHT drHuangMHT commented Dec 31, 2024

Description

May close #4017.

Notes & open questions

pb-rs now uses Cow<'_,T> for the compiled Rust structs. But with borrowed type in the struct, FramedRead can no longer process frames correctly(trait bound not statisfied).

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • A changelog entry has been made in the appropriate crates

@drHuangMHT
Copy link
Contributor Author

drHuangMHT commented Dec 31, 2024

Oh CI will compile the protobuf code, didn't see that.
EDIT: Wait how does this work?

Copy link
Member

@dariusc93 dariusc93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments :)

protocols/identify/src/behaviour.rs Outdated Show resolved Hide resolved
protocols/identify/src/behaviour.rs Outdated Show resolved Hide resolved
protocols/identify/src/behaviour.rs Outdated Show resolved Hide resolved
@getong
Copy link

getong commented Jan 3, 2025

you might rebase to main branch, as zlib license has added to ci, #5769

@getong
Copy link

getong commented Jan 3, 2025

Files in /home/runner/work/rust-libp2p/rust-libp2p/protocols/identify have changed, please write a changelog entry in /home/runner/work/rust-libp2p/rust-libp2p/protocols/identify/CHANGELOG.md

I think you might write some changelog text into this file, just like,

implement signedPeerRecord

the pr might be okay now.

@drHuangMHT
Copy link
Contributor Author

the pr might be okay now.

Thanks for the review! Working on tests right now.

@drHuangMHT
Copy link
Contributor Author

version 0.45.1 of libp2p-identify hasn't been released on crates.io yet, I'm not sure whether I should add a changelog entry right now.

@drHuangMHT drHuangMHT marked this pull request as ready for review January 4, 2025 05:27
@elenaf9
Copy link
Contributor

elenaf9 commented Jan 9, 2025

version 0.45.1 of libp2p-identify hasn't been released on crates.io yet, I'm not sure whether I should add a changelog entry right now.

CHANGELOG will be fixed with #5803, you can just add the new entry to 0.46.0.

Copy link
Contributor

@elenaf9 elenaf9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @drHuangMHT! Looks like a great start.

I found a pending libp2p spec for signed peer records: libp2p/specs#630. Most of it matches this PR, just one thing is missing:

If the signedPeerRecord is present the implementation MUST use the data contained within it and ignore duplicated fields present in the main identify message

IMO that makes sense and should be added to this PR. So instead of copying the received record as it is in handle_incoming_info, I think we should instead (when an signed envelope is present):

  1. Verify that the signature matches the remote's public key
  2. Deserialize the record, store the resulting addresses as listen_addrs, and ignore the original listen_addrs.

Wdyt?

protocols/identify/src/behaviour.rs Outdated Show resolved Hide resolved
protocols/identify/src/behaviour.rs Outdated Show resolved Hide resolved
Comment on lines 233 to 242
// When signedPeerRecord contains valid addresses, ignore addresses in listenAddrs.
// When signedPeerRecord is invalid or signed by others, ignore the signedPeerRecord(set to `None`).
let (signed_peer_record, listen_addrs) = signed_peer_record
.as_ref()
.and_then(|envelope| PeerRecord::try_deserialize_signed_envelope(&envelope).ok())
.and_then(|(envelope_public_key, _, _, addresses)| {
(*envelope_public_key == public_key).then_some(addresses)
})
.map(|addrs| (signed_peer_record, addrs))
.unwrap_or_else(|| (None, parse_listen_addrs(msg.listenAddrs)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to deserialize the record again? Can't we just read out PeerRecord::adddresses:

let listen_addrs = signed_peer_record.map_or_else(
     || parse_listen_addrs(msg.listenAddrs), 
     |record| record.addresses.to_vec()
);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh we also need to check that the envelope key matches, but that can be done as well by reading the PeerRecord::peer_id and comparing it with the remote's id, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to deserialize the record again? Can't we just read out PeerRecord::adddresses:

let listen_addrs = signed_peer_record.map_or_else(
     || parse_listen_addrs(msg.listenAddrs), 
     |record| record.addresses.to_vec()
);

record has type SignedEnvelope, whose payload is in the form of Vec<u8>(bytes) that doesn't have addresses field unless we deserialize the payload into PeerRecord. In order to own the addresses the only way is to call record.addresses().to_vec() which in this case allocates twice, which is unnecessary. So I extracted PeerRecord::try_deserialize_signed_envelope to cut down allocation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh we also need to check that the envelope key matches, but that can be done as well by reading the PeerRecord::peer_id and comparing it with the remote's id, right?

PeerId is derived from a public key, which indirectly proves its identity while the key itself does so directly. ID can collide, but the key is less likely, considering we support multiple key types. I believe it is safer to compare the keys directly when they are already present.

Copy link
Contributor

@elenaf9 elenaf9 Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late reply!


record has type SignedEnvelope, whose payload is in the form of Vec<u8>(bytes) that doesn't have addresses field unless we deserialize the payload into PeerRecord.

Ah I see. The name signed_peer_record had me confused. How about calling it signed_envelope instead, so it matches the type (also in Info)?

In order to own the addresses the only way is to call record.addresses().to_vec() which in this case allocates twice, which is unnecessary. So I extracted PeerRecord::try_deserialize_signed_envelope to cut down allocation.

Good point. Still, I would expect that the compiler optimizes away the unnecessary allocation. IMO the current code is a bit difficult to understand, and for the sake of simplicity it would be worth it to do the conversion to PeerRecord so that we can then simply clone PeerRecord::addresses.
However, if you have a strong preference for the current implementation I am okay with it. But then I think PeerRecord::try_deserialize_signed_envelope logic better fits as SignedEnvelope::try_deserialize(&self) -> ... (same logic, just move to SignedEnvelope).

PeerId is derived from a public key, which indirectly proves its identity while the key itself does so directly. ID can collide, but the key is less likely, considering we support multiple key types. I believe it is safer to compare the keys directly when they are already present.

We rely on the collision resistance of PeerId throughout all of rust-libp2p, and go-libp2p also only compares IDs here. I'm not against comparing public keys, but given that PeerRecord::try_deserialize_signed_envelope already compares the PeerIds I don't think we need the double check here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, if you have a strong preference for the current implementation I am okay with it. But then I think PeerRecord::try_deserialize_signed_envelope logic better fits as SignedEnvelope::try_deserialize(&self) -> ... (same logic, just move to SignedEnvelope).

I don't think it is a good idea to implement deserialization on SignedEnvelope because it can contain arbitrary payload, including but not limited to PeerRecord. We need specific "headers" to properly deserialize the envelope, according to the spec.

We rely on the collision resistance of PeerId throughout all of rust-libp2p, and go-libp2p also only compares IDs here.

That's fair. Good to know.

I'm not against comparing public keys, but given that PeerRecord::try_deserialize_signed_envelope already compares the PeerIds I don't think we need the double check here.

The two checks are different. In try_deserialize_signed_envelope it checks whether the signer of the record matches the signer of the envelope. While in Info::try_from we essentially check the signer of the envelope(therefore signer of the address record) against the sender of the identify message. If we skip the check in try_from, the address can come from peers other than the message sender.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

identify: implement SignedPeerRecord
4 participants