-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Exploratory refactor-2 of libp2p + HTTP #103
Conversation
This is another version of the libp2p + HTTP refactor in #102. The key difference is that a new `NamespacedClient`, having the publisher's address, is created for each sync within the per-sync Syncer.
d2700c9
to
bc6a41f
Compare
Codecov ReportPatch coverage:
❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more. Additional details and impacted files@@ Coverage Diff @@
## main #103 +/- ##
==========================================
+ Coverage 54.54% 54.69% +0.15%
==========================================
Files 65 65
Lines 5152 5209 +57
==========================================
+ Hits 2810 2849 +39
- Misses 2039 2054 +15
- Partials 303 306 +3
☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! I agree with the idea of putting the httpHost in the sync and creating the http.Client internally with NewSyncer.
I added some notes (some for myself). But I have one API design question I'd like feedback on around server auth.
// libp2p streams, and multiaddrs to create listeners on. | ||
publisherHost, err := libp2phttp.New( | ||
libp2phttp.StreamHost(publisherStreamHost), | ||
libp2phttp.ListenAddrs([]multiaddr.Multiaddr{multiaddr.StringCast("/ip4/127.0.0.1/tcp/0/http")}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be is obvious, but I forgot to note this here: In production this MUST be an https
endpoint. It's not here to make testing a bit easier.
dagsync/httpsync/sync.go
Outdated
for i := range peerAddrs { | ||
// | ||
// TODO: Replace arguments with peer.AddrInfo | ||
func (s *Sync) NewSyncer(peerID peer.ID, addrs []multiaddr.Multiaddr) (*Syncer, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe leave a comment that peer.ID is optional for the HTTP transport. In general, auth is optional on HTTP.
dagsync/httpsync/sync.go
Outdated
sync: s, | ||
} | ||
|
||
if peerInfo.ID != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
peerInfo.ID may be nil to this argument. It means we won't cache the peer metadata in the http host.
I'm thinking I need to change this. We should be able to cache this based on the endpoint (scheme+host+port+sni).
} | ||
|
||
func (s *Sync) newLibp2pSyncer(peerInfo peer.AddrInfo) (*Syncer, error) { | ||
httpClient, err := s.clientHost.NamespacedClient(s.protoID, peerInfo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the best go-libp2p api to expose optional server authentication to users?
My original idea was to determine if a user wanted to authenticate the server if peerInfo.ID
was set. The problem is that this can be very annoying to users because:
* A user might not know if the multiaddr they are passing is an HTTP transport or Stream transport.
* A user might be passing multiple multiaddrs (both an HTTP transport and a stream transport), and stream transports require a server ID (libp2p streams MUST be authenticated). In the case that a user doesn't need server peer id auth, they have no way of expressing that. They'll pay for HTTP server auth because they passed it in (since it is required for the stream transports).
Now I'm thinking we introduce a new option to the RoundTripper/http.Client functions. Something like ServerMustAuthenticate
. Users would pass this option to tell the http host that it MUST authenticate the server (either via HTTP Peer ID Auth or stream auth). If this is not passed in, we can assume the client doesn't care if the server's peer id has been authenticated and we can ignore the passed in peerInfo.ID
.
Wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having an option to allow authentication to be bypassed works. This is also not an unfamailiar thing to do, as it is similar to tls.Config.InsecureSkipVerify
if err != nil { | ||
return nil, err | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note, if we see an httpath
component we can tell the http host about this mapping via httpHost.AddPeerMetadata
. That would save us a round trip for .well-known/libp2p
if we have it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MarcoPolo This would work for the /head
path since it does not change (i.e. is well-known), but will not work for all the advertisement and entries paths, since they all end with a CID that is almost always different. It would be awesome if there were a way to match well-known path prefixes (e.g. /ipni/v1/ad/
) to a protocol ID. Then the cache could do a longest prefix match to lookup the metadata.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm misunderstanding something. In the current version of master we have the notion of a syncer's rootURL. This represents the prefix that the IPNI protocol is mounted at. If we want to GET the head we do rootURL + "/head"
and if we want a cid we do rootURL + cid.String()
.
I'm noting here that we could save a round trip to .well-known/libp2p telling us the prefix by using the rootURL. We would do something like clientHost.AddPeerMetadata(peer.id, libp2phttp.WellKnownProtoMap{"/ipni/v1": rootURL}
.
You probably need to do this for backwards compatibility as well as an optimization, since existing deployed publishers won't have set up a /.well-known/libp2p
endpoint yet.
I don't quite understand why you'd want to do a prefix match, what am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am probably not full understanding the correct use. The code immediately following this comment is my attempt to use this feature. Maybe I need to omit the "head" or any other thing following the root prefix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah okay. I think I see. You don't actually need to do any of the below. The path in WellKnownProtocolMeta
is the path prefix for an application protocol. So for IPNI it would be something like /ipni/1
(or just /ipni
). It's the path prefix that defines where this application protocol is mounted on the server. Let me create a patch of what I mean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://gist.github.com/MarcoPolo/30c5c138a126473a949ab1879c16c37a
We learn about this prefix out-of-band by using the httpath
in the multiaddr if it's present.
|
||
// libp2phttp | ||
clientHost *libp2phttp.HTTPHost | ||
protoID protocol.ID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can probably be a constant value here. It is simply some identifier for this protocol. Maybe /ipni-sync/1
?
Currently I think I might be the only user of So with this change we would have have 3 tools for namespacing:
Which is nice, and I'm less attached to But, with my use-case I reuse the http server, so using this libp2p solution means I have to reengineer my server to do everything through libp2p, right? @MarcoPolo, perhaps you could consider how https://github.com/ipld/frisbii is using this if you want to explore the usability of this (not asking for a PR, just that frisbii is very particular about its use of go-libipni in a way that's currently (probably) not compatible with this new API). |
My goal is that you don't have to re-engineer anything if you're already building on top of HTTP semantics. You don't even have to use the go-libp2p package to be compatible with clients who are using it. Using the go-libp2p package is mostly useful if:
(these are copied from the WIP HTTP spec) I took a look at Frisbii, I think there's only one thing to add if you want to support the |
Closing in favor of #113 |
This is another version of the libp2p + HTTP refactor in #102. The key difference is that a new
NamespacedClient
, having the publisher's address, is created for each sync within the per-sync Syncer.