HTTP retrieval proposal #747

Open
wants to merge 43 commits into
base: main

Contributor

@hsanjuan hsanjuan commented Dec 9, 2024

This is a proposal to add HTTP retrieval to Boxo. The current state is highly WIP, but I successfully retrieved something over HTTP, so I am posting it to start a discussion about the approach and whether we want to pursue it to the end.

Approach

The high-level idea is that most of what lives in bitswap/client is actually an "exchange" implementation, with the only real "Bitswap" thing being that bitswap/network sends HAS/GET requests over bitswap-protocol streams. As such, we should be able to complement bitswap/network with an HTTP-retrieval implementation which, instead of fetching things over the bitswap protocol, calls HTTP endpoints as indicated by the provider's /http address entries.

Note that conceptually at least, this is not adding HTTP retrieval into bitswap, but promoting most of the bitswap code to be a reference "Exchange" implementation, which is re-usable for different retrieval protocols (bitswap, http...). That is, we would be talking of an "exchange network" component and not a "bitswap network" component. Renames to this extent are still missing.

Implementation

In order to introduce an http-retrieval "exchange network" we need to:

  • Know when something should be retrieved via HTTP - that is, an item has an /http provider.
  • Use HTTP network for that.

To this end:

  • We have a router which selects the http-network or the bitswap-network (or both) based on the existence of /http addresses in the peerstore of the given peer.
  • We have implemented an http-network as a PoC that performs GET requests to /http endpoints when handling a WANT.
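
As a rough illustration of the routing decision described above, the sketch below picks networks from the addresses known for a peer. It is not the PR's router: addresses are handled as plain strings rather than go-multiaddr values read from the peerstore, and `network`, `isHTTPAddr` and `selectNetworks` are hypothetical names.

```go
package main

import "strings"

// network identifies which exchange network should handle a peer.
// These names are illustrative; they are not the PR's actual types.
type network string

const (
	netBitswap network = "bitswap"
	netHTTP    network = "http"
)

// isHTTPAddr reports whether a textual multiaddr contains an /http or
// /https component. Addresses are plain strings here to avoid external
// dependencies; real code would parse them with go-multiaddr.
func isHTTPAddr(addr string) bool {
	for _, part := range strings.Split(addr, "/") {
		if part == "http" || part == "https" {
			return true
		}
	}
	return false
}

// selectNetworks returns the networks to use for a peer: HTTP if any
// address is an HTTP one, bitswap if any address is not, possibly both.
// It returns nil when no addresses are known.
func selectNetworks(addrs []string) []network {
	var useHTTP, useBitswap bool
	for _, a := range addrs {
		if isHTTPAddr(a) {
			useHTTP = true
		} else {
			useBitswap = true
		}
	}
	var out []network
	if useHTTP {
		out = append(out, netHTTP)
	}
	if useBitswap {
		out = append(out, netBitswap)
	}
	return out
}
```

A peer advertising both `/dns4/example.com/tcp/443/https` and `/ip4/1.2.3.4/tcp/4001` would be handed to both networks, with HTTP listed first.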

In my tests plugging it into Kubo, the http-network can be used to retrieve content from a gateway over http. 🥳

The main advantage of this approach is that it is relatively clean to incorporate into the codebase, and keeps most of the code untouched, without having to duplicate any of the complex areas.

Challenges

  • Connectivity tracking is not implemented yet and we will have to see to what extent it can be implemented (I'm guessing we can plug into the TCP dialer directly).
  • Options like timeouts etc. are not implemented
  • We use a single HTTP client rather than a pool
  • Of course testing is fully lacking.

Bitswap places a lot of importance on managing connectivity events to peers. We avoid requesting things from peers that have not signaled connectivity, we clean up peers that have disconnected, and we re-queue things for peers that disconnect. Thus it seems we must support http-connectivity events. When a libp2p peer connects for bitswap, we know that the connection is set up, the handshake has been performed and protocol negotiation has happened. For HTTP these things may not exist, so we need to define what "Connected" means (i.e. in the case of https it would mean we have completed SSL handshakes).

Apart from that, the question is which elements in the current bitswap/client stack do not apply to HTTP (peerqueues, messagequeues, broadcast, wantsending, prioritization etc.)... and why not? What if a peer disconnects from bitswap but not from http or vice-versa? What if latency is much worse for bitswap than for http? Perhaps this is all logic for the network-router to know how to choose which network to use to send messages.

Otherwise, perhaps it is not possible to have a satisfactory implementation this way, and we need to start thinking about what to copy-paste into a separate "http-exchange" (at least the client part).

Related: #608

@hsanjuan hsanjuan self-assigned this Dec 9, 2024
@hsanjuan hsanjuan requested a review from a team as a code owner December 9, 2024 19:44
Member

@lidel lidel left a comment

Thank you @hsanjuan, it would be extremely nice if we can pull it off with such a small set of changes.

Once we have HTTP basics like user-agent, status code metrics, 503/429/Retry-After (details inline), this is worth testing on Rainbow staging (do A/B test with bitswap-only box and bitswap+http).
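
To make the 503/429/Retry-After handling concrete, here is a minimal sketch of how a cooldown could be derived from a response. The function name `cooldownFor` and the default-duration parameter are assumptions for illustration, not what the PR implements; only standard-library calls are used.

```go
package main

import (
	"net/http"
	"strconv"
	"time"
)

// cooldownFor returns how long to avoid an endpoint after a response with
// the given status and Retry-After header value. It returns 0 when the
// status does not call for a backoff, and def when the header is missing
// or cannot be parsed. The name and parameters are hypothetical.
func cooldownFor(status int, retryAfter string, def time.Duration) time.Duration {
	if status != http.StatusTooManyRequests &&
		status != http.StatusServiceUnavailable {
		return 0
	}
	if retryAfter == "" {
		return def
	}
	// Retry-After is either a non-negative number of seconds...
	if secs, err := strconv.Atoi(retryAfter); err == nil && secs >= 0 {
		return time.Duration(secs) * time.Second
	}
	// ...or an HTTP date.
	if t, err := http.ParseTime(retryAfter); err == nil {
		if d := time.Until(t); d > 0 {
			return d
		}
		return 0 // the date is already in the past
	}
	return def
}
```

A caller would skip requests to that URL until the returned duration has elapsed, and could record the status code in metrics at the same point.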

ps. Whatever we do, HTTP should be opt-in, with a big EXPERIMENTAL warning.


codecov bot commented Jan 13, 2025

Codecov Report

Attention: Patch coverage is 56.07401% with 546 lines in your changes missing coverage. Please review.

Project coverage is 60.27%. Comparing base (3aa3bee) to head (08419db).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| bitswap/network/httpnet/httpnet.go | 54.20% | 137 Missing and 10 partials ⚠️ |
| bitswap/network/router.go | 0.00% | 135 Missing ⚠️ |
| bitswap/network/httpnet/pinger.go | 14.65% | 98 Missing and 1 partial ⚠️ |
| bitswap/network/httpnet/msg_sender.go | 73.69% | 78 Missing and 13 partials ⚠️ |
| bitswap/network/httpnet/cooldown.go | 65.57% | 20 Missing and 1 partial ⚠️ |
| bitswap/network/bsnet/ipfs_impl.go | 28.57% | 20 Missing ⚠️ |
| bitswap/network/http_multiaddr.go | 78.37% | 11 Missing and 5 partials ⚠️ |
| bitswap/network/httpnet/metrics.go | 83.09% | 12 Missing ⚠️ |
| bitswap/network/httpnet/request_tracker.go | 95.31% | 2 Missing and 1 partial ⚠️ |
| bitswap/testnet/virtual.go | 80.00% | 2 Missing ⚠️ |

@@            Coverage Diff             @@
##             main     #747      +/-   ##
==========================================
- Coverage   60.48%   60.27%   -0.22%     
==========================================
  Files         244      252       +8     
  Lines       31101    32303    +1202     
==========================================
+ Hits        18811    19470     +659     
- Misses      10615    11123     +508     
- Partials     1675     1710      +35     
| Files with missing lines | Coverage Δ |
|---|---|
| bitswap/client/client.go | 86.61% <100.00%> (ø) |
| bitswap/client/internal/peermanager/peermanager.go | 91.79% <100.00%> (-0.07%) ⬇️ |
| bitswap/network/bsnet/options.go | 50.00% <ø> (ø) |
| bitswap/network/connecteventmanager.go | 88.54% <100.00%> (+2.29%) ⬆️ |
| bitswap/server/server.go | 55.37% <100.00%> (ø) |
| bitswap/testinstance/testinstance.go | 86.44% <ø> (ø) |
| bitswap/testnet/peernet.go | 38.46% <100.00%> (ø) |
| examples/bitswap-transfer/main.go | 41.21% <ø> (ø) |
| bitswap/testnet/virtual.go | 70.38% <80.00%> (+1.96%) ⬆️ |
| bitswap/network/httpnet/request_tracker.go | 95.31% <95.31%> (ø) |
... and 8 more

... and 11 files with indirect coverage changes

@hsanjuan hsanjuan force-pushed the http-retr2 branch 2 times, most recently from 5a73303 to c6a1b06 Compare January 16, 2025 17:54
@hsanjuan hsanjuan requested a review from a team January 16, 2025 17:54
Contributor Author

@hsanjuan hsanjuan left a comment

Self-review.

@@ -191,7 +191,7 @@ func New(parent context.Context, network bsnet.BitSwapNetwork, providerFinder Pr

 sim := bssim.New()
 bpm := bsbpm.New()
-pm := bspm.New(ctx, peerQueueFactory, network.Self())
+pm := bspm.New(ctx, peerQueueFactory)
Contributor Author

Turns out, peer ID is not used in the peermanager.

Comment on lines +1 to +14
package bsnet

import "github.com/ipfs/boxo/bitswap/network/bsnet/internal"

var (
	// ProtocolBitswapNoVers is equivalent to the legacy bitswap protocol
	ProtocolBitswapNoVers = internal.ProtocolBitswapNoVers
	// ProtocolBitswapOneZero is the prefix for the legacy bitswap protocol
	ProtocolBitswapOneZero = internal.ProtocolBitswapOneZero
	// ProtocolBitswapOneOne is the prefix for version 1.1.0
	ProtocolBitswapOneOne = internal.ProtocolBitswapOneOne
	// ProtocolBitswap is the current version of the bitswap protocol: 1.2.0
	ProtocolBitswap = internal.ProtocolBitswap
)
Contributor Author

Moved from network, which now is more an "exchange" network with interfaces and common utils.

@@ -1,4 +1,4 @@
-package network
+package bsnet
Contributor Author

Changes to this file and others in the module are cosmetic, renames...

@@ -20,7 +23,7 @@ const (
 stateUnresponsive
 )

-type connectEventManager struct {
+type ConnectEventManager struct {
Contributor Author

ConnectEventManager has been extracted from bsnet. It is now reused in httpnet and is therefore exported. Changes are otherwise cosmetic.

Contributor

@guillaumemichel guillaumemichel left a comment

To me, Bitswap is the name of the protocol (and client behaviour) and it can use either HTTP or libp2p to communicate with remote peers. But then HTTP servers don't exactly follow Bitswap spec since they don't comply with CANCEL messages (see comment).

My suggestion would be to name the folders network/libp2p and network/http.

case entry.WantType == pb.Message_Wantlist_Block:
method = "GET"
case entry.WantType == pb.Message_Wantlist_Have:
method = "HEAD"
Contributor

WANT_HAVE requests are part of Bitswap 1.2 (PR), so I think it is worth it to support HEAD requests. It is the newer bitswap that introduces WANT_HAVE, and HAVE messages.

If some endpoints consistently throw 500s on HEAD, we can stop sending these and focus on GET. We should look into the behaviour of a client running bitswap 1.2 and server running bitswap 1.0 or 1.1 and try to replicate.

WANT_HAVE messages represent a large share of all bitswap messages (see bitswap study), and play a key role in bitswap content routing (the bitswap spamming...).
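
To illustrate the idea of dropping HEAD for endpoints that consistently answer with 500s, a sketch could look like the following. The `wantType` constants stand in for the `pb.Message_Wantlist_*` values, and `endpoint`, `maxHeadFailures`, `methodFor` and `recordHead` are hypothetical names; the threshold semantics are an assumption.

```go
package main

// wantType mirrors the two bitswap want types discussed here. It is a
// stand-in for the protobuf want type, to keep the sketch dependency-free.
type wantType int

const (
	wantBlock wantType = iota
	wantHave
)

// endpoint tracks consecutive HEAD failures for one HTTP URL.
type endpoint struct {
	headFailures    int
	maxHeadFailures int // hypothetical threshold after which HEAD is disabled
}

// methodFor returns the HTTP method for a want, and false when the want
// should be skipped because HEAD has been disabled for this endpoint.
func (e *endpoint) methodFor(w wantType) (string, bool) {
	switch w {
	case wantBlock:
		return "GET", true
	case wantHave:
		if e.headFailures >= e.maxHeadFailures {
			return "", false
		}
		return "HEAD", true
	}
	return "", false
}

// recordHead updates the failure count from a HEAD response status.
// Only 5xx responses count as failures; any other status resets the count.
func (e *endpoint) recordHead(status int) {
	if status >= 500 {
		e.headFailures++
		return
	}
	e.headFailures = 0
}
```

With this shape, WANT_BLOCK entries keep working over GET even after HEAD has been switched off for an endpoint.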

Comment on lines +189 to +187
case entry.Cancel:
// log.Debugf("received cancel entry for %s: %s", u.url, entry.Cid)
sender.ht.requestTracker.cancelRequest(entry.Cid)
return nil // cont with next block
Contributor

Writing comment here to enable discussion:

CANCEL messages have been part of the Bitswap spec since version 1.0.0, and the client's behaviour depends on them. E.g. it will not ask the same peer twice about the same CID, since it expects that the peer will remember the interest.

The spec says that Bitswap clients SHOULD send CANCEL messages, so I guess it is okay if the client doesn't send CANCEL to an HTTP server (this makes no sense anyway). The spec doesn't mention that peers should record the wantlists of connected nodes. However, since there is only 1 bitswap server implementation (deployed, to my knowledge), it was easy to assume that all bitswap peers would record connected nodes' wantlists, and to rely on that when designing the client.

Since this PR allows the Bitswap client to talk Bitswap with HTTP servers, it means that not all Bitswap servers will be able to record connected nodes' wantlists anymore.

Maybe the Bitswap client behaviour needs to be adjusted to reflect that.
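
For context on what a local cancel can do when the remote cannot be told to forget a want, here is a minimal sketch of a tracker that aborts in-flight HTTP requests for a CID. It is not the PR's `requestTracker`: CIDs are plain strings, `requestContext` is a hypothetical method, and a real implementation would derive contexts from the caller's context and release entries when requests finish.

```go
package main

import (
	"context"
	"sync"
)

// requestTracker associates in-flight requests with the CID they fetch so
// that a later "cancel" wantlist entry can abort them locally.
type requestTracker struct {
	mu      sync.Mutex
	cancels map[string][]context.CancelFunc
}

func newRequestTracker() *requestTracker {
	return &requestTracker{cancels: make(map[string][]context.CancelFunc)}
}

// requestContext returns a context to attach to an HTTP request for cid.
// The context is cancelled when cancelRequest(cid) is called.
func (rt *requestTracker) requestContext(cid string) context.Context {
	ctx, cancel := context.WithCancel(context.Background())
	rt.mu.Lock()
	rt.cancels[cid] = append(rt.cancels[cid], cancel)
	rt.mu.Unlock()
	return ctx
}

// cancelRequest aborts every tracked request for cid and forgets them.
func (rt *requestTracker) cancelRequest(cid string) {
	rt.mu.Lock()
	cancels := rt.cancels[cid]
	delete(rt.cancels, cid)
	rt.mu.Unlock()
	for _, cancel := range cancels {
		cancel()
	}
}
```

The returned context would be passed to `http.NewRequestWithContext`, so cancelling it stops the transfer on the client side only; the HTTP server never learns about the wantlist.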

Contributor Author

Adjusted how though? We cannot get a callback from the http server when it obtains blocks from somewhere else that we wanted.

I think the client code compensates for this by essentially broadcast-retrying. It may be slower in some scenarios, though. That said, we are still running bitswap on the side and in parallel, so bitswap servers contacted for other reasons will still have this wantlist visibility, I guess?

Contributor

Maybe the client can send WANT_HAVE more aggressively to HTTP nodes compared with libp2p nodes?

Otherwise we don't compensate, and we document clearly where required that bitswap+libp2p will have a better performance for content routing (not necessarily content retrieval) compared with bitswap+http.

Contributor Author

We need to sync about this, as it affects the same-peer-id vs. different-peer-id for bitswap/http endpoints. Perhaps we need to notify the bitswap server about wantlists that are requested over http just so it knows about them. I'm not familiar with the bitswap server code though, so not sure what happens there.

}

// New returns a BitSwapNetwork supported by underlying IPFS host.
func New(pstore peerstore.Peerstore, bitswap BitSwapNetwork, http BitSwapNetwork) BitSwapNetwork {
Contributor

Maybe add flag to specify whether http or libp2p should be preferred by default.

Contributor

We need to benchmark, but I can imagine bitswap+libp2p being more efficient than bitswap+http for large wantlists, since each entry is a distinct http request, but the full wantlist fits in a single libp2p message.

Contributor Author

@hsanjuan hsanjuan Jan 24, 2025

Currently wantlists are sent in a single stream, but we wanted to change it into one stream per request (in pure bitswap). Realistically, this is very difficult to benchmark. A gateway with CDN caching will beat bitswap for sure. I don't know about a Kubo gateway; it probably depends a lot on the shape of requests (as you say, sending a wantlist of 1 is not the same as sending a wantlist of 200).

Contributor

I think bitswap+http will beat bitswap+libp2p at pure content retrieval (e.g. you already know the address of the provider thanks to the DHT or IPNI), but for content routing+retrieval (broadcasting WANT_HAVE) bitswap+libp2p is expected to find+fetch the content faster than bitswap+http.

Maybe http+bitswap could prioritize sending WANT_BLOCK over WANT_HAVE, since the client sends WANT_BLOCK only if it thinks the remote has a good probability of having the block, whereas it literally broadcasts the WANT_HAVE.
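
One way to picture that prioritization is to reorder the wantlist before it is turned into sequential HTTP requests. This is only a sketch under assumed types: `entry` and `blocksFirst` are hypothetical and do not correspond to the bitswap message types.

```go
package main

import "sort"

// entry is a simplified wantlist entry. In the real message the want type
// is a protobuf enum; a boolean is enough for this illustration.
type entry struct {
	cid      string
	wantHave bool // true for WANT_HAVE, false for WANT_BLOCK
}

// blocksFirst reorders a wantlist in place so WANT_BLOCK entries are sent
// before WANT_HAVE entries, keeping the original order within each group.
func blocksFirst(entries []entry) {
	sort.SliceStable(entries, func(i, j int) bool {
		return !entries[i].wantHave && entries[j].wantHave
	})
}
```

Because the sort is stable, any priority order already present among the WANT_BLOCK entries is preserved.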

case entry.WantType == pb.Message_Wantlist_Block:
method = "GET"
case entry.WantType == pb.Message_Wantlist_Have:
method = "HEAD"
Contributor

Since each request targets a specific CID, and requests are sent sequentially, it means that the http server will be spammed if the wantlist contains hundreds of entries (each corresponding to a WANT_HAVE message). The number of WANT_BLOCK is expected to be much smaller.

Maybe due to the sequential requests nature of the message sender, it is best not to send WANT_HAVE after all, even if it means that the bitswap+http doesn't do what it SHOULD from the spec.

@hsanjuan
Contributor Author

Note about the two scenarios regarding peerIDs:

  • A: The HTTP endpoint is a separate, unique PeerID, different from bitswap: In this model (web3storage), the bitswap endpoint counts as a different peer altogether, and therefore the bitswap endpoint gets a separate p2p connection, wantlist etc. In this case, even if both endpoints belong to the same provider, they will both get WANT-discovery requests and they are considered fully different providers throughout the system.

  • B: The HTTP endpoint is the same peerID as bitswap. Assuming the provider records contain bitswap and HTTP entries under the same peer ID (something that Kubo could do for example), the network router will prioritize HTTP for all operations. Other peers can speak bitswap to us, and our bitswap server can send responses with SendMessage(), but we will default to HTTP endpoints for wantlists, latency, pings, disconnects... This means we will not be using the Bitswap client from our side, we will not be sending bitswap traffic to other peers. Our server will still be working and responding to bitswap requests. Other-peers' bitswap servers, however, will not learn about our wantlists via bitswap if they offer an HTTP endpoint and we don't attempt to establish a libp2p connection at all. In an ideal scenario where all Kubo nodes in the network offer an HTTP endpoint, we can imagine no bitswap traffic at all.

Resolving the issues in A implies:

  • Identify that two peer IDs correspond to the same provider.
  • Prioritize the HTTP-peerID
  • Failover to the bitswap peerID when HTTP fails (?)
  • This needs to be done at the Routing layer possibly, but it is difficult since there is no indication that two providers are the same, other than perhaps having matching DNS.
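
The "matching DNS" heuristic mentioned in the last point could be sketched as below. This is purely illustrative: nothing in the PR does this, addresses are plain strings rather than parsed multiaddrs, and `dnsNames` and `sameProvider` are hypothetical helpers. A shared DNS name is only a hint, not proof, that two peer IDs belong to one provider.

```go
package main

import "strings"

// dnsNames returns the host names found in /dns, /dns4, /dns6 and
// /dnsaddr components of textual multiaddrs.
func dnsNames(addrs []string) map[string]bool {
	names := make(map[string]bool)
	for _, a := range addrs {
		parts := strings.Split(a, "/")
		for i := 0; i+1 < len(parts); i++ {
			switch parts[i] {
			case "dns", "dns4", "dns6", "dnsaddr":
				names[strings.ToLower(parts[i+1])] = true
			}
		}
	}
	return names
}

// sameProvider reports whether two peers advertise at least one common
// DNS name in their addresses.
func sameProvider(a, b []string) bool {
	an := dnsNames(a)
	for n := range dnsNames(b) {
		if an[n] {
			return true
		}
	}
	return false
}
```

Peers that only advertise IP addresses would never match under this heuristic, which is one reason it is hard to rely on at the routing layer.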

Resolving the issues in B implies:

  • Ensure that the bitswap server is initialized with a bitswap network, for safety: there is no reason for the bitswap server to use the network-router. DisconnectFrom() should close p2p streams, rather than wild-guessing whether we should do HTTP cleanups. (Currently DisconnectFrom() is only called from the Server, but still).
  • We could add logic to fail-over to bitswap when HTTP fails. It is easy for the initial Connect(). It is trickier when HTTP worked for some content records and errored badly for others. We need to be careful to store bitswap addresses in the peerstore, and leave them when deleting http addresses in the case of errors. There are cases as well when Connect() works but retrieval fails. How do we know that on the next Connect() we should not be attempting HTTP ? The connect/disconnect logic, including the results from the message sender needs to be fine-tuned.
  • We cannot Connect() over both bitswap and HTTP at the same time: this can trigger competing connection-manager events, because everything else only cares about the peerID. So a bitswap failure would stop HTTP-wantlists for that peer.

This and subsequent commits introduce an httpnet module at what is known as
the "bitswap network layer". The bitswap network layer connects bitswap-peers,
sends bitswap messages and receives responses.

Bitswap messages are basically a wantlist, a list of CIDs that should be sent
if available.

httpnet does the same, except instead of sending the bitswap message over
bitswap, it triggers http requests for the requested blocks. httpnet is a
drop-in addon so that we can request blocks over http, and not only via bitswap.

As httpnet is a network, it benefits from all existing wantlist management
logic. Any http/2 endpoint should benefit from streamlined requests on a
single http connection. A router-network ensures that messages are correctly
handled by bitswap or by http requests depending on what the peers are
advertising. HTTP requests are given priority in the presence of both.

Here are some of the httpnet features:

* Peers are marked as Connected when they are able to handle http requests.
* Peers are marked as Disconnected when http requests fail repeatedly (MaxRetries).
* Server errors trigger backoffs that prevent further requests to the same
  url for a period (Retry-After header or configuration value)
* We support several urls per peer, meaning a peer can provide alternative
  http endpoints which are tried based on number of failures or existing cooldowns.
* We translate HAVE requests to HTTP-HEAD requests and BLOCK requests to HTTP-GETs
* We support cancellations: ongoing or soon to happen requests for a CID
  can be cancelled using a "cancel" entry in the wantlist.
* We record latency information for peers by pinging regularly.
* We discriminate between different errors so that we know whether to
  move to the next block in a wantlist, or to retry with a different url,
  or to completely abort.
* Options to configure user-agent, max retries etc. are supported.
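
As a rough picture of the error discrimination listed above, the sketch below maps a response status to one of the outcomes mentioned (move to the next block, retry with a different url, abort). The specific status-to-action mapping is an assumption made for illustration, as are the names `action` and `classify`; the commit message does not spell out which statuses httpnet treats in which way.

```go
package main

// action is what the sender does after one HTTP response.
type action int

const (
	actionDone          action = iota // response usable, continue with next entry
	actionNextBlock                   // block not available here, continue with next entry
	actionRetryOtherURL               // endpoint is struggling, try another url of the peer
	actionAbort                       // give up on this peer for now
)

// classify picks an action from the response status and the number of
// failures already recorded for the peer. The mapping below is assumed.
func classify(status int, failures, maxRetries int) action {
	switch {
	case status >= 200 && status < 300:
		return actionDone
	case status == 404 || status == 410:
		return actionNextBlock
	case failures >= maxRetries:
		return actionAbort
	case status == 429 || status >= 500:
		return actionRetryOtherURL
	default:
		return actionAbort
	}
}
```

An abort here would correspond to marking the peer as Disconnected once MaxRetries is reached.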

Do not consider a connection attempt that returns 500 successful.

Avoid goroutine leak when using SendMsg() many times with the same
MessageSender.
…ailable

Error counts for these statuses are tracked separately. Every 3 errors, the
serverError count increases.
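
The counting rule in this commit message can be sketched as follows. Only the "every 3 errors" ratio and the serverError name come from the message; the `errorCounter` type, its fields and its methods are hypothetical, and which statuses count as the separately tracked errors is left open here because the commit title is truncated.

```go
package main

// softErrorsPerServerError is the ratio stated in the commit message.
const softErrorsPerServerError = 3

// errorCounter tracks two kinds of failures for one endpoint.
type errorCounter struct {
	softErrors   int // errors for the separately tracked statuses
	serverErrors int // errors that count towards backoff and disconnects
}

// recordSoftError counts one separately tracked error and promotes every
// third one to a server error.
func (c *errorCounter) recordSoftError() {
	c.softErrors++
	if c.softErrors%softErrorsPerServerError == 0 {
		c.serverErrors++
	}
}

// recordServerError counts a regular server error directly.
func (c *errorCounter) recordServerError() {
	c.serverErrors++
}
```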
3 participants