
sctp: allow partial reads #304

Closed
wants to merge 6 commits

Conversation

@melekes (Contributor) commented Sep 30, 2022

This PR modifies the read behaviour to permit partial reads: read no longer returns ErrShortBuffer if the buffer is too small to fit a whole message. Instead, the buffer is filled to capacity and the rest of the message is returned on the next call to read.

Closes #273

removes `ErrShortBuffer`

Refs #273
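The intended semantics can be illustrated with a self-contained, std-only sketch (all names here are hypothetical; the real logic lives in sctp/src/queue/reassembly_queue.rs):

```rust
// Toy model of the proposed partial-read semantics.
struct PendingMessage {
    data: Vec<u8>,
    offset: usize,
}

impl PendingMessage {
    fn new(data: Vec<u8>) -> Self {
        Self { data, offset: 0 }
    }

    // Old behaviour: error out if `buf` cannot hold the whole message.
    // New behaviour (this PR): copy what fits, keep the rest for the
    // next call, and report how many bytes were written.
    fn read(&mut self, buf: &mut [u8]) -> usize {
        let remaining = &self.data[self.offset..];
        let n = remaining.len().min(buf.len());
        buf[..n].copy_from_slice(&remaining[..n]);
        self.offset += n;
        n
    }

    fn is_done(&self) -> bool {
        self.offset == self.data.len()
    }
}

fn main() {
    let mut msg = PendingMessage::new(vec![1, 2, 3, 4, 5]);
    let mut buf = [0u8; 3];
    // First read fills the small buffer instead of failing with
    // ErrShortBuffer.
    let n1 = msg.read(&mut buf);
    assert_eq!(n1, 3);
    assert_eq!(&buf[..n1], &[1, 2, 3]);
    // Second read returns the remainder of the same message.
    let n2 = msg.read(&mut buf);
    assert_eq!(n2, 2);
    assert_eq!(&buf[..n2], &[4, 5]);
    assert!(msg.is_done());
    println!("partial reads ok");
}
```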
codecov bot commented Sep 30, 2022

Codecov Report

Base: 56.52% // Head: 56.58% // Merging this PR increases project coverage by +0.06% 🎉

Coverage data is based on head (30a8275) compared to base (c9409ba).
Patch coverage: 53.33% of modified lines in pull request are covered.

Additional details and impacted files
```text
@@            Coverage Diff             @@
##           master     #304      +/-   ##
==========================================
+ Coverage   56.52%   56.58%   +0.06%
==========================================
  Files         500      500
  Lines       47434    47534     +100
  Branches    12840    12859      +19
==========================================
+ Hits        26811    26898      +87
- Misses       9948     9956       +8
- Partials    10675    10680       +5
```
| Impacted Files | Coverage Δ |
| --- | --- |
| sctp/src/association/association_test.rs | 46.98% <0.00%> (-0.05%) ⬇️ |
| sctp/src/error.rs | 41.66% <ø> (ø) |
| sctp/src/queue/queue_test.rs | 59.60% <9.09%> (-0.49%) ⬇️ |
| sctp/src/queue/reassembly_queue.rs | 86.91% <76.66%> (-1.30%) ⬇️ |
| media/src/lib.rs | 37.93% <0.00%> (-4.38%) ⬇️ |
| .../src/io/sample_builder/sample_sequence_location.rs | 97.29% <0.00%> (-2.71%) ⬇️ |
| ice/src/agent/agent_gather_test.rs | 55.83% <0.00%> (-0.32%) ⬇️ |
| ...tc/src/peer_connection/peer_connection_internal.rs | 54.06% <0.00%> (-0.05%) ⬇️ |
| util/src/vnet/conn/conn_test.rs | 52.94% <0.00%> (ø) |

... and 7 more


@melekes melekes self-assigned this Oct 3, 2022
@melekes melekes marked this pull request as ready for review October 3, 2022 13:08
@melekes (Contributor, Author) commented Oct 4, 2022

✅ tested with webrtc libp2p transport (all tests are passing)

@k0nserv (Member) commented Oct 5, 2022

@melekes did you review the callsites where data channels use SCTP?

@melekes (Contributor, Author) commented Oct 5, 2022

@melekes did you review the callsites where data channels use SCTP?

Yes

@KallDrexx commented

Maybe I am missing something, but this seems like a silently breaking change?

read_data_channel() calls read_sctp(), and this change seems to imply that we may now get a partial read. There's no indication of whether each read_sctp() call returns a partial or a full message.

From what I can tell, WebRTC data channels are meant to be message based, not stream based, so it seems like a single logical message will now trigger multiple DataChannelMessage events to propagate, with the data split between them.

Am I missing something?

@melekes (Contributor, Author) commented Oct 6, 2022

Maybe I am missing something, but this seems like a silently breaking change?

It is a breaking change (if the given buffer is not large enough to fit the message).

However, the buffer size is capped at u16::MAX, which is also the maximum message size for Chrome.

```rust
/// message size limit for Chromium
const DATA_CHANNEL_BUFFER_SIZE: u16 = u16::MAX;
```

So if you're using the default (i.e. read_loop), then you should be fine.

If you're calling data_channel.read with a smaller buffer, then the behavior is indeed different now.

and so it seems like this like a single logical message will now trigger multiple DataChannelMessage events to propagate with the data split between them.

It won't (because of u16::MAX).
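To make the cap argument concrete, here is a std-only sketch (the helper read_whole is hypothetical): with a buffer of DATA_CHANNEL_BUFFER_SIZE bytes, no Chrome-sized message can overflow it, so the partial-read path is never taken.

```rust
// Same value as the data channel crate's constant quoted above.
const DATA_CHANNEL_BUFFER_SIZE: u16 = u16::MAX;

// Hypothetical helper: deliver the whole message, or report that
// it would not fit (i.e. that a partial read would occur).
fn read_whole(msg: &[u8], buf: &mut [u8]) -> Option<usize> {
    if msg.len() > buf.len() {
        return None; // would split into partial reads
    }
    buf[..msg.len()].copy_from_slice(msg);
    Some(msg.len())
}

fn main() {
    let mut buf = vec![0u8; DATA_CHANNEL_BUFFER_SIZE as usize];
    // Any message under the Chromium limit fits in one read.
    let msg = vec![7u8; 60_000];
    assert_eq!(read_whole(&msg, &mut buf), Some(60_000));
    // A smaller caller-supplied buffer is where behaviour changes.
    let mut small = vec![0u8; 1024];
    assert_eq!(read_whole(&msg, &mut small), None);
}
```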

From what I can tell, WebRTC data channels are meant to be message based, not stream based,

Does this mean we can't make this change?

@melekes (Contributor, Author) commented Oct 6, 2022

After giving it a bit of thought (and reading https://lgrahl.de/articles/demystifying-webrtc-dc-size-limit.html), I think from the Rust perspective it would still be correct to fill the buffer on read (because other Readers do this and that's what every developer expects), BUT we should also somehow indicate that the message is not fully read (maybe by returning a boolean flag alongside the usize). What do you think @k0nserv @KallDrexx?
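A sketch of what such a flag could look like (hypothetical signature; a (usize, bool) tuple rather than a Result, since "message incomplete" is not an error):

```rust
// Hypothetical read signature: fill `buf` from the pending message
// and report (bytes read, message fully consumed?).
fn read(pending: &mut Vec<u8>, buf: &mut [u8]) -> (usize, bool) {
    let n = pending.len().min(buf.len());
    buf[..n].copy_from_slice(&pending[..n]);
    pending.drain(..n);
    (n, pending.is_empty())
}

fn main() {
    let mut pending = vec![1u8, 2, 3, 4];
    let mut buf = [0u8; 3];
    // Buffer too small: partial read, flag says "more to come".
    assert_eq!(read(&mut pending, &mut buf), (3, false));
    // Next call drains the remainder and signals completion.
    assert_eq!(read(&mut pending, &mut buf), (1, true));
}
```

The caller can then decide whether to surface the partial data immediately or keep reading until the flag flips to true.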

@k0nserv (Member) commented Oct 6, 2022

It feels a bit dubious to mess with the message-based nature of Data Channels, as it would be a departure from the specification. Is there a different way to work around the problem you have @melekes?

If you control both sides, the message size can be known and buffers can always be appropriately sized.

@melekes (Contributor, Author) commented Oct 6, 2022

message size can be known

not always!

As it would be a departure from the specification.

can you point me to the exact place?

Is there a different way to work around the problem you have @melekes?

Yes: create a very big buffer, which is not ideal, to say the least.

@melekes (Contributor, Author) commented Oct 6, 2022

I see two problems here:

  • if you don't know the size of the biggest message, you will try to read it and fail (ErrShortBuffer). The message will be lost unless resubmitted by the remote.
  • if you have two messages (1MB and 200MB in size), you are forced to always create a buffer of size 200MB, right?

With this PR, you'll never face either problem.

@KallDrexx commented Oct 6, 2022

With this PR, you'll never face both problems.

I could be mistaken, but while this PR as written fixes those two problems, it creates new ones.

Specifically, we send protobuf messages across data channels from the browser. We had an issue where the message we were trying to send ended up being significantly larger in some circumstances, which caused the ErrShortBuffer error to propagate, which in turn made our code tear down the connection and raise an error. This was good for us, because we decided that the large data messages were a design flaw in our protocol and worked around it so that we didn't need such large messages (yes, I understand this is not always feasible without manual splitting).

If your PR had been in at the time, unless I'm missing something, it would have been significantly harder to figure out what was going wrong, because partial reads would have triggered multiple data channel messages, which means all code that expects a single data channel message to be a single self-contained, deserializable message would have failed. We would have gotten no errors or warnings, just deserialization failures under random circumstances.

Maybe I'm wrong and I'm just missing the code which would assemble partial reads into a single DataChannelMessage buffer though.

For what it's worth, the data channel spec itself, in section 6.6, recommends a message size limit of 16 KB. Specifically:

As long as message interleaving is not supported, the sender SHOULD limit the maximum message size to 16 KB to avoid monopolization.

@KallDrexx commented Oct 6, 2022

if you have two messages (1MB and 200MB in size), you are forced to always create a buffer of size 200MB, right?

Maybe this is better solved by utilizing the bytes crate better. Specifically,

  1. Make the buffer a BytesMut.
  2. Set its size to whatever size we want each individual read to be.
  3. Read into the BytesMut. If we read to full capacity and there's more, extend the buffer and do another read.
  4. If we've read less than the full capacity of the individual read buffer, consider the message complete, split and freeze the buffer, return that as the message, and retain the BytesMut for the next read attempt.

Specifically, this gets around the issue because, while it may allocate a 200MB buffer to store the one large message, the buffer should be efficiently re-used thanks to the bytes crate's innate ability to re-use no-longer-used sections of the buffer.
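The four steps above can be sketched with std types only (Vec<u8> stands in for bytes::BytesMut, read_some simulates the transport, and the "short read means complete" rule is this sketch's assumption):

```rust
// Chunk size for each individual read (step 2).
const READ_CHUNK: usize = 4;

// Simulated transport: hands out up to `buf.len()` bytes per call.
fn read_some(src: &mut Vec<u8>, buf: &mut [u8]) -> usize {
    let n = src.len().min(buf.len());
    buf[..n].copy_from_slice(&src[..n]);
    src.drain(..n);
    n
}

// Grow-and-retry loop (steps 1, 3, 4).
fn read_message(src: &mut Vec<u8>) -> Vec<u8> {
    let mut message = Vec::new();
    loop {
        // Extend the buffer by one chunk and read into the new tail.
        let start = message.len();
        message.resize(start + READ_CHUNK, 0);
        let n = read_some(src, &mut message[start..]);
        message.truncate(start + n);
        // Short read => message complete, under this sketch's rule.
        if n < READ_CHUNK {
            return message;
        }
    }
}

fn main() {
    let mut src = vec![9u8; 10];
    let msg = read_message(&mut src);
    assert_eq!(msg.len(), 10);
}
```

Note one wrinkle with the "short read means complete" heuristic: a message whose length is an exact multiple of the chunk size needs one extra zero-length read before it is declared complete.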

@melekes melekes closed this Oct 7, 2022
@melekes (Contributor, Author) commented Oct 7, 2022

Maybe I'm wrong and I'm just missing the code which would assemble partial reads into a single DataChannelMessage buffer though.

What if we do this? Would it be strictly better than throwing an error if the msg is too big?

@melekes (Contributor, Author) commented Oct 7, 2022

Read into the BytesMut. If we read to the full capacity and there's more, extend the buffer and do another read

Huh? In the current implementation you'll just get ErrShortBuffer if you "read to the full capacity and there's more".

@KallDrexx commented

What if do this? Would it be strictly better than throwing an error if msg is too big?

100% yes. As long as a single DataChannelMessage is always guaranteed to be the complete message that was sent I think we are in the clear.

read_sctp() should probably always return a complete message as well, based on my very quick look at the SCTP specification, because I think anyone who uses the SCTP crate without webrtc-rs would probably expect that to be the case (but I don't know enough about SCTP to say for sure).

That seems to leave two options that I can see (though there are probably others).

The first option is to have read_sctp() and the reassembly queue take in a &mut BytesMut. If the buffer is too small, the reassembly queue just resizes the buffer to accommodate the chunks. When read_sctp() returns a non-error value in data_channel::read_loop(), the code calls buffer.split() to siphon off the parts of the buffer that were written to, returns that as the data channel message, and now has an empty BytesMut that can be passed in for the next SCTP read.

The downside of this that I see is that some devs may not like the unpredictability of allocations.

The second option I see is that the reassembly queue and read_sctp() don't return a usize but instead return:

```rust
enum SctpReadStatus {
    Complete(usize),
    Incomplete(usize),
}
```

If the buffer is too short, the partial read is still placed in the buffer, but SctpReadStatus::Incomplete is returned, signalling that there is more to read.

In data_channel::read_loop() you now have two buffers:

```rust
let mut immediate_buffer = vec![0u8; DATA_CHANNEL_BUFFER_SIZE as usize];
let mut message_buffer = BytesMut::new();
```

When read_data_channel() returns a complete or incomplete message, it copies the data from the immediate_buffer to the message_buffer. If the message was complete, it returns DataChannelMessage { data: message_buffer.split() }. This leaves the message buffer ready for the next message and the immediate buffer ready for the next read.

If an incomplete message was read, instead of returning the data channel message we just continue the loop until we receive a Complete status.

At a quick look, this is actually more efficient than the code is now. Right now in data_channel::read_loop() we are doing Bytes::from(buffer[..n].to_vec()), which means we allocate a new vector for every data channel message just to get a Bytes. However, that's the whole purpose of BytesMut: it allows those byte allocations to be re-used unless growth is really needed (slow allocations can impact tail latencies).
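The two-buffer loop described above could be sketched like this (std-only; Vec<u8> stands in for BytesMut, read_sctp is simulated, and all names are hypothetical):

```rust
enum SctpReadStatus {
    Complete(usize),
    Incomplete(usize),
}

// Stand-in for the fixed immediate buffer's size.
const BUF: usize = 4;

// Simulated read_sctp: fills `buf`, reports whether the message
// is now fully delivered.
fn read_sctp(src: &mut Vec<u8>, buf: &mut [u8]) -> SctpReadStatus {
    let n = src.len().min(buf.len());
    buf[..n].copy_from_slice(&src[..n]);
    src.drain(..n);
    if src.is_empty() {
        SctpReadStatus::Complete(n)
    } else {
        SctpReadStatus::Incomplete(n)
    }
}

// Two-buffer loop: a fixed immediate buffer plus a growable
// message buffer; only a Complete status yields a message.
fn read_loop(src: &mut Vec<u8>) -> Vec<u8> {
    let mut immediate = [0u8; BUF];
    let mut message = Vec::new();
    loop {
        match read_sctp(src, &mut immediate) {
            SctpReadStatus::Complete(n) => {
                message.extend_from_slice(&immediate[..n]);
                return message; // one whole DataChannelMessage
            }
            SctpReadStatus::Incomplete(n) => {
                message.extend_from_slice(&immediate[..n]);
                // keep looping until the message completes
            }
        }
    }
}

fn main() {
    let mut src = (0u8..10).collect::<Vec<_>>();
    let msg = read_loop(&mut src);
    assert_eq!(msg, (0u8..10).collect::<Vec<_>>());
}
```

This preserves the message-based contract at the data channel layer while still allowing the SCTP layer to hand out partial reads internally.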

@melekes melekes reopened this Oct 7, 2022
@melekes melekes closed this Oct 10, 2022
Successfully merging this pull request may close these issues.

[SCTP] Stream::read: partial read OR returning the expected size of the buffer
3 participants