-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
problem with .text() drain behavior and .bodyUsed #37
Comments
At there I replied that it was discussed at w3c/ServiceWorker#452. Do you want to discuss it again? From which point? |
I'm sorry, but that thread is so long I'm having trouble determining what the outcome was. My immediate concern is from this pull request: The web-platform-test was previously doing:
When I went to fix this I was told the fetch-with-streams spec was changing to this behavior: https://critic.hoppipolla.co.uk/showcomment?chain=11816 I strongly object to this because its a complete reversal of the behavior currently shipping in fetch in both chrome and firefox. Also, it just seems completely broken from a webdev point of view. I just drained the Response but its body isn't used? I think whatever fetch-with-streams does, there should be no observable difference to code using the existing .text()/.json()/etc in current fetch. |
Cc-ing discussion participants: It is true that fetch-with-streams draft changes the behavior.
So your opinion is that |
It makes the API consistent. Also, how can we know upfront that a stream is non-empty? |
It is possible to state that |
I'm not sure what that means. |
|
Additionally, it is inconsistent with Request construction.
|
Fair, in the case of a null body (rather than empty body) we should maybe make |
(Sorry for my slow response. I'm on PTO.) Happy to see .text() on null body always succeed and not set .bodyUsed. That's a special case that we already exempt in the constructor (as you noted).
How can .text() be considered only reading part of the data? By definition .text() reads the entire body from the Request/Response. Changing it to mean anything else is not backward compatible and will break existing code. I personally prefer setting .bodyUsed as soon as the first byte is read via .body.getReader() as well, but I'm willing to bend on that one since you guys don't agree. I think its creating a footgun that will result in partial values stored in Cache API, etc, but I won't argue it further.
In general I can't see any reason why adding a stream API requires breaking backward compat on the existing API. And I have to say I'm incredibly nervous now that you are shipping this API on a release channel in a number of weeks. This breaking change was not mentioned in your "compatibility risks" in your intent to ship: Are there other non-backward compatible changes to the Fetch API in the next release? Based on my review in #25 I think the answer is "no", but it would be nice to have confirmation. Anyway, sorry to be all worked up over this. |
Also, I just want to say I'm optimistic we can sort this out. I'm sorry if my last comment came across a bit negative. I know its hard to do this kind of spec work and I really appreciate you guys taking it on. |
May be, or may not be. Without the streaming API, there is no way to read data partially, so calling
I think you are fine with #37 (comment), right? |
I would like to set When we adopt the idea that |
Let's avoid confusion around "set .bodyUsed". E.g. include "permanently" if it means making .bodyUsed return true forever, and include "while" if it means making it return true only while some condition is met (for example, force .bodyUsed return true while the stream is locked). |
@tyoshino Thanks, here I rephrase #37 (comment). I would like to set .bodyUsed at least temporarily when locked, respecting Anne's idea. When we adopt the idea that .bodyUsed is set permanently when non-empty data is read and set temporarily when locked, It looks reasonable to me that reading from an empty response doesn't set .bodyUsed. |
@wanderview Regarding
@domenic said that a stream that has been partially (not fully) consumed using the Streams API should still be able to be text()-ed in w3c/ServiceWorker#452 (comment). You should convince Domenic. |
IIRC, we were trying to re-define text(), json(), etc. using the ReadableStream when we were introducing the ReadableStream to Request/Response, while we treated the other operations such as cache.put() as special and didn't try to re-define them using the ReadableStream. We paid attention to only the state of the ReadableStream to define reasonable precondition to text(), json(), etc., and reasoned that a Request/Response with an empty stream (was originally non-empty but made to be empty by text(), json(), etc.) should be able to accept text(), json(), etc. as well as we want to allow text(), json(), etc. on a Request/Response constructed and initialized to be empty. See w3c/ServiceWorker#452 (comment) |
"Remaining data" is fine with me. It still implies EOF and bodyUsed should be true after .text() resolves, though.
Its unclear to me if pipe2 is correct. The body is not technically null there. Its an empty stream that should hit EOF immediately. In that case it seems to me that bodyUsed should be true.
I think bodyUsed should never revert from true back to false. IMO it should mean EOF.
Yea. I gave up on that one. |
I will reply to other points later, but I would point a couple of things:
This contradicts with the idea I mentioned at #37 (comment).
@wanderview , Let me confirm again: You prefer A possible fix is to define Regarding differentiating empty and null body, it's troublesome because streams doesn't have such distinction.
|
Ok, I see what you mean here. I think this shows the awkwardness of trying to equate the stream lock to the bodyUsed attribute, though. I find it terribly confusing to use an attribute that says "the body has been drained and is used" to mean "the body is temporarily unavailable". Can we make bodyUsed mean stream EOF and just make .text()/etc reject if the body is locked? It would be nice if the ReadableStream had a .locked attribute to expose this state without having to use a try/catch. Then code could check response.body.locked to detect this temporary unavailable condition. |
Oh, I missed this sentence. Yes, I think that would make sense. I think it would be nicer to make it body.locked as an attribute on ReadableStream, though.
I guess we can have an internal flag for null body. If there is no body at all provided in the Constructor then its set to true. If a stream is provided, but hits EOF without reading any actual data then its set to true. Then bodyUsed becomes Is there a clean way for Request/Response to detect if any data is actually consumed from the stream? |
This seems fine to me; not a problem to add.
I think it would be better to get rid of this distinction between null and empty body ... I do not understand what it adds. |
I think we need some kind of internal concept for it to prevent things like Cache trying to get the stream when its already EOF'd with no-data-read. Cache and other APIS need something to check to short-circuit on. |
OK, let's provide
From Streams point of view, one can do nothing with a CLOSED stream. It cannot be locked, non-empty data cannot be read from it and it cannot be errored. Hence I cannot find any point at which we can set As a fetch-streams user, I still don't understand why null vs. empty distinction is important in this use-case (I'm not saying that the distinction is generally useless). |
I would really prefer
I'm not asking to set bodyUsed to true at (F). I'm saying the Cache API needs some way to avoid an error when doing |
Why?
If it is locked it will error (and that seems like a good thing). If it is closed it will just get { done: true, value: undefined }. |
I was thinking multiple reads on closed stream would reject. I guess it will just work if:
So you're proposing just having the Request and Response constructors create an empty body stream in the case of no body being provided? |
This happens automatically. As @yutakahirano says once a stream has been read-to-end (aka closed) it is automatically unlocked.
I think so. I'm still a little confused at what the goal of bodyUsed is these days but mainly just trying to chime in to make sure the model stays coherent. This null vs. empty body things feels incoherent. |
Well, go look at how "body used" is referenced in these specs: https://slightlyoff.github.io/ServiceWorker/spec/service_worker/index.html Its essentially a safety-belt to prevent script from storing a drained Response in Cache or uploading a drained Request to a fetch PUT, etc. The previously proposed changes of unsetting bodyUsed when the stream lock was dropped broke all that. It would have also broke client code checking bodyUsed. Maintaining the definition of bodyUsed as "body EOF and body was not an empty stream" keeps all that working. If the Request/Response can detect the "body was an empty stream" condition and set bodyUsed appropriately, then I'm ok with always setting a body stream. Using null was just a convenient way of doing that previously. |
See discussion at yutakahirano/fetch-with-streams#37 (comment).
That works for me, but I recall there were objections when I asked for "immediately used" in the past. |
@yutakahirano wow, OK, that is a very interesting problem. I guess I have to ask: what if we just never set Content-Length in fetch? (I.e., always used chunked encoding.) I have heard Content-Length is untrustworthy and stuff. But, I anticipate this not being what people want. I just had to ask.
See #43 (comment). I really didn't consider Content-Length when designing streams. But in retrospect it's obviously important. You can imagine similar situations where you're dealing with a (total length, stream) pair. In those cases the offset into the stream is definitely important information. (This is also another I/O streams vs. abstract async sequences thing IMO.) bytesReadSoFar is starting to sound reasonable. (Or use the size() function to make it something that works for all readable streams.) I realize this is a complete 180, but this use case really surprised me. If people have other convincing arguments or scenarios where bytesReadSoFar is important---preferably not for bodyUsed-style warnings, but instead of stuff like Content-Length---it would definitely help tip me over the edge. |
My thinking was that if you upload using a stream, you'd always get chunked. Fetch needs a bunch of tweaks for that, but should be doable. |
Sure, but the point of #37 (comment) is that you're always "uploading using a stream", even if you pass in e.g. a string as the request body. |
Yeah, so my suggestion for that case is that you can't do that. If you're using the As to your question about |
Did short analysis on What kind of
|
Will the |
Anne, #37 (comment)
Does "that" here mean reading some part of the fixed string from |
Yes. Basically, once you use a request/response's stream for A, you can no longer use it for B. |
I guess that is OK. At least it is simple. We are basically saying "don't touch
Sorry, I simply meant using the queuing strategy size() method to calculate "sizeSoFar" or something, instead of making it byte-specific. So every time something is read (dequeued) we not only change the value of
I did not anticipate it being resettable. I am not sure how that impacts (3) of The motivation for this sizeSoFar is basically the idea that streams with a total length are probably a common pattern. Maybe HTTP is a special case though---e.g. file streams often do not have total sizes, since you just open the stream instead of stat'ing first. People seem less excited about this idea than I initially was. |
Added a new item (7) based on the concern yutakahirano has been discussing to #37 (comment) |
Yet another alternative "ReadableByteStream observing API" posted at whatwg/streams#362 |
I see, sorry!
OK. So, we store the Content-Length separate from the stream. At which point should we guarantee sizeReadSoFar gets updated? Valid while not locked? |
Another thought I had about marking "used at end of stream" instead of "used on first byte read": In many cases a Response is effectively read-only. It does not allow modifications. However, letting code read a portion of the stream and then storing it in Cache or passing it on effectively is a modification. Its a deletion of data from the body. Is that ok in these explicitly read-only cases? @annevk? |
It's not clear to me what that buys us. Seems more confusing. |
Wrote a sketch of Domenic's proposal of |
Please take a look at analysis of requirements for the |
I'm fine with that. |
I'm not convinced this can be used for |
This is not necessarily about developer-controlled streams. This is about e.g. getting a Request from the cache, reading a few bytes from it, then trying to upload (which I believe is supposed to be disallowed). |
Right. I think we should disallow that. I think @annevk is suggesting we should set the used flag immediately on getReader(). It sounds like you still want to set used based on offset somehow. Is that correct? |
@annevk Wrote a concrete algorithm at whatwg/streams#367 (comment). This doesn't change anything for the problem how to generate the Content-Length header for developer-controlled streams, but addresses the "drained" detection as Domenic just said. |
Another point: We need to track
#43 has no problem, because it sets used flag when locked. |
We will use |
Finally, this came to an agreement and fix has been made on the spec! |
Over here I raised a concern that the
.bodyUsed
flag was not getting set properly for.text()
.#25 (comment)
Can someone explain how .text() can not set the
.bodyUsed
without requiring the browser to cache the entire contents of the stream in case someone calls it again?And if the answer is "the code will hit the locked reader", it seems we still can't just break the meaning of bodyUsed on a shipped spec.
The text was updated successfully, but these errors were encountered: