Streaming parsing of JSON array in Spring WebClient #24951
Comments
Here's a very quick-and-dirty BodyExtractor implementation: https://gist.github.com/HaloFour/ce3063d4e693b495e3c194cbb2f66686 The actual token parsing could certainly be cleaned up, but it gets the job done, at least to the extent that the existing integration tests in the project pass. |
Also, not to pile up additional requests in a single issue, but I didn't see a way to use a |
@HaloFour thanks for the proposal. This looks feasible and probably worth doing, but mainly I'm wondering what a more general solution looks like and how much more general it needs to be. Take, for example, the case of multiple arrays, as in #21862. We could accept multiple JSON pointers, but it's less obvious how to represent the output, which logically is |
Thanks for this, @HaloFour! Looks like something I was looking for (hence #25472). I'll give your Gist a try. @rstoyanchev (just reiterating from #25472) I think it makes sense to focus on the most common case of a single array of a single type of object in the JSON response. The semantics of anything else, as you explain, become very hairy very quickly, and the applicability seems low for most real-world scenarios (imho). |
Thanks for taking a look! Here's a newer Gist based on the code that we're currently using in production. |
Yes, it makes sense to do something that would solve many cases. That said, other possible cases are not hard to imagine. Take #21862, for example, or even Elasticsearch: isn't it sometimes necessary to access something besides the hits, like "search_after"? |
Going back to the original question: with the new API, exactly how do we extract the |
For the original use case of JSON-pointing to an array in order to stream-parse it, I think it would be better to delegate that responsibility to Jackson and probably just offer a lightweight Unfortunately, even though in Jackson-Core there is a @HaloFour maybe there's an opportunity to contribute something there? |
Sure, I can take a look at that. |
I found #21862 which is pretty close to my request but closed.
I am currently using Spring WebClient with Spring Boot 2.2.6 and Spring Framework 5.2.5 to write a service that sits in front of a number of other upstream services and transforms their responses for public consumption. Some of these services respond with very large JSON payloads that are little more than an array of entities wrapped in a JSON document, usually with no other properties:
There could be many thousands of entities in this nested array and the entire payload can be tens of MBs. I want to be able to read in these entities through a Flux<T> so that I can transform them individually and write them out to the client without having to deserialize all of them into memory. This doesn't appear to be something that Spring WebFlux supports out of the box.
I'm currently exploring writing my own BodyExtractor which reuses some of the code in Jackson2Tokenizer to try to support this. My plan is to accept a JsonPointer to the location of the array and then parse asynchronously until I find that array, then to buffer the tokens for each array element to deserialize them.
Before I go too far down this path I was curious if this was functionality that Spring would be interested in supporting out of the box.
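As a rough illustration of the "find the array, then buffer one element at a time" plan, here is a minimal, dependency-free sketch in plain Java. It is not Jackson2Tokenizer and not the code from the Gist: the class name ArrayElementSplitter and its feed method are hypothetical, it works on characters rather than Jackson tokens, and instead of a JsonPointer it assumes the target is the first array that is a direct child of the root object. It also assumes the input is well-formed JSON.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spring or Jackson.
// Emits the raw text of each element of the first array found directly
// under the root JSON object, as input arrives in arbitrary chunks.
final class ArrayElementSplitter {
    private final StringBuilder element = new StringBuilder(); // text of the element in progress
    private int depth = 0;            // current nesting of {} and []
    private boolean inString = false; // inside a JSON string literal
    private boolean escaped = false;  // previous char was a backslash inside a string
    private boolean inArray = false;  // inside the target array
    private boolean done = false;     // target array has been closed

    // Feed the next chunk; returns the elements completed by this chunk.
    List<String> feed(String chunk) {
        List<String> completed = new ArrayList<>();
        for (int i = 0; i < chunk.length() && !done; i++) {
            char c = chunk.charAt(i);
            if (inString) {
                if (inArray) element.append(c);
                if (escaped) escaped = false;
                else if (c == '\\') escaped = true;
                else if (c == '"') inString = false;
            } else if (c == '"') {
                inString = true;
                if (inArray) element.append(c);
            } else if (c == '[' && !inArray && depth == 1) {
                inArray = true; // target array opens; its elements live at depth 2
                depth++;
            } else if (c == '{' || c == '[') {
                depth++;
                if (inArray) element.append(c);
            } else if (c == '}' || c == ']') {
                depth--;
                if (inArray && depth == 1) { // the target array itself closed
                    flush(completed);
                    inArray = false;
                    done = true;
                } else if (inArray) {
                    element.append(c);
                }
            } else if (c == ',' && inArray && depth == 2) {
                flush(completed); // separator between top-level elements
            } else if (inArray) {
                element.append(c);
            }
        }
        return completed;
    }

    // Move the buffered element (if any) to the output and reset the buffer.
    private void flush(List<String> out) {
        String text = element.toString().trim();
        element.setLength(0);
        if (!text.isEmpty()) out.add(text);
    }
}
```

Only one element is held in memory at a time, and a chunk boundary may fall anywhere, including inside a string. Each returned string could then be handed to a deserializer one at a time; a real implementation would buffer parser tokens rather than characters.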
Similarly, I was curious about the functionality of being able to stream out a response from a WebFlux controller via a Flux<T> where the streamed response would be wrapped in a JSON array and possibly in a root JSON document as well?
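For the outbound direction, the wrapping itself is mostly a matter of emitting a prefix, the comma-separated elements, and a suffix, without collecting the elements first. Here is a minimal sketch that uses java.util.stream.Stream as a stand-in for Flux<T>. The class JsonArrayWrapper, the wrap method, and the field name are hypothetical; this is not a WebFlux API. It assumes the elements are already serialized JSON, the field name needs no escaping, and the stream is sequential.

```java
import java.util.stream.Stream;

// Hypothetical helper, not part of Spring.
final class JsonArrayWrapper {
    // Lazily produces the chunks of {"<field>":[e1,e2,...]} from serialized elements.
    static Stream<String> wrap(String field, Stream<String> elements) {
        boolean[] first = {true}; // tracks whether a comma is needed (sequential streams only)
        Stream<String> body = elements.map(e -> {
            if (first[0]) {
                first[0] = false;
                return e;
            }
            return "," + e;
        });
        Stream<String> prefix = Stream.of("{\"" + field + "\":[");
        Stream<String> suffix = Stream.of("]}");
        return Stream.concat(Stream.concat(prefix, body), suffix);
    }
}
```

Because the chunks are produced lazily, each element can be written to the client as soon as it is available. The same prefix, body, suffix shape could presumably be expressed with Reactor's concatenation operators on a Flux of serialized elements.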