Storlets do not work on recent versions of swift #16

Open
hroumani opened this issue Aug 5, 2015 · 0 comments

hroumani commented Aug 5, 2015

Storlets no longer work on recent versions of Swift (the last working version is 2.3.0). As a quick summary, storlets send a Content-Length of 0 (as they cannot tell what the storlet will do with the output stream), but Swift does not like this.

Details:
So I dug into this a little more just to see the code details, and when exactly this was changed:

This takes place while we're trying to build an iterator over the response:

... GETorHEAD_base -> get_working_response -> _make_app_iter

    parts_iter = self._get_response_parts_iter(req, node, source)

    def add_content_type(response_part):
        response_part["content_type"] = \
            HeaderKeyDict(response_part["headers"]).get("Content-Type")
        return response_part


    return document_iters_to_http_response_body(
        (add_content_type(pi) for pi in parts_iter),            
        boundary, is_multipart, self.app.logger)

So in the return you can see we call document_iters_to_http_response_body. The path in question is the one for non-multipart requests, since multipart requests are handled differently:

document_iters_to_http_response_body:
    if multipart:
        return document_iters_to_multipart_byteranges(ranges_iter, boundary)
    else:
        try:
            response_body_iter = next(ranges_iter)['part_iter']  <<<<<<<<

The call to 'next' on the iterator above is where we run into trouble. We created the iterator earlier through parts_iter = self._get_response_parts_iter(req, node, source), which will trigger this code:

_get_response_parts_iter:
    # This is safe; it sets up a generator but does not call next()
    # on it, so no IO is performed.
    parts_iter = [
        http_response_to_document_iters(
            source[0], read_chunk_size=self.app.object_chunk_size)]
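To illustrate the comment about generators in the excerpt above, here is a minimal sketch of why nothing happens until next() is called. It is not Swift code: fake_parts and io_log are hypothetical names used only for this illustration.

```python
io_log = []  # records when the generator body actually runs


def fake_parts(chunks):
    # Hypothetical stand-in for a generator that performs IO; not Swift code.
    io_log.append("read started")
    for chunk in chunks:
        yield chunk


parts_iter = [fake_parts([b"ab", b"cd"])]  # generator object created, body not run
log_after_setup = list(io_log)             # snapshot taken before any next() call

first = next(parts_iter[0])                # the generator body runs only now
```

In this sketch log_after_setup is still empty, and the "read started" entry appears only once next() is called, which is consistent with the problem surfacing at the next() call rather than where the iterator is built.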

Finally, the call to http_response_to_document_iters uses the Content-Length (for non-multipart requests) to create the iterator:

http_response_to_document_iters:
    if response.status == 200:
        # Single "range" that's the whole object
        content_length = int(response.getheader('Content-Length'))
        return iter([(0, content_length - 1, content_length,
                      response.getheaders(), response)])
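As a rough sketch of what this arithmetic produces for a storlet response, the snippet below repeats only the (first byte, last byte, length) computation from the status 200 branch. FakeResponse and single_range are hypothetical stand-ins written for this illustration, not Swift classes or functions.

```python
class FakeResponse:
    """Hypothetical stand-in for the backend HTTP response; not a Swift class."""
    status = 200

    def __init__(self, headers):
        self._headers = headers

    def getheader(self, name):
        return self._headers.get(name)


def single_range(response):
    # Same arithmetic as the status == 200 branch quoted above.
    content_length = int(response.getheader('Content-Length'))
    return (0, content_length - 1, content_length)


normal = single_range(FakeResponse({'Content-Length': '10'}))
storlet = single_range(FakeResponse({'Content-Length': '0'}))
print(normal)   # (0, 9, 10)
print(storlet)  # (0, -1, 0)
```

With a Content-Length of 0 the single "range" ends at byte -1 and has length 0, so the response is described as an empty object even though the storlet may go on to write output.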

The code in 2.3.0 is quite different: creating the iterator (i.e. _make_app_iter) doesn't call http_response_to_document_iters at all. Instead, all of the logic simply lived in that function directly, and it seems a generator was created there with yield by using 'read.
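For comparison, a generator that simply reads until the source is exhausted never consults Content-Length. The sketch below shows that general pattern only; it is an assumption about the style of the older code, not the 2.3.0 implementation, and read_until_eof is a hypothetical name.

```python
import io


def read_until_eof(source, chunk_size=4):
    # Hypothetical helper, not the 2.3.0 code: keep yielding chunks until
    # read() returns an empty bytes object, whatever the headers claimed.
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield chunk


body = io.BytesIO(b"storlet output")   # stands in for the backend response body
chunks = list(read_until_eof(body))
joined = b"".join(chunks)
```

If the older code worked along these lines, a Content-Length of 0 would not have stopped the body from being streamed, which would explain why storlets worked up to 2.3.0.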

I've tracked down the change to:
openstack/swift@4f2ed8b
https://review.openstack.org/#/c/173497/

EC: support multiple ranges for GET requests
GetOrHeadHandler got a base class extracted from it that treats an
HTTP response as a sequence of byte-range responses. This way, it can
continue to yield whole fragments, not just N-byte pieces of the raw
HTTP response, since an N-byte piece of a multipart/byteranges
response is pretty much useless.
...

Also, the MIME response for replicated objects got tightened up a
little. Before, it had some leading and trailing CRLFs which, while
allowed by RFC 7233, provide no benefit. Now, both replicated and EC
multipart/byteranges avoid extraneous bytes. This let me re-use the
Content-Length calculation in swob instead of having to either hack
around it or add extraneous whitespace to match.
