-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Opens gzip compressed content #513
Conversation
- follows redirects - fixes misconception of "Content-Encoding"
#513) Also, group constants.
The `Accept-Charset` request header value is generally not suitable as a charset name. Should be applied to compressed content as well.
Apply connect timeout to all requests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pull request does a lot, and in a single commit to boot.
- Handling compressed content: Generally 👍,
but maybe a generic "decompressor" module would be more versatile? - Following redirects:
HttpUrlConnection
does this by default, why replace it with a custom implementation? - Fixing "encoding" vs. "charset": Good catch, 👍
I've tried to address most of the issues I have with this pull request in individual commits. Please review.
I'm approving the amended pull request for now, despite my hesitation towards the custom follow logic.
Please note that there are backwards incompatible changes here, so it would probably necessitate a major release.
metafacture-io/src/main/java/org/metafacture/io/HttpOpener.java
Outdated
Show resolved
Hide resolved
private static final Pattern HEADER_FIELD_SEPARATOR_PATTERN = Pattern.compile(HEADER_FIELD_SEPARATOR); | ||
private static final Pattern HEADER_VALUE_SEPARATOR_PATTERN = Pattern.compile(HEADER_VALUE_SEPARATOR); | ||
|
||
private static final int ALLOWED_REDIRECTIONS = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this value be configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say: let's wait and implement if need arises. Would you be ok with this?
private static final Pattern HEADER_FIELD_SEPARATOR_PATTERN = Pattern.compile(HEADER_FIELD_SEPARATOR); | ||
private static final Pattern HEADER_VALUE_SEPARATOR_PATTERN = Pattern.compile(HEADER_VALUE_SEPARATOR); | ||
|
||
private static final int ALLOWED_REDIRECTIONS = 3; | ||
private static final int CONNECTION_TIMEOUT = 11000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you arrive at this value? 11 seconds seems kind of arbitrary.
Should this value be configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's kind of arbitrary. Setting these values at least easily prevent possible infinite loops.
Re configurable: I would say: let's wait and implement if need arises. Would you be ok with this?
metafacture-io/src/main/java/org/metafacture/io/HttpOpener.java
Outdated
Show resolved
Hide resolved
metafacture-io/src/main/java/org/metafacture/io/HttpOpener.java
Outdated
Show resolved
Hide resolved
metafacture-io/src/test/java/org/metafacture/io/HttpOpenerTest.java
Outdated
Show resolved
Hide resolved
metafacture-io/src/main/java/org/metafacture/io/HttpOpener.java
Outdated
Show resolved
Hide resolved
metafacture-io/src/main/java/org/metafacture/io/HttpOpener.java
Outdated
Show resolved
Hide resolved
metafacture-io/src/test/java/org/metafacture/io/HttpOpenerTest.java
Outdated
Show resolved
Hide resolved
Unfortunately HttpURLConnection only follows redirects if they use the same protocol. As this was the initial issue (unable to open http://schema.org/) a custom implementation (or other library) was needed. Thx for the many improvements/rewritings. Also agreed on multiple commits vs one big commit. |
Unfortunately, this motivation wasn't made explicit. Both this pull request and the referenced issue are primarily concerned with compression. Wouldn't it have been possible to adjust the URL being opened client-side?
Is this custom implementation affected by the same concerns then? ("serious security consequences") And what about the objections against |
Sorry to not have been explicit in the issue. I will remember to do better. "serious security consequences": java network developers decided to treat protocol switches as a possible threat. We don't. Re objections: you mean https://stackoverflow.com/questions/1884230/httpurlconnection-doesnt-follow-redirect-from-http-to-https/26046079#comment105077439_26046079 ? Is the |
Was this discussed anywhere?
Of course it's needed, that's where the redirect URL comes from. The objection was against decoding that URL.
That's because you only added a test for the compression handling, not for the redirect logic. I missed it in my review. |
Resolves #511.