"gzip: stdin: not in gzip format" on GitHub #307

Open
vinc17fr opened this issue Jan 22, 2025 · 4 comments

Comments

@vinc17fr

When I execute w3m on GitHub, e.g. w3m https://github.com/ or just by following a link, w3m displays

gzip: stdin: not in gzip format

This occurs under both Debian/unstable (w3m 0.5.3+git20230121-2.1) and Termux/Android (w3m 0.5.3.20230121). Note that lynx and elinks do not have any such issue.

@rkta
Contributor

rkta commented Jan 22, 2025 via email

@mcurly

mcurly commented Jan 23, 2025

On Wed, Jan 22, 2025 at 06:12:18AM -0800, Vincent Lefèvre wrote:
> When I execute w3m on GitHub, e.g. w3m https://github.com/ or just
> by following a link, w3m displays
>
> gzip: stdin: not in gzip format
>
> This occurs under both Debian/unstable (w3m 0.5.3+git20230121-2.1) and
> Termux/Android (w3m 0.5.3.20230121). Note that lynx and elinks do not
> have any such issue.
GitHub is ignoring the HTTP/1.0 request and sending an HTTP/1.1 response
which w3m cannot parse. I'm working on fixing this in my fork at
https://git.rkta.de/w3m, but it will take me a while to do so.

As a workaround you can use a CGI script which invokes curl on the URL,
see https://rkta.de/anti-cf.html for an explanation.
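
One way to check which protocol version the server answers with is to send an HTTP/1.0 request with curl and look at the status line. The following is only a minimal sketch and not part of the original reply: `status_version` is a hypothetical helper name, while `-s`, `-I` and `--http1.0` are standard curl options. Note that `-I` sends a HEAD request, so the result may not match exactly what w3m receives for a GET.

```shell
#!/bin/sh
# Print the protocol version from the status line (first line) of an
# HTTP response head read on stdin.
status_version() {
    sed -n '1s@^\(HTTP/[0-9.]*\).*@\1@p'
}

# Example with a canned response head (no network needed):
printf 'HTTP/1.1 200 OK\r\nServer: example\r\n\r\n' | status_version

# Against the live site (requires network access):
# curl -sI --http1.0 https://github.com/ | status_version
```

If the live check prints `HTTP/1.1` even though HTTP/1.0 was requested, that is consistent with the behaviour described above.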

Hi @rkta (Thank you for your help).

I must say that I am not at all comfortable with CGI, and while trying to follow your suggestion I hit another 'brick wall':

w3m: Can't load https://github.com/search?q=lolcate&repo=&langOverride=&start_value=1&type=Everything&language=any.

I adapted your suggestion to what I thought was needed:

This is currently my siteconf file:

##!/bin/sh

# Circumvent Clownfare with curl

# Put this file in one of the configured cgi-bin directories of w3m and make
# it executable.

# Add the two next lines your ~/.w3m/siteconf omitting the # at the beginning
url m!^https?://github.com/!
substitute_url "file:///cgi-bin/anti-cf.cgi?"
url m!^https?://.*github.com/!
substitute_url "file:///cgi-bin/anti-cf.cgi?"
#
#printf "%s\n\n" "Content-Type: text/html"
#
#url=$(echo $W3M_CURRENT_LINK | sed 's@\(http.\{0,1\}://[^/]*\)/.*@\1@')
#curl -L ${url}/${QUERY_STRING}

and this is the anti-cf.cgi file:

#!/bin/sh

# Circumvent Clownfare with curl

# Put this file in one of the configured cgi-bin directories of w3m and make
# it executable.
# Add the two next lines your ~/.w3m/siteconf omitting the # at the beginning
#url m!^https?://github.com/!
#substitute_url "file:///cgi-bin/anti-cf.cgi?"
#url m!^https?://.*github.com/!
#substitute_url "file:///cgi-bin/anti-cf.cgi?"

printf "%s\n\n" "Content-Type: text/html"

url=$(echo $W3M_CURRENT_LINK | sed 's@\(http.\{0,1\}://[^/]*\)/.*@\1@')
curl -L ${url}/${QUERY_STRING}

Does this make sense?

Thank you in advance!

@rkta
Contributor

rkta commented Jan 28, 2025 via email

@mcurly

mcurly commented Jan 28, 2025

On Thu, Jan 23, 2025 at 02:07:31PM -0800, mcurly wrote:

> Hi @rkta (Thank you for your help).
>
> I must say that I am not keen on any way with cgi, and trying to
> follow up your suggestion, I wound up at another 'brick wall' :sad:
> w3m: Can't load https://github.com/search?q=lolcate&repo=&langOverride=&start_value=1&type=Everything&language=any.
>
> I adapted your suggestion to what I thought was needed:

Sorry for the late response. I was on vacation and brought a flu back
home.

Here is what I have in my siteconf:

# Github does not respect our HTTP/1.0 request
url m!^https?://github.com/!
substitute_url "file:///cgi-bin/curl_GH.cgi?"
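
To illustrate what these two siteconf lines are meant to do: as the script's use of `QUERY_STRING` suggests, the part of the URL matched by the `url` pattern is replaced with the `substitute_url` value, so the rest of the URL lands after the `?` and reaches the CGI script as its query string. The sketch below only imitates that rewrite with sed so it can be tried outside w3m. `rewrite_url` is a hypothetical name, and the real substitution happens inside w3m.

```shell
#!/bin/sh
# Imitate the siteconf rewrite: replace the matched GitHub prefix with the
# local CGI URL, leaving the rest of the URL after the "?".
rewrite_url() {
    printf '%s\n' "$1" |
        sed 's@^https\{0,1\}://github\.com/@file:///cgi-bin/curl_GH.cgi?@'
}

rewrite_url 'https://github.com/search?q=lolcate'
# prints: file:///cgi-bin/curl_GH.cgi?search?q=lolcate
```

With that rewrite, the script would see `search?q=lolcate` as its query string and can prepend `https://github.com/` again before handing the URL to curl.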

And here is my CGI script curl_GH.cgi:

#!/bin/sh

# Circumvent GitHub not respecting our HTTP/1.0 request

# Put this file in one of the configured cgi-bin directories of w3m and make
# it executable.
# Add the two next lines to your ~/.w3m/siteconf omitting the # at the beginning
#url m!^https?://github.com/!
#substitute_url "file:///cgi-bin/curl_GH.cgi?"

printf "%s\n\n" "Content-Type: text/html"

# Quote the URL so the shell never splits or globs the query string
curl -L "https://github.com/${QUERY_STRING}"

The CGI script needs to be executable. You need to make sure that you
configure the cgi-bin directory in w3m, e.g. put the script in
~/.w3m/cgi-bin and make sure to add this directory to 'Directory
corresponding to /cgi-bin' under directory settings in the options
panel.
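
The installation steps described above can be sketched as a small shell function. This is a hedged example rather than the author's own procedure: `install_wrapper` is a hypothetical name, and the embedded script mirrors the one above with the URL argument in double quotes. The cgi-bin directory still has to be registered in w3m afterwards. The options panel entry is the documented route, and I believe it corresponds to the `cgi_bin` key in `~/.w3m/config`, but treat that key name as an assumption to verify.

```shell
#!/bin/sh
# Write curl_GH.cgi into the given directory and mark it executable.
# Usage: install_wrapper DIR
install_wrapper() {
    dir=$1
    mkdir -p "$dir" || return 1
    # Quoted heredoc delimiter: ${QUERY_STRING} is written literally
    # and only expanded when the CGI script itself runs.
    cat > "$dir/curl_GH.cgi" <<'EOF'
#!/bin/sh
printf "%s\n\n" "Content-Type: text/html"
curl -L "https://github.com/${QUERY_STRING}"
EOF
    chmod +x "$dir/curl_GH.cgi"
}

# Typical call, using the directory mentioned above:
# install_wrapper "$HOME/.w3m/cgi-bin"
```

After running it, `ls -l ~/.w3m/cgi-bin/curl_GH.cgi` should show the execute bit set.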

Hi, it is working fine now that I have followed your suggestions!

Thank you very much, and I hope you recover quickly from that flu!

Cheers!
