let's see what's new in this fork of the warc lib #22

Open
wants to merge 28 commits into
base: master

@nlevitt nlevitt commented May 10, 2016

No description provided.

Anand Chitipothu and others added 28 commits November 15, 2014 13:11
Manual addition of some bugfixes found in upstream issues and pull
requests.
Almost certainly broken.
Numerous small changes. Update for Python 3, attempt to remove gzip2 and
use the standard library instead. Began creating a small tool for HTTP
parsing. Probably utterly broken.
Added warcscrape.py and supporting files.
gzip.open() expects a string as filename. The prior implementation
passed a file object as the argument, which resulted in a type error.
Passing the name of the file to gzip.open() instead fixes this
problem.
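
A minimal sketch of the fix this commit message describes, assuming only Python's standard `gzip` module. The helper name `open_gzip_by_name` and the sample file are hypothetical and are not taken from the fork. Passing a path to `gzip.open()` works on every Python version; when only a file object is available, `gzip.GzipFile(fileobj=...)` is the standard-library way to wrap it.

```python
import gzip
import os
import tempfile


def open_gzip_by_name(fileobj_or_name):
    """Open a gzip file for reading, always handing gzip.open() a filename.

    Hypothetical helper: accepts either a path or an already-open file
    object with a .name attribute, and resolves it to a path first.
    """
    name = getattr(fileobj_or_name, "name", fileobj_or_name)
    return gzip.open(name, "rb")


# Write a small gzip file so the example is self-contained.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "sample.warc.gz")
with gzip.open(path, "wb") as f:
    f.write(b"WARC/1.0\r\n")

# Pass the file's name rather than the file object itself.
with open(path, "rb") as raw:
    with open_gzip_by_name(raw) as gz:
        first_line = gz.readline()

# Alternative when only a file object is available.
with open(path, "rb") as raw:
    with gzip.GzipFile(fileobj=raw, mode="rb") as gz:
        same_line = gz.readline()
```

Resolving to a name reopens the file from the start, so this only suits file objects backed by a real path; streams without a `.name` need the `GzipFile(fileobj=...)` form.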
  The old approach assumed too much about the internals of requests and ended up with an
  empty body. It also mixed byte-encoded content with regular Python strings (the latter
  was the intention, so I adapted the code to that).
…aw bytes instead.

  This simplifies the code and fixes an alignment bug. It also moves the decoding responsibility
  to the user of the library.
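
To illustrate the design choice in this commit message, here is a minimal sketch of keeping a payload as raw bytes and leaving decoding to the caller. The function `split_http_payload` and the sample response are hypothetical; this is not the fork's actual code.

```python
def split_http_payload(raw):
    """Split a raw HTTP message into (header_bytes, body_bytes).

    Everything stays as bytes; no charset is guessed here.
    """
    head, _, body = raw.partition(b"\r\n\r\n")
    return head, body


raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"caf\xc3\xa9"
)

headers, body = split_http_payload(raw_response)

# The caller, who knows the charset, decides how to decode.
text = body.decode("utf-8")
```

Returning bytes avoids mixing encoded content with Python strings inside the library, at the cost of one explicit `decode()` call on the caller's side.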

5 participants