Can't save HTML file attachments #75
Comments
I've just pushed v1.2.11 to PyPI, which should allow you to dump HTML snapshots. Note that these are zip archives, so they'll need to be decompressed first. If you use the […] Let me know if this works for you. Retrieval of arbitrary file attachments is supported for most common types, but I overlooked this use case. Sorry!
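Since the snapshot arrives as a zip archive, the decompression step might look like the sketch below. It uses only the Python 3 standard library; the function name `extract_snapshot` and its arguments are hypothetical and are not part of Pyzotero.

```python
import os
import zipfile


def extract_snapshot(archive_path, dest_dir):
    """Unpack a zipped snapshot into dest_dir and return the member names."""
    # Hypothetical helper: refuse anything that is not actually a zip archive.
    if not zipfile.is_zipfile(archive_path):
        raise ValueError("not a zip archive: %s" % archive_path)
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()
```

Extracting into a dedicated directory per snapshot keeps generically named members from different snapshots apart.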
OK, this works better. But if dump() does not adhere to the file name given, and does not return the resulting filename, how is the calling code supposed to know what the file is? Generally, if "filename" is "foo.html", it would make much more sense to unzip it right there and then in file(). Calling code could check the new "snapshot" variable, but is this meant to be public? By the way, I'm not using dump() because it isn't present in older versions of Pyzotero.
One possibility is dumping the snapshot contents into folders with the same name as their item key, which is predictable and easy to document. (The snapshot variable is not meant to be public, and is going away again.)
This causes tests related to #75 to pass again
In that case, I would suggest that dump() return the file name it has chosen (if chosen automatically), maybe as a fully qualified path.
I would also provide a function that gives the file name that is suggested when calling file(), or a function that will return the MIME type of the file that is output, i.e., application/gzip. The contentType does not seem to have the right type.
The behavior right now seems odd: a file of unclear type is written, and one has to check the “snapshot” variable and append .gz to the filename if it is True.
If you were to unpack it, I would unpack into a folder named foo.html if foo.html is the filename.
Note that any file name can overwrite existing ones - there is no guarantee it is unique.
In Zot_bib_web, I create a folder named after the key, and all attachments go inside that folder. As long as the attachments have distinct names it’s good.
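The folder-per-key layout described above could be sketched as follows. This is an illustration in Python 3, not code from Zot_bib_web or Pyzotero: `attachment_path` is a hypothetical helper, and the numeric suffix used to avoid overwriting an existing file is an added assumption.

```python
import os


def attachment_path(base_dir, item_key, filename):
    """Return a path for filename inside a folder named after item_key.

    The folder is created if needed. If a file of that name already exists,
    a numeric suffix is appended so that nothing is overwritten.
    """
    folder = os.path.join(base_dir, item_key)
    os.makedirs(folder, exist_ok=True)
    stem, ext = os.path.splitext(filename)
    candidate = os.path.join(folder, filename)
    counter = 1
    while os.path.exists(candidate):
        # Assumed naming scheme: item.html, item-1.html, item-2.html, ...
        candidate = os.path.join(folder, "%s-%d%s" % (stem, counter, ext))
        counter += 1
    return candidate
```

Because the item key is known to the caller before any download happens, the resulting location is predictable even when the attachment itself is always called item.html.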
… On May 17, 2017, at 7:50 PM, Stephan Hügel ***@***.***> wrote:
dump() will adhere to a file name if it's given, which implies that the user is familiar with the snapshot format, and that it's compressed (which I now also call out in the docs). As for unzipping automatically:
• I could unzip the contents in the specified (or working) dir. That's going to cause difficulties, because some snapshot components are generically-named (item.css), so things will be overwritten if you're dumping several snapshots at once.
• dump the contents into a newly-created folder under the specified or default path. But what to call it? The file returned by the API is always item.html
One possibility is dumping the snapshot contents into folders with the same name as their item key, which is predictable and easy to document.
(the snapshot variable is not meant to be public, and is going away again)
It turns out that in the case of snapshots, the […]
As for the Zotero API, once the MIME type is fixed, this makes sense. It's good to know this filename.
I don't follow. To which file name are you referring?
dump() decides on the filename for the file it creates (by default), either with or without ".zip". The calling code will want to do something with it. In my case, I want to make a URL from it, to link to this file, or maybe I will unzip it. Rather than recreating the logic that is in PyZotero (and which may be updated in the future), I'm suggesting that PyZotero's dump() function return the file name it has created. Am I missing something?
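To make the suggestion concrete, here is a minimal Python 3 sketch of a writer that returns the path it chose, plus a caller that turns that path into a link. `dump_bytes` and `to_file_url` are hypothetical stand-ins and do not reproduce Pyzotero's dump() signature or naming logic.

```python
import os
from pathlib import Path


def dump_bytes(data, filename, path=None):
    """Write data to path/filename and return the absolute path written."""
    # Hypothetical helper: defaults to the current working directory.
    target_dir = os.path.abspath(path or os.getcwd())
    os.makedirs(target_dir, exist_ok=True)
    full_path = os.path.join(target_dir, filename)
    with open(full_path, "wb") as handle:
        handle.write(data)
    return full_path


def to_file_url(full_path):
    """Turn the returned absolute path into a file:// URL for linking."""
    return Path(full_path).as_uri()
```

With the path returned, the calling code never has to guess whether an extension such as ".zip" was appended.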
Now I get it. Yes, that's a good approach. I'm going to leave this open until the MIME change has happened and I've landed the change.
This change also alters dump() to return the path and file name, see discussion in #75
Could support for arbitrary files, or at least HTML attachments, be added?
The content type is 'contentType': u'text/html', and calling the "file" method on this attachment item produces errors that vary with the Python version, e.g. "UnicodeEncodeError: 'ascii' codec can't encode characters in position 11-12: ordinal not in range(128)", or a ValueError in the latest PyZotero.
Would it make sense to fail gracefully or throw a documented exception?
  File "./zot.py", line 1584, in dumpFiles
    f.write(self.zot.file(item.key))
  File "/usr/local/lib/python2.7/site-packages/pyzotero/zotero.py", line 187, in wrapped_f
    return retrieved.json()
  File "/usr/local/lib/python2.7/site-packages/requests/models.py", line 819, in json
    return json.loads(self.text, **kwargs)
  File "/usr/local/Cellar/python/2.7.10/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/local/Cellar/python/2.7.10/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/Cellar/python/2.7.10/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
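The traceback shows a non-JSON body being handed to a JSON parser. One way to fail gracefully is to branch on the content type and raise a single documented exception when parsing fails, roughly as in this Python 3 sketch. `AttachmentDecodeError` and `decode_payload` are hypothetical names for illustration and do not describe how Pyzotero handles responses.

```python
import json


class AttachmentDecodeError(ValueError):
    """Raised when a response claimed to be JSON but could not be parsed."""


def decode_payload(content_type, body):
    """Return parsed JSON for JSON responses and raw bytes for anything else."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != "application/json":
        # HTML, PDF, zip and other attachments are passed through untouched.
        return body
    try:
        return json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, ValueError) as exc:
        raise AttachmentDecodeError("response is not valid JSON") from exc
```

Calling code can then catch one named exception instead of a UnicodeEncodeError or ValueError that depends on the Python version.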