Any possibility to do something like http://ix.io/4gtf? #11

Open
zenny opened this issue Nov 21, 2022 · 3 comments

zenny commented Nov 21, 2022

Hi,

I was referred to this package from the Emacs IRC channel. What I am trying to achieve is described in http://ix.io/4gtf, with the following details:

Note Capturing from Research papers

  • Objective

Capture an annotation from a PDF and automatically create a note file or ask the user to choose one. Before creating or asking for the note location, the capture extracts the BibTeX entry from online resources, creates a bib file, and populates the note with a bibliography tag and references that include the page number in the original document in parentheses.

  • Workflow
  1. The user selects a section/annotation in a PDF document
  2. The capture mode goes online and extracts metadata
  3. The metadata, including the page number of the annotation, gets inserted into either a newly created or an existing note file, and a bibliography file with the same name as the note is created (or reused if it exists)
  4. The annotation gets copied to either a new or an existing note file
  • Example
  1. Say I read /Emacs as a Tool of Modern Science/ by Timothy Johnson from https://technology.matthey.com/article/66/2/122-129/ .
  2. Say I annotated /Findable, Accessible, Interoperable, Reusable (FAIR)/ from the 1st column, 1st paragraph, lines 10-11 on page 122 (/actually page 1/).
  3. Once I make the selection, I am asked to create a new Org file or use an existing one for note-taking, and a bib file with the same name is then created.
  4. Once the file name is chosen, it automatically searches for BibTeX entries, either online or offline, based on the metadata of the PDF or a chosen local bib file (if any).
  5. Thereafter, the annotation and the reference (for example /[[Johnson Timothy, 2022] [pp. 122, col. 1, ln 10-11]]/) are inserted into the Org note file created in (3), and the entry is appended to, or replaced in, the bibliography file created there.
  6. Here /[Johnson Timothy, 2022]/ will be exported, but /[pp. 122, col. 1, ln 10-11]/ remains a clickable reference for the researcher to use later.
  • Help
    Any help appreciated! Thanks!
@yantar92
Owner

yantar92 commented Nov 22, 2022 via email

@zenny
Author

zenny commented Nov 28, 2022

@yantar92

Thanks for your useful inputs.

zenny writes:

  1. Say I read /Emacs as a Tool of Modern Science/ by Timothy Johnson from https://technology.matthey.com/article/66/2/122-129/ .
  2. Say I annotated /Findable, Accessible, Interoperable, Reusable (FAIR)/ from the 1st column, 1st paragraph, lines 10-11 on page 122 (/actually page 1/).
  3. Once I make the selection, I am asked to create a new Org file or use an existing one for note-taking, and a bib file with the same name is then created.
    This is doable. https://github.com/weirdNox/org-noter does something similar in terms of extracting the location in pdf and creating a new Org note. org-noter does not do anything Bibtex-wise though.

That is the reason I am here: org-noter does not cover BibTeX. Your repo appears to bring everything under a single umbrella, which is what I liked.

  4. Once the file name is chosen, it automatically searches for BibTeX entries, either online or offline, based on the metadata of the PDF or a chosen local bib file (if any).
    Not many pdfs contain useful metadata.

I agree with you. I was hoping one could capture the DOI from the webpage using this repo?

    For example, I just downloaded the paper you referenced, and it has the following:

        title:Johnson_Apr22
        author:
        subject:
        keywords-raw:
        keywords:
        creator:Adobe InDesign 17.0 (Windows)
        producer:Adobe PDF Library 16.0.3
        format:PDF-1.4
        created:Thu Jan 20 00:07:39 2022
        modified:Fri Jan 28 21:51:51 2022

    Nothing useful if you want to search for a BibTeX entry online.

    The only somewhat useful approach to get DOI data from PDFs is what org-ref does in `org-ref-extract-doi-from-pdf'. It simply converts the PDF to text and matches the text against `org-ref-pdf-doi-regex'. Which kind of works. Sometimes. For some research journals. Sometimes it also fails or catches unrelated DOIs from the references section or from an extra page the journal puts into the PDF for advertisement.

    Actual web pages usually contain a lot more reliable metadata. So, I usually start from the paper webpage, scrape the metadata using org-capture-ref into an Org heading, and attach the PDF to the heading. Then, paper notes are simply in the heading where the paper PDF is attached.
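To illustrate the text-matching approach and the failure mode mentioned above, here is a minimal Python sketch. The pattern is a commonly used DOI shape and is only an assumption; it is not org-ref's actual `org-ref-pdf-doi-regex'. The sample text and both DOIs are hypothetical, and the PDF-to-text step is assumed to have happened already.

```python
import re

# Assumption: a generic DOI-shaped pattern, not the regex org-ref ships.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def find_dois(text):
    """Return DOI-like strings in order of first appearance, without duplicates."""
    found = []
    for match in DOI_PATTERN.finditer(text):
        # Trailing punctuation usually belongs to the sentence, not the DOI.
        doi = match.group(0).rstrip(".,;)")
        if doi not in found:
            found.append(doi)
    return found


# Hypothetical text, as it might come out of a PDF-to-text conversion.
sample_text = (
    "Some page header text\n"
    "https://doi.org/10.1000/example.paper\n"
    "References\n"
    "1. A. Author, J. Example, doi:10.1000/unrelated.reference.\n"
)

dois = find_dois(sample_text)
print(dois)
```

Both DOIs are returned, so any caller that simply takes one of the matches is relying on a heuristic. That is why the match can land on an entry from the references section rather than on the paper itself.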

Thanks for the pointer.
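For reference, the web-page route can be sketched roughly as follows in Python. This assumes the publisher page exposes `citation_*` meta tags (a convention many journal sites follow, though not all); it does not show how org-capture-ref itself parses pages, and the HTML snippet and DOI are hypothetical.

```python
from html.parser import HTMLParser


class CitationMetaParser(HTMLParser):
    """Collect <meta name="citation_*" content="..."> tags from a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content")
        if name.startswith("citation_") and content:
            # A tag such as citation_author may repeat, so keep a list.
            self.fields.setdefault(name, []).append(content)


def extract_citation_meta(html):
    parser = CitationMetaParser()
    parser.feed(html)
    return parser.fields


# Hypothetical page head; real pages vary in which tags they provide.
sample_html = """
<html><head>
<meta name="citation_title" content="A Hypothetical Paper">
<meta name="citation_author" content="Doe, Jane">
<meta name="citation_doi" content="10.1000/example.paper">
</head><body></body></html>
"""

meta = extract_citation_meta(sample_html)
print(meta.get("citation_doi"))
```

When a page lacks these tags, the function returns an empty dictionary and some other source of metadata is needed.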

  5. Thereafter, the annotation and the reference (for example /[[Johnson Timothy, 2022] [pp. 122, col. 1, ln 10-11]]/) are inserted into the Org note file created in (3), and the entry is appended to, or replaced in, the bibliography file created there.
    This should be doable. What you need is: (1) extract PDF page info somehow (it depends on where you view the PDF); (2) find the associated heading/BibTeX entry for the paper and extract the @key; (3) Insert a citation (Org does support citations, including page references now; or you can also use org-ref). See https://orgmode.org/manual/Citation-handling.html or https://github.com/jkitchin/org-ref
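Steps (2) and (3) above could look roughly like this in Python. This is only a sketch: the entry key `johnson2022`, the DOI, and both helper names are made up for illustration, the BibTeX lookup is a naive text scan rather than a real parser, and how a given export processor renders the column/line part of the locator is not something this checks.

```python
import re

# Matches the start of an entry such as "@article{key," at the beginning of a line.
ENTRY_START = re.compile(r"^@\w+\{([^,\s]+),", re.MULTILINE)


def find_key_by_doi(bibtex_text, doi):
    """Return the key of the first entry whose text mentions doi, else None."""
    starts = list(ENTRY_START.finditer(bibtex_text))
    for i, match in enumerate(starts):
        end = starts[i + 1].start() if i + 1 < len(starts) else len(bibtex_text)
        if doi.lower() in bibtex_text[match.start():end].lower():
            return match.group(1)
    return None


def org_citation(key, locator=None):
    """Build an Org citation like [cite:@key p. 122]."""
    if locator:
        return "[cite:@" + key + " " + locator + "]"
    return "[cite:@" + key + "]"


# Hypothetical bib file content; the DOI is a placeholder.
sample_bib = """@article{johnson2022,
  author = {Johnson, Timothy},
  title = {Emacs as a Tool of Modern Science},
  year = {2022},
  doi = {10.1000/example.paper},
}
"""

key = find_key_by_doi(sample_bib, "10.1000/example.paper")
citation = org_citation(key, "p. 122, col. 1, ln 10-11")
print(citation)
```

Step (1), getting the page number out of the PDF viewer, is left out because it depends on the viewer in use.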

I tried org-ref, but it needs a bib file to exist already. What I am trying to achieve is to create a bib file from the DOI, capture annotations/bookmarks into either an existing or a new note, and append the bib file information at the bottom (org-ref does this with the bibliography: tag, as you know).
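One way to get from a DOI to a bib file, sketched in Python, is DOI content negotiation: asking https://doi.org/ for `application/x-bibtex`. This assumes the DOI's registration agency supports that format, which is not guaranteed for every DOI. All function names, the example DOI, and the sample entry are hypothetical, and the network call is defined but not executed here.

```python
import os
import tempfile
import urllib.request


def build_bibtex_request(doi):
    """Prepare a content-negotiation request asking doi.org for BibTeX."""
    return urllib.request.Request(
        "https://doi.org/" + doi,
        headers={"Accept": "application/x-bibtex"},
    )


def fetch_bibtex(doi, timeout=10):
    """Fetch the BibTeX entry for doi (needs network access; not run below)."""
    with urllib.request.urlopen(build_bibtex_request(doi), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def bib_path_for_note(note_path):
    """Name the bib file after the note: paper.org -> paper.bib."""
    return os.path.splitext(note_path)[0] + ".bib"


def append_entry(bib_path, entry):
    """Append entry to bib_path, creating the file; skip identical text."""
    existing = ""
    if os.path.exists(bib_path):
        with open(bib_path, encoding="utf-8") as handle:
            existing = handle.read()
    if entry.strip() in existing:
        return False
    with open(bib_path, "a", encoding="utf-8") as handle:
        handle.write(entry.strip() + "\n\n")
    return True


# Offline demonstration with a hypothetical, hand-written entry.
sample_entry = "@article{example2022,\n  doi = {10.1000/example.paper},\n}"
with tempfile.TemporaryDirectory() as tmp:
    bib_path = bib_path_for_note(os.path.join(tmp, "paper.org"))
    first = append_entry(bib_path, sample_entry)
    second = append_entry(bib_path, sample_entry)
    print(os.path.basename(bib_path), first, second)
```

The duplicate check compares raw text only, so the same paper fetched twice with different formatting would be appended twice; a real setup would compare entry keys or DOIs instead.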

  1. Where /[Johnson Timothy, 2022]/ will be exported but /[pp. 122, col. 1, ln 10-11]/ remains as a clickable reference for the researcher for future references.
    Citations in Org and org-ref are exported as expected and are clickable.

    -- Ihor Radchenko // yantar92, Org mode contributor, Learn more about Org mode at https://orgmode.org/. Support Org development at https://liberapay.com/org-mode, or support my work at https://liberapay.com/yantar92

@yantar92
Owner

yantar92 commented Dec 11, 2022 via email
