DEPRECATED: All of this work is done. The text is now under revision.
On Linux:
- Extract a chapter or a range of pages:
$ pdftk infile.pdf cat 12-15 output outfile.pdf
- Convert to TXT or directly to MD:
$ pdftotext outfile.pdf outfile.txt
A good Python alternative with better results (pdf2txt.py, part of pdfminer):
$ pdf2txt.py outfile.pdf > outfile.txt
To convert the PDF directly to MD, use PDF to Markdown Converter (source on GitHub). Very good, recommended.
- In both cases, fix broken paragraphs with paragrapher.
- Copy the TXT into the right chapter:
$ echo "" >> chapter.md
$ cat outfile.txt >> chapter.md
- Edit with Ghostwriter or another editor.
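
The steps above could be wrapped in a small script. The following is only a sketch: the function names (join_paragraphs, append_to_chapter, pdf_pages_to_chapter) are made up for illustration, and join_paragraphs is not the paragrapher tool. It is a simple awk stand-in that assumes paragraphs are separated by blank lines, which may not hold for every PDF.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical helpers.

# join_paragraphs: read text on stdin and merge hard-wrapped lines into
# one line per paragraph. ASSUMPTION: paragraphs are separated by one or
# more blank lines. This is a stand-in, not the paragrapher tool.
join_paragraphs() {
    awk '
        /^[[:space:]]*$/ {
            # Blank line: flush the paragraph collected so far.
            if (buf != "") { print buf; print ""; buf = "" }
            next
        }
        { buf = (buf == "" ? $0 : buf " " $0) }
        END { if (buf != "") print buf }
    '
}

# append_to_chapter TXT CHAPTER: append TXT to CHAPTER, preceded by an
# empty line (the same two commands as in the notes).
append_to_chapter() {
    echo "" >> "$2"
    cat "$1" >> "$2"
}

# pdf_pages_to_chapter INFILE RANGE CHAPTER: extract pages, convert to
# text, join broken paragraphs, and append the result to the chapter.
pdf_pages_to_chapter() {
    tmpdir=$(mktemp -d) || return 1
    pdftk "$1" cat "$2" output "$tmpdir/pages.pdf" &&
    pdftotext "$tmpdir/pages.pdf" "$tmpdir/pages.txt" &&
    join_paragraphs < "$tmpdir/pages.txt" > "$tmpdir/joined.txt" &&
    append_to_chapter "$tmpdir/joined.txt" "$3"
    status=$?
    rm -rf "$tmpdir"
    return $status
}
```

After sourcing the script, a call such as `pdf_pages_to_chapter infile.pdf 12-15 chapter.md` would run the whole sequence. The result still needs a manual pass in the editor, since headers, footers, and hyphenated words are not handled.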