Feature footnote support #1392

Open
wants to merge 7 commits into
base: develop
40 changes: 40 additions & 0 deletions features/doc-access-footnotes.feature
@@ -0,0 +1,40 @@
Feature: Access document footnotes
In order to operate on an individual footnote
As a developer using python-docx
I need access to each footnote in the footnote collection of a document
I need access to footnote properties

Scenario: Access footnote from a document containing footnotes
Given a document with 3 footnotes and 2 default footnotes
Then len(footnotes) is 5
And I can access a footnote by footnote reference id
And I can access a paragraph in a specific footnote

Scenario: Access a footnote from document with an invalid footnote reference id
Given a document with footnotes
When I try to access a footnote with invalid reference id
Then it trows an IndexError

Scenario Outline: Access footnote properties
Given a document with footnotes and with all footnotes properties
Then I can access footnote property <propName> with value <value>

Examples: footnote property names and values
| propName | value |
| footnote_position | str('pageBottom') |
| footnote_number_format | str('lowerRoman') |
| footnote_numbering_start_value | int(1) |
| footnote_numbering_restart_location | str('continuous') |

Scenario Outline: Access footnotes and footnote properties in a document without footnotes
Given a document without footnotes
# there are always 2 default footnotes with footnote reference id of -1 and 0
Then len(footnotes) is 2
And I can access footnote property <propName> with value <value>

Examples: footnote property names and values
| propName | value |
| footnote_position | None |
| footnote_number_format | None |
| footnote_numbering_start_value | None |
| footnote_numbering_restart_location | None |
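
As a rough illustration of the lookup behaviour the first two scenarios describe, the sketch below parses a hand-written `footnotes.xml` fragment with the standard library and finds a footnote by its `w:id`. The helper `footnote_by_id` and the sample XML are assumptions made for this example, not code from this PR. Raising `IndexError` for an unknown id simply mirrors the scenario above.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hand-written sample: the two separator footnotes (ids -1 and 0)
# plus one user footnote.
FOOTNOTES_XML = f"""
<w:footnotes xmlns:w="{W}">
  <w:footnote w:type="separator" w:id="-1"><w:p/></w:footnote>
  <w:footnote w:type="continuationSeparator" w:id="0"><w:p/></w:footnote>
  <w:footnote w:id="1"><w:p><w:r><w:t> First footnote.</w:t></w:r></w:p></w:footnote>
</w:footnotes>
"""


def footnote_by_id(root: ET.Element, ref_id: int) -> ET.Element:
    """Return the <w:footnote> whose w:id equals ref_id (hypothetical helper)."""
    for footnote in root.findall(f"{{{W}}}footnote"):
        if int(footnote.get(f"{{{W}}}id")) == ref_id:
            return footnote
    raise IndexError(f"no footnote with reference id {ref_id}")


root = ET.fromstring(FOOTNOTES_XML)
print(len(root.findall(f"{{{W}}}footnote")))
print("".join(footnote_by_id(root, 1).itertext()))
```

Looking footnotes up by `w:id` rather than by list position is what lets ids such as -1 and 0 address the separator entries.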
21 changes: 21 additions & 0 deletions features/doc-set-footnote-props.feature
@@ -0,0 +1,21 @@
Feature: Set footnote properties
In order to change footnote properties of a document
As a developer using python-docx
I need a setter for footnote properties

Scenario Outline: Change footnote properties
Given a document with footnotes and with all footnotes properties
When I change footnote property <propName> to <value>
Then I can access footnote property <propName> with value <value>

Examples: footnote property names and values
| propName | value |
| footnote_position | str('beneathText') |
| footnote_position | str('pageBottom') |
| footnote_number_format | str('upperRoman') |
| footnote_number_format | str('decimal') |
| footnote_number_format | str('hex') |
| footnote_numbering_start_value | int(10) |
| footnote_numbering_restart_location | str('eachPage') |
| footnote_numbering_restart_location | str('eachSect') |
| footnote_numbering_restart_location | str('continuous') |
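
The values in this table look like the `w:val` settings of the WordprocessingML `<w:footnotePr>` children `w:pos`, `w:numFmt`, `w:numStart` and `w:numRestart`. The sketch below shows one way a getter and setter over a `<w:sectPr>` element might work, using only `xml.etree.ElementTree`. The helper names and the mapping from the `footnote_*` property names to these elements are assumptions for illustration; the PR's own implementation is not shown in this part of the diff.

```python
import xml.etree.ElementTree as ET
from typing import Optional

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def qn(tag: str) -> str:
    """Expand a 'w:tag' name to the Clark notation ElementTree expects."""
    return f"{{{W}}}{tag.split(':')[1]}"


def get_footnote_prop(sect_pr: ET.Element, child: str) -> Optional[str]:
    """Return w:val of <w:footnotePr>/<child>, or None when it is absent."""
    elm = sect_pr.find(f"{qn('w:footnotePr')}/{qn(child)}")
    return None if elm is None else elm.get(qn("w:val"))


def set_footnote_prop(sect_pr: ET.Element, child: str, value: str) -> None:
    """Set w:val of <w:footnotePr>/<child>, creating missing elements.

    Simplification: new elements are appended, so the child order the
    schema requires is not enforced here.
    """
    footnote_pr = sect_pr.find(qn("w:footnotePr"))
    if footnote_pr is None:
        footnote_pr = ET.SubElement(sect_pr, qn("w:footnotePr"))
    elm = footnote_pr.find(qn(child))
    if elm is None:
        elm = ET.SubElement(footnote_pr, qn(child))
    elm.set(qn("w:val"), value)


sect_pr = ET.Element(qn("w:sectPr"))
print(get_footnote_prop(sect_pr, "w:pos"))
set_footnote_prop(sect_pr, "w:pos", "beneathText")
print(get_footnote_prop(sect_pr, "w:pos"))
```

Returning `None` for a missing element matches the "document without footnotes" examples, where every property reads as `None`.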
15 changes: 15 additions & 0 deletions features/par-access-footnotes.feature
@@ -0,0 +1,15 @@
Feature: Access paragraph footnotes
In order to operate on an individual footnote
As a developer using python-docx
I need access to every footnote, if present, in a specific paragraph


Scenario Outline: Access all footnote text from a paragraph that might contain a footnote
Given a document with paragraphs[0] containing one, paragraphs[1] containing none, and paragraphs[2] containing two footnotes
Then paragraphs[<parId>] has footnote reference ids of <refIds>, with footnote text <fText>

Examples: footnote values per paragraph
| parId | refIds | fText |
| 0 | int(1) | str(' This is footnote text for the first footnote.') |
| 1 | None | None |
| 2 | [2,3] | [' This is footnote text for the second footnote.', ' This is footnote text for the third footnote.'] |
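
For context, a paragraph points at its footnotes through `<w:footnoteReference w:id="..."/>` elements inside its runs. The sketch below collects those ids per paragraph from a hand-written body fragment. The function is a hypothetical stand-in written for this example and the XML is sample data, not the contents of `footnotes.docx`.

```python
import xml.etree.ElementTree as ET
from typing import List

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Sample body: one reference, none, then two references.
BODY_XML = f"""
<w:body xmlns:w="{W}">
  <w:p><w:r><w:t>One</w:t></w:r><w:r><w:footnoteReference w:id="1"/></w:r></w:p>
  <w:p><w:r><w:t>None</w:t></w:r></w:p>
  <w:p><w:r><w:footnoteReference w:id="2"/></w:r><w:r><w:footnoteReference w:id="3"/></w:r></w:p>
</w:body>
"""


def footnote_reference_ids(p: ET.Element) -> List[int]:
    """Return the footnote reference ids found in paragraph `p`, in document order."""
    return [int(ref.get(f"{{{W}}}id")) for ref in p.iter(f"{{{W}}}footnoteReference")]


body = ET.fromstring(BODY_XML)
ids = [footnote_reference_ids(p) for p in body.findall(f"{{{W}}}p")]
print(ids)  # [[1], [], [2, 3]]
```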
23 changes: 23 additions & 0 deletions features/par-insert-footnote.feature
@@ -0,0 +1,23 @@
Feature: Insert a footnote at the end of a paragraph
In order to add a new footnote at the end of a text (paragraph)
As a developer using python-docx
I need a way to add a footnote to the end of a specific paragraph


Scenario: Add a new footnote to a paragraph in a document without footnotes
Given a paragraph in a document without footnotes
When I add a footnote to the paragraphs[1] with text ' NEW FOOTNOTE'
Then the document contains a footnote with footnote reference id of 1 with text ' NEW FOOTNOTE'
And len(footnotes) is 3

Scenario Outline: Add a new footnote to a paragraph in a document containing one footnote before the paragraph and two footnotes after
Given a document with paragraphs[0] containing one, paragraphs[1] containing none, and paragraphs[2] containing two footnotes
When I add a footnote to the paragraphs[1] with text ' NEW FOOTNOTE'
Then paragraphs[<parId>] has footnote reference ids of <refIds>, with footnote text <fText>
And len(footnotes) is 6

Examples: footnote values per paragraph
| parId | refIds | fText |
| 0 | int(1) | str(' This is footnote text for the first footnote.') |
| 1 | int(2) | str(' NEW FOOTNOTE') |
| 2 | [3,4] | [' This is footnote text for the second footnote.',' This is footnote text for the third footnote.'] |
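
The renumbering this outline expects can be modelled on plain lists of ids, one list per paragraph. The function below is a minimal sketch under two assumptions: ids are assigned sequentially in document order, and a paragraph that already has footnotes keeps its own last id as the base for the new one. The name `insert_footnote_id` is hypothetical and this is not the PR's implementation.

```python
from typing import List, Tuple


def insert_footnote_id(
    ids_per_paragraph: List[List[int]], target: int
) -> Tuple[int, List[List[int]]]:
    """Return (new_id, renumbered ids) for a footnote appended to paragraph `target`."""
    # The new id follows the highest id at or before the target paragraph.
    preceding = [i for ids in ids_per_paragraph[: target + 1] for i in ids]
    new_id = max(preceding, default=0) + 1

    # Work on a copy so the caller's lists are left untouched.
    renumbered = [list(ids) for ids in ids_per_paragraph]
    renumbered[target].append(new_id)

    # Every footnote in a later paragraph moves up by one.
    for ids in renumbered[target + 1 :]:
        ids[:] = [i + 1 for i in ids]
    return new_id, renumbered


print(insert_footnote_id([[1], [], [2, 3]], 1))  # (2, [[1], [2], [3, 4]])
```

With an input of `[[], [], []]` and `target=1` the same function returns id 1, which lines up with the first scenario above.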
171 changes: 171 additions & 0 deletions features/steps/footnotes.py
@@ -0,0 +1,171 @@
"""Step implementations for footnote-related features."""

from behave import given, then, when
from behave.runner import Context

from docx import Document
from docx.footnotes import Footnote
from docx.text.paragraph import Paragraph

from helpers import test_docx

# given ====================================================


@given("a document with 3 footnotes and 2 default footnotes")
def given_a_document_with_3_footnotes_and_2_default_footnotes(context: Context):
    document = Document(test_docx("footnotes"))
    context.footnotes = document.footnotes


@given("a document with footnotes and with all footnotes properties")
def given_a_document_with_footnotes_and_with_all_footnotes_properties(context: Context):
    document = Document(test_docx("footnotes"))
    context.section = document.sections[0]


@given("a document with footnotes")
def given_a_document_with_footnotes(context: Context):
    document = Document(test_docx("footnotes"))
    context.footnotes = document.footnotes


@given("a document without footnotes")
def given_a_document_without_footnotes(context: Context):
    document = Document(test_docx("doc-default"))
    context.footnotes = document.footnotes
    context.section = document.sections[0]


@given("a paragraph in a document without footnotes")
def given_a_paragraph_in_a_document_without_footnotes(context: Context):
    document = Document(test_docx("par-known-paragraphs"))
    context.paragraphs = document.paragraphs
    context.footnotes = document.footnotes


@given(
    "a document with paragraphs[0] containing one, paragraphs[1] containing none, and paragraphs[2] containing two footnotes"
)
def given_a_document_with_3_footnotes(context: Context):
    document = Document(test_docx("footnotes"))
    context.paragraphs = document.paragraphs
    context.footnotes = document.footnotes


# when ====================================================


@when("I try to access a footnote with invalid reference id")
def when_I_try_to_access_a_footnote_with_invalid_reference_id(context: Context):
    context.exc = None
    try:
        context.footnotes[10]
    except IndexError as e:
        context.exc = e


@when("I add a footnote to the paragraphs[{parId}] with text '{footnoteText}'")
def when_I_add_a_footnote_to_the_paragraph_with_text_text(
    context: Context, parId: str, footnoteText: str
):
    par = context.paragraphs[int(parId)]
    new_footnote = par.add_footnote()
    new_footnote.add_paragraph(footnoteText)


@when("I change footnote property {propName} to {value}")
def when_I_change_footnote_property_propName_to_value(
    context: Context, propName: str, value: str
):
    # `value` comes from the Examples table, e.g. "str('beneathText')".
    setattr(context.section, propName, eval(value))


# then =====================================================


@then("len(footnotes) is {expectedLen}")
def then_len_footnotes_is_len(context: Context, expectedLen: str):
    footnotes = context.footnotes
    assert len(footnotes) == int(
        expectedLen
    ), f"expected len(footnotes) of {expectedLen}, got {len(footnotes)}"


@then("I can access a footnote by footnote reference id")
def then_I_can_access_a_footnote_by_footnote_reference_id(context: Context):
    footnotes = context.footnotes
    for refId in range(-1, 3):
        footnote = footnotes[refId]
        assert isinstance(footnote, Footnote)


@then("I can access a paragraph in a specific footnote")
def then_I_can_access_a_paragraph_in_a_specific_footnote(context: Context):
    footnotes = context.footnotes
    for refId in range(1, 3):
        footnote = footnotes[refId]
        assert isinstance(footnote.paragraphs[0], Paragraph)


# NOTE: the spelling "trows" is kept so the pattern still matches the step
# text in doc-access-footnotes.feature.
@then("it trows an {exceptionType}")
def then_it_trows_an_IndexError(context: Context, exceptionType: str):
    exc = context.exc
    assert isinstance(
        exc, eval(exceptionType)
    ), f"expected {exceptionType}, got {type(exc)}"


@then("I can access footnote property {propName} with value {value}")
def then_I_can_access_footnote_property_name_with_value_value(
    context: Context, propName: str, value: str
):
    actual_value = getattr(context.section, propName)
    expected = eval(value)
    assert (
        actual_value == expected
    ), f"expected section.{propName} {value}, got {actual_value}"


@then(
    "the document contains a footnote with footnote reference id of {refId} with text '{footnoteText}'"
)
def then_the_document_contains_a_footnote_with_footnote_reference_id_of_refId_with_text_text(
    context: Context, refId: str, footnoteText: str
):
    par = context.paragraphs[1]
    f = par.footnotes[0]
    assert f.id == int(refId), f"expected {refId}, got {f.id}"
    assert (
        f.paragraphs[0].text == footnoteText
    ), f"expected {footnoteText}, got {f.paragraphs[0].text}"


@then(
    "paragraphs[{parId}] has footnote reference ids of {refIds}, with footnote text {fText}"
)
def then_paragraph_has_footnote_reference_ids_of_refIds_with_footnote_text_text(
    context: Context, parId: str, refIds: str, fText: str
):
    par = context.paragraphs[int(parId)]
    refIds = eval(refIds)
    fText = eval(fText)
    if refIds is not None:
        if isinstance(refIds, list):
            for i, refId in enumerate(refIds):
                f = par.footnotes[i]
                assert isinstance(
                    f, Footnote
                ), f"expected to be instance of Footnote, got {type(f)}"
                assert f.id == refId, f"expected {refId}, got {f.id}"
                assert (
                    f.paragraphs[0].text == fText[i]
                ), f"expected '{fText[i]}', got '{f.paragraphs[0].text}'"
        else:
            f = par.footnotes[0]
            assert f.id == int(refIds), f"expected {refIds}, got {f.id}"
            assert (
                f.paragraphs[0].text == fText
            ), f"expected '{fText}', got '{f.paragraphs[0].text}'"
    else:
        assert (
            len(par.footnotes) == 0
        ), f"expected an empty list, got {len(par.footnotes)} elements"
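
Several steps above call `eval` on cells taken from the Examples tables (`str('pageBottom')`, `int(1)`, `None`, `[2,3]`, `IndexError`). One possible tightening, sketched below, is to evaluate those cells with builtins removed and only the names the tables use made available. `parse_cell` and `ALLOWED_NAMES` are hypothetical names introduced for this example. This narrows what a cell can reference but should not be treated as a security sandbox.

```python
from typing import Any

# Only the names the Examples tables actually use are made available.
ALLOWED_NAMES = {"str": str, "int": int, "IndexError": IndexError}


def parse_cell(cell: str) -> Any:
    """Evaluate an Examples-table cell with builtins removed (hypothetical helper)."""
    return eval(cell, {"__builtins__": {}}, dict(ALLOWED_NAMES))


print(parse_cell("str('pageBottom')"))
print(parse_cell("[2,3]"))
print(parse_cell("None"))
```

A cell that references any other name, for example `__import__('os')`, fails with `NameError` instead of being executed.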
Binary file added features/steps/test_files/footnotes.docx
Binary file not shown.
3 changes: 3 additions & 0 deletions src/docx/__init__.py
@@ -26,6 +26,7 @@
from docx.opc.part import PartFactory
from docx.opc.parts.coreprops import CorePropertiesPart
from docx.parts.document import DocumentPart
from docx.parts.footnotes import FootnotesPart
from docx.parts.hdrftr import FooterPart, HeaderPart
from docx.parts.image import ImagePart
from docx.parts.numbering import NumberingPart
@@ -43,6 +44,7 @@ def part_class_selector(content_type: str, reltype: str) -> Type[Part] | None:
PartFactory.part_type_for[CT.OPC_CORE_PROPERTIES] = CorePropertiesPart
PartFactory.part_type_for[CT.WML_DOCUMENT_MAIN] = DocumentPart
PartFactory.part_type_for[CT.WML_FOOTER] = FooterPart
PartFactory.part_type_for[CT.WML_FOOTNOTES] = FootnotesPart
PartFactory.part_type_for[CT.WML_HEADER] = HeaderPart
PartFactory.part_type_for[CT.WML_NUMBERING] = NumberingPart
PartFactory.part_type_for[CT.WML_SETTINGS] = SettingsPart
@@ -53,6 +55,7 @@ def part_class_selector(content_type: str, reltype: str) -> Type[Part] | None:
    CorePropertiesPart,
    DocumentPart,
    FooterPart,
    FootnotesPart,
    HeaderPart,
    NumberingPart,
    PartFactory,
49 changes: 49 additions & 0 deletions src/docx/document.py
@@ -16,6 +16,8 @@
if TYPE_CHECKING:
    import docx.types as t
    from docx.oxml.document import CT_Body, CT_Document
    from docx.oxml.footnote import CT_Footnotes, CT_FtnEnd
    from docx.oxml.text.paragraph import CT_P
    from docx.parts.document import DocumentPart
    from docx.settings import Settings
    from docx.shared import Length
@@ -112,6 +114,11 @@ def core_properties(self):
        """A |CoreProperties| object providing Dublin Core properties of document."""
        return self._part.core_properties

    @property
    def footnotes(self) -> CT_Footnotes:
        """A |Footnotes| object providing access to footnote elements in this document."""
        return self._part.footnotes

    @property
    def inline_shapes(self):
        """The |InlineShapes| collection for this document.
@@ -174,6 +181,10 @@ def tables(self) -> List[Table]:
        """
        return self._body.tables

    def _add_footnote(self, footnote_reference_ids: int) -> CT_FtnEnd:
        """Inserts a newly created footnote to |Footnotes|."""
        return self._part.footnotes.add_footnote(footnote_reference_ids)

    @property
    def _block_width(self) -> Length:
        """A |Length| object specifying the space between margins in last section."""
@@ -187,6 +198,44 @@ def _body(self) -> _Body:
            self.__body = _Body(self._element.body, self)
        return self.__body

    def _calculate_next_footnote_reference_id(self, p: CT_P) -> int:
        """Return the appropriate footnote reference id number for
        a new footnote added at the end of paragraph `p`."""
        # When adding a footnote it can be inserted
        # in front of some other footnotes, so
        # we need to sort footnotes by `footnote_reference_id`
        # in |Footnotes| and in |Paragraph|
        new_fr_id = 1
        # If the paragraph already contains footnotes,
        # append the new footnote at the end with the next reference id.
        if len(p.footnote_reference_ids) > 0:
            new_fr_id = p.footnote_reference_ids[-1] + 1
        # Read the paragraphs containing footnotes and find where the
        # new footnote will be, keeping in mind that the footnotes are
        # sorted by id.
        # The id of the new footnote is the highest id in the nearest preceding
        # paragraph that contains footnotes, incremented by one.
        # If a paragraph with footnotes comes after the new footnote,
        # then increment those footnote ids.
        has_passed_containing_para = False
        for p_i in reversed(range(len(self.paragraphs))):
            # Mark when we pass the paragraph containing the footnote.
            if p is self.paragraphs[p_i]._p:
                has_passed_containing_para = True
                continue
            # Skip paragraphs without footnotes (they don't impact new id).
            if len(self.paragraphs[p_i]._p.footnote_reference_ids) == 0:
                continue
            # These footnotes are after the new footnote, so we increment them.
            if not has_passed_containing_para:
                self.paragraphs[p_i]._increment_containing_footnote_reference_ids()
            else:
                # This is the last footnote before the new footnote, so we use its
                # value to determine the value of the new footnote.
                new_fr_id = max(self.paragraphs[p_i]._p.footnote_reference_ids) + 1
                break
        return new_fr_id


class _Body(BlockItemContainer):
    """Proxy for `<w:body>` element in this document.