Allow name-char as first character of unquoted-literal #990

Merged
merged 3 commits into from
Feb 12, 2025

14 changes: 1 addition & 13 deletions spec/functions/datetime.md
@@ -74,19 +74,7 @@ and what format to use for that field.
> [!NOTE]
> _Field options_ do not have default values because they are only to be used
> to compose the formatter.

-The _field options_ are defined as follows:
-
-> [!IMPORTANT]
-> The value `2-digit` for some _field options_ MUST be quoted
-> in the MessageFormat syntax because it starts with a digit
-> but does not match the `number-literal` production in the ABNF.
->
-> ```
-> .local $correct = {$someDate :datetime year=|2-digit|}
-> .local $syntaxError = {$someDate :datetime year=2-digit}
-> ```
-
-The function `:datetime` has the following options:
+The function `:datetime` has the following _field options_:

- `weekday`
- `long`
8 changes: 6 additions & 2 deletions spec/functions/number.md
@@ -655,9 +655,13 @@ Implementations MUST NOT substitute the unit without performing the associated c
### Number Operands

The _operand_ of a number function is either an implementation-defined type or
-a literal whose contents match the `number-literal` production in the [ABNF](/spec/message.abnf).
+a literal whose contents match the following `number-literal` production.
All other values produce a _Bad Operand_ error.
+
+```abnf
+number-literal = ["-"] (%x30 / (%x31-39 *DIGIT)) ["." 1*DIGIT] [%i"e" ["-" / "+"] 1*DIGIT]
+```

> For example, in Java, any subclass of `java.lang.Number` plus the primitive
> types (`byte`, `short`, `int`, `long`, `float`, `double`, etc.)
> might be considered as the "implementation-defined numeric types".
@@ -667,7 +671,7 @@ All other values produce a _Bad Operand_ error.
> [!NOTE]
> String values passed as variables in the _formatting context_'s
> _input mapping_ can be formatted as numeric values as long as their
-> contents match the `number-literal` production in the [ABNF](/spec/message.abnf).
+> contents match the `number-literal` production.
>
> For example, if the value of the variable `num` were the string
> `-1234.567`, it would behave identically to the local
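Because `number-literal` now lives in the number function spec rather than in the message syntax, a function implementation has to validate literal operands itself. The following is a minimal Python sketch that transcribes the `number-literal` ABNF above into a regular expression. The names `NUMBER_LITERAL` and `is_number_literal` are hypothetical and not part of any MessageFormat API.

```python
import re

# Transcription of the ABNF shown in this diff:
#   ["-"] (%x30 / (%x31-39 *DIGIT)) ["." 1*DIGIT] [%i"e" ["-" / "+"] 1*DIGIT]
# [0-9] is used instead of \d so that only ASCII digits match.
NUMBER_LITERAL = re.compile(
    r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][-+]?[0-9]+)?"
)


def is_number_literal(text: str) -> bool:
    """Return True if the whole string matches the number-literal production."""
    return NUMBER_LITERAL.fullmatch(text) is not None
```

Under this sketch, a string operand such as `-1234.567` or `0.42e+1` would be accepted, while `2-digit`, `01`, or `+1` would lead to a _Bad Operand_ error.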
4 changes: 1 addition & 3 deletions spec/message.abnf
@@ -41,9 +41,7 @@ variable = "$" name

literal = quoted-literal / unquoted-literal
quoted-literal = "|" *(quoted-char / escaped-char) "|"
-unquoted-literal = name / number-literal
-; number-literal matches JSON number (https://www.rfc-editor.org/rfc/rfc8259#section-6)
-number-literal = ["-"] (%x30 / (%x31-39 *DIGIT)) ["." 1*DIGIT] [%i"e" ["-" / "+"] 1*DIGIT]
+unquoted-literal = 1*name-char
aphillips marked this conversation as resolved.

; Keywords; Note that these are case-sensitive
input = %s".input"
10 changes: 4 additions & 6 deletions spec/syntax.md
@@ -735,19 +735,17 @@ escaped as `\\` and `\|`.
An **_<dfn>unquoted literal</dfn>_** is a _literal_ that does not require the `|`
quotes around it to be distinct from the rest of the _message_ syntax.
An _unquoted literal_ MAY be used when the content of the _literal_
-contains no whitespace and otherwise matches the `unquoted` production.
+contains no whitespace and otherwise matches the `unquoted-literal` production.
Implementations MUST NOT distinguish between _quoted literals_ and _unquoted literals_
that have the same sequence of code points.

-_Unquoted literals_ can contain a _name_ or consist of a _number-literal_.
-A _number-literal_ uses the same syntax as JSON and is intended for the encoding
-of number values in _operands_ or _options_, or as _keys_ for _variants_.
+_Unquoted literals_ can contain any characters also valid in _name_,
+less _name_'s additional restrictions on the first character.

```abnf
literal = quoted-literal / unquoted-literal
quoted-literal = "|" *(quoted-char / escaped-char) "|"
-unquoted-literal = name / number-literal
-number-literal = ["-"] (%x30 / (%x31-39 *DIGIT)) ["." 1*DIGIT] [%i"e" ["-" / "+"] 1*DIGIT]
Collaborator:
Yet another step towards making everything a string :-(

Member:
I think that comment reflects a confusion: it assumes that quoted literals mean strings, when the two are completely separate.

The main syntax shouldn't talk about what format literal numbers are, or what format literal dates are, or what format literal units are; that's up to the functions.

+unquoted-literal = 1*name-char
```
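
As a rough illustration of why `2-digit` and `0.42` are now valid unquoted literals while `0.42e+1` is not, here is a minimal Python sketch of the new `unquoted-literal = 1*name-char` rule. The `name-char` ranges follow the production quoted in this PR's discussion. `name-start` is not shown in this diff, so the sketch approximates it as ASCII letters plus `_`, which is an assumption and narrower than the real grammar. The function names are hypothetical.

```python
def is_name_char(ch: str) -> bool:
    """Approximate check for the name-char production.

    name-char = name-start / DIGIT / "-" / "." / %xB7 / %x300-36F / %x203F-2040
    ASSUMPTION: name-start is reduced to ASCII letters and "_" here;
    the real production also covers many non-ASCII ranges.
    """
    cp = ord(ch)
    return (
        (ch.isascii() and (ch.isalnum() or ch in "_-."))
        or cp == 0xB7
        or 0x300 <= cp <= 0x36F
        or 0x203F <= cp <= 0x2040
    )


def is_unquoted_literal(text: str) -> bool:
    """unquoted-literal = 1*name-char (one or more name-char code points)."""
    return len(text) > 0 and all(is_name_char(ch) for ch in text)
```

With this rule, `+` is the character that forces quoting: `0.42e+1` contains it and so has to be written as `|0.42e+1|`, whereas `0E-1` still passes because `-` is a `name-char`.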

### Names and Identifiers
2 changes: 1 addition & 1 deletion test/tests/functions/datetime.json
@@ -45,7 +45,7 @@
"src": "{|2006-01-02T15:04:06| :datetime}"
},
{
-"src": "{|2006-01-02T15:04:06| :datetime year=numeric month=|2-digit|}"
+"src": "{|2006-01-02T15:04:06| :datetime year=numeric month=2-digit}"
Collaborator:
With this change the parsers now need a pretty big lookahead.

Before, it was enough to look at the first character:
| => quoted string
0-9 or '-' => try to get a number literal
name-start character => literal
anything else => Error

Collaborator:
How does that change? Anywhere that literal is valid, single-character lookahead still suffices.

  • | → quoted-literal
  • $ → variable
  • : → function
  • * → key
  • # or / → (markup)
  • name-char → unquoted-literal
  • anything else → error

Member:
I agree with @gibson042.

It is trivial to check the first character. A number literal can only start with one of 11 characters: - or 0..9. All of these are contained in name-char:

name-char  = name-start / DIGIT / "-" / "."
           / %xB7 / %x300-36F / %x203F-2040

So the first character is enough to send you down the path.

Of course, once you start down the path to any of these outcomes, you could end up with something bogus.
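
The single-character dispatch described in this thread could be sketched in Python roughly as follows. This is an illustration only, not code from any MessageFormat implementation. `classify_first_char` and `is_name_char` are hypothetical names, and `name-start` is approximated as ASCII letters plus `_` because its full definition is not shown in this diff.

```python
def is_name_char(ch: str) -> bool:
    """name-char = name-start / DIGIT / "-" / "." / %xB7 / %x300-36F / %x203F-2040

    ASSUMPTION: name-start is approximated as ASCII letters and "_".
    """
    cp = ord(ch)
    return (
        (ch.isascii() and (ch.isalnum() or ch in "_-."))
        or cp == 0xB7
        or 0x300 <= cp <= 0x36F
        or 0x203F <= cp <= 0x2040
    )


def classify_first_char(ch: str) -> str:
    """Pick a parse path from one character of lookahead."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    if ch == "|":
        return "quoted-literal"
    if ch == "$":
        return "variable"
    if ch == ":":
        return "function"
    if ch == "*":
        return "key"
    if ch in "#/":
        return "markup"
    if is_name_char(ch):
        return "unquoted-literal"
    return "error"
```

All 11 characters that can start a number literal (`-` and `0` through `9`) fall into the `unquoted-literal` branch, so the first character alone still selects the path. Whether the remaining characters form something a function accepts is decided later.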

},
{
"src": "{|2006-01-02T15:04:06| :datetime dateStyle=long}"
6 changes: 5 additions & 1 deletion test/tests/functions/integer.json
@@ -16,7 +16,11 @@
"exp": "hello -4"
},
{
-"src": "hello {0.42e+1 :integer}",
+"src": "hello {0.42 :integer}",
+"exp": "hello 0"
+},
+{
+"src": "hello {|0.42e+1| :integer}",
"exp": "hello 4"
},
{
6 changes: 5 additions & 1 deletion test/tests/functions/number.json
@@ -16,7 +16,11 @@
"exp": "hello -4.2"
},
{
-"src": "hello {0.42e+1 :number}",
+"src": "hello {0.42 :number}",
+"exp": "hello 0.42"
+},
+{
+"src": "hello {|0.42e+1| :number}",
"exp": "hello 4.2"
},
{
28 changes: 14 additions & 14 deletions test/tests/syntax.json
@@ -421,72 +421,72 @@
]
},
{
-"description": "... literal -> quoted-literal -> \"|\" \"|\" ...",
+"description": "... quoted-literal",
"src": "{||}",
"exp": ""
},
{
-"description": "... quoted-literal -> \"|\" quoted-char \"|\"",
+"description": "... quoted-literal",
"src": "{|a|}",
"exp": "a"
},
{
-"description": "... quoted-literal -> \"|\" escaped-char \"|\"",
+"description": "... quoted-literal",
"src": "{|\\\\|}",
"exp": "\\"
},
{
-"description": "... quoted-literal -> \"|\" quoted-char 1*escaped-char \"|\"",
+"description": "... quoted-literal",
"src": "{|a\\\\\\{\\|\\}|}",
"exp": "a\\{|}"
},
{
-"description": "... unquoted-literal -> number-literal -> %x30",
+"description": "... unquoted-literal",
"src": "{0}",
"exp": "0"
},
{
-"description": "... unquoted-literal -> number-literal -> \"-\" %x30",
+"description": "... unquoted-literal",
"src": "{-0}",
"exp": "-0"
},
{
-"description": "... unquoted-literal -> number-literal -> (%x31-39 *DIGIT) -> %x31",
+"description": "... unquoted-literal",
"src": "{1}",
"exp": "1"
},
{
-"description": "... unquoted-literal -> number-literal -> (%x31-39 *DIGIT) -> %x31 DIGIT -> 11",
+"description": "... unquoted-literal",
"src": "{11}",
"exp": "11"
},
{
-"description": "... unquoted-literal -> number-literal -> %x30 \".\" 1*DIGIT -> 0 \".\" 1",
+"description": "... unquoted-literal",
"src": "{0.1}",
"exp": "0.1"
},
{
-"description": "... unquoted-literal -> number-literal -> %x30 \".\" 1*DIGIT -> %x30 \".\" DIGIT DIGIT -> 0 \".\" 1 2",
+"description": "... unquoted-literal",
"src": "{0.12}",
"exp": "0.12"
},
{
-"description": "... unquoted-literal -> number-literal -> %x30 %i\"e\" 1*DIGIT -> %x30 \"e\" DIGIT",
+"description": "... unquoted-literal",
"src": "{0e1}",
"exp": "0e1"
},
{
-"description": "... unquoted-literal -> number-literal -> %x30 %i\"e\" 1*DIGIT -> %x30 \"E\" DIGIT",
+"description": "... unquoted-literal",
"src": "{0E1}",
"exp": "0E1"
},
{
-"description": "... unquoted-literal -> number-literal -> %x30 %i\"e\" \"-\" 1*DIGIT ...",
+"description": "... unquoted-literal",
"src": "{0E-1}",
"exp": "0E-1"
},
{
-"description": "... unquoted-literal -> number-literal -> %x30 %i\"e\" \"+\" 1*DIGIT ...",
+"description": "... unquoted-literal",
"src": "{0E-1}",
"exp": "0E-1"
},