Reference Path with unicode doesn't validate #23

Open
derek-pryor opened this issue Apr 24, 2019 · 5 comments

@derek-pryor

If a Reference Path contains a Unicode character, statelint gives a validation error. I can create the following definition in the AWS console without error.

{
   "States": {

      "Choice": {
         "Default": "Finish", 
         "Type": "Choice", 
         "Choices": [
            {
               "Variable": "$['£']", 
               "StringEquals": "√", 
               "Next": "Finish"
            },
            {
               "Variable": "$.£", 
               "StringEquals": "√", 
               "Next": "Finish"
            }
         ]
      }, 
      "Finish": {
         "Type": "Succeed"
      }
   }, 
   "StartAt": "Choice"
}
 State Machine.States.Choice.Choices[0].Variable is "$['£']" but should be a Reference Path
 State Machine.States.Choice.Choices[1].Variable is "$.£" but should be a Reference Path

Similar to #17

@timbray
Contributor

timbray commented Sep 6, 2019

Sorry, there's a notification problem of some sort and I didn't get alerted about this issue. Sounds like a bug, will have a look.

@nikita-sheremet-clearscale

@timbray
Have you checked the issue? It would be great if you could give some feedback. Does the AWS team plan to fix this in the future?

Thanks.

@wong-a
Collaborator

wong-a commented Sep 24, 2020

This appears to be a problem in j2119's JSONPathChecker, which uses a regex to validate JSONPaths.

https://github.com/awslabs/j2119/blob/038f661b7491329ae18986ed9b01b8dd1f18a978/lib/j2119/json_path_checker.rb#L26

@wong-a wong-a added the bug label Sep 24, 2020
@timbray
Contributor

timbray commented Sep 24, 2020

Hmm, in J2119's json_path_checker.rb, the RE is built using regex character classes like Lt and Mc, which I thought were supposed to be Unicode-correct. But looking at https://ruby-doc.org/core-2.5.1/Regexp.html, I guess this should use the POSIX classes like /[[:alpha:]]/

@timbray
Contributor

timbray commented Sep 24, 2020

Hold on, further down it says “A Unicode character's General Category value can also be matched with \p{Ab} where Ab is the category's abbreviation as described below:” so it should be OK. Looking further…

The pattern here is ['£'], so it's complaining that £ doesn't match any of 'Lu', 'Ll', 'Lt', 'Lm', 'Lo', 'Nl'. That regex insists that the first character be a letter or letter-number, and £ is neither. But now I think the idea is just wrong, because in a JSON path when I say $.foo['X'] I can put really any Unicode string where X is. So I don't see why bracket_step requires name_re in between the [' and ']; any old string should be fine. Of course ' has to be escaped if you want to use it.
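
The category check described above can be reproduced in a few lines of Ruby. This is only a sketch built from the category list quoted in the comment, not the actual j2119 source; `INITIAL_CLASSES`, `classes_to_re`, and `initial_name_char?` are names made up for illustration.

```ruby
# Categories quoted in the comment above as the allowed first characters.
# (Assumption: taken from the comment, not copied from j2119.)
INITIAL_CLASSES = %w[Lu Ll Lt Lm Lo Nl].freeze

# Build a character class such as [\p{Lu}\p{Ll}...] from the abbreviations.
def classes_to_re(classes)
  "[#{classes.map { |c| "\\p{#{c}}" }.join}]"
end

# Anchored so that exactly one character is tested.
INITIAL_RE = Regexp.new("\\A#{classes_to_re(INITIAL_CLASSES)}\\z")

# Hypothetical helper: true if ch may start a name under the rule above.
def initial_name_char?(ch)
  INITIAL_RE.match?(ch)
end

initial_name_char?('f')  # => true  (Ll, lowercase letter)
initial_name_char?('£')  # => false (Sc, currency symbol)
'£'.match?(/\p{Sc}/)     # => true
```

So the `\p{..}` classes themselves are Unicode-aware; £ is rejected because it is a currency symbol (Sc), which is not in the list.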

Sorry, still thinking. I guess we used name_re because we were thinking of things that can be JavaScript names. But in arbitrary user data, JSON object field names can be any string.
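
One possible shape for a relaxed bracket step, as a minimal sketch: it accepts any characters between [' and '], and assumes that a literal ' or \ inside the quotes is escaped with a backslash. The escaping convention and the names `RELAXED_BRACKET_STEP`, `RELAXED_PATH`, and `relaxed_bracket_path?` are assumptions for illustration, not j2119 code, and the sketch covers bracket steps only (dot steps such as $.£ are out of scope here).

```ruby
# Hypothetical relaxed bracket step: any run of characters between [' and '],
# where a literal ' or \ must be backslash-escaped (assumed convention).
RELAXED_BRACKET_STEP = /\['(?:[^'\\]|\\.)*'\]/

# Hypothetical path check: "$" followed by zero or more relaxed bracket steps.
RELAXED_PATH = /\A\$(?:#{RELAXED_BRACKET_STEP.source})*\z/

def relaxed_bracket_path?(path)
  RELAXED_PATH.match?(path)
end

relaxed_bracket_path?("$['£']")        # => true
relaxed_bracket_path?("$['it\\'s']")   # => true  (quote is escaped)
relaxed_bracket_path?("$['it's']")     # => false (unescaped quote)
```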
