Multiline Option with ^ and $ anchors #57
Comments
One more note: in this example
byte[] pattern = "^[a-z]{1,10}$".getBytes();
byte[] str = "ab\nab\n".getBytes();
Regex regex = new Regex(pattern, 0, pattern.length, Option.NONE, UTF8Encoding.INSTANCE, Syntax.ECMAScript);
Matcher matcher = regex.matcher(str);
int result = matcher.search(0, str.length, -Option.MULTILINE);
the result is 3. I think this should also be -1 (not found). |
I'm not familiar with the differences in the ECMAScript support in Joni but perhaps @lopex will have something more to say? It might be worth us digging up some ECMAScript regex tests to verify whether this mode is working as it should. |
I found the official ECMAScript (ECMA-262) test cases, test262, but I did not find them very useful. The V8 tests are much more readable (V8 is Chrome's JavaScript engine; search for files named .*regexp.*js). For example, there are test cases for the multiline flag. |
Hi, did you have any chance to look at this issue? I would like to bring this thread back |
Maybe the syntax settings just need fixing? |
I have checked the oniguruma project and could not find a syntax for ECMA (I believe there is none). There are a lot of different options in this project. Do you have any suggestions on the best approach to preparing a config for ECMA? |
@kmalski I can see it is marked OP2_OPTION_PERL, and that when it sees '^' it will set multi to true and single to false. I am not completely sure of the direction here, but ECMA OR'ing with PERL gives a bunch of default option twiddling in Parser.parseEnclose (look for syntax.op2OptionPerl()). |
@kmalski I think the long term solution would be to remove OP2_OPTION_PERL from ECMA Syntax but this is more complicated since in Parser#parseEnclose we get a lot of behavior from it. As an intermediary step you can update |
You could also just try removing OP2_OPTION_PERL and see if you can see anything break. I suspect yes but _RUBY does not set it and they have many similar features. |
Hi,
I am struggling with the proper configuration of the Option passed to the search method with Syntax.ECMAScript. I would expect that with Option.DEFAULT / Option.NONE, a regex that uses the ^ and $ anchors and contains no explicit newline will fail on input containing a newline character. For example, such a search should return -1 but currently returns 0. Even passing Option.SINGLELINE does not change it. What I did to make this work was to subtract Option.MULTILINE.
I have tested this case with multiple online regex tools and with the JavaScript regex implementation in my browser, and this example always gives no match (as I expect). Only adding the multiline option gives a result similar to the Joni library's.
Setting the syntax to Java works as expected and gives a result similar to the built-in Java regex.
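For comparison, here is a minimal sketch of the built-in java.util.regex behaviour, using the pattern and input from the example in the comments ("^[a-z]{1,10}$" against "ab\nab\n"). The class name and the firstMatchStart helper are made up for illustration. The helper returns -1 when nothing matches, to mirror the search result convention used in this issue.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Joni.
class MultilineAnchorDemo {
    // Start index of the first match, or -1 when there is no match.
    static int firstMatchStart(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        String regex = "^[a-z]{1,10}$";
        String input = "ab\nab\n";

        // No flags: ^ only matches at the start of the input, and $ cannot
        // match before the inner newline, so nothing is found.
        System.out.println(firstMatchStart(regex, 0, input)); // -1

        // MULTILINE: ^ and $ match at line boundaries, so the first line matches.
        System.out.println(firstMatchStart(regex, Pattern.MULTILINE, input)); // 0
    }
}
```

This only shows what the JDK engine does; it says nothing about how Joni resolves its options internally.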
Is the MULTILINE option the default for the library's ECMAScript syntax, and should it be? I was digging into ECMAScript and it looks like multiline = false is the default (the user has to explicitly pass the m flag).
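One detail worth keeping in mind when comparing engines: without the m flag, ECMAScript's $ matches only at the very end of the input, while Java's $ (without MULTILINE) also matches just before a final line terminator. In java.util.regex the \z anchor gives the stricter, end-of-input-only behaviour. The sketch below shows the difference on an input with a single trailing newline; the class and helper names are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Joni.
class EndAnchorDemo {
    // True when the regex matches somewhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "ab\n";

        // Java's $ accepts the position before the final line terminator.
        System.out.println(found("^[a-z]{1,10}$", input)); // true

        // \z matches only at the very end of the input, which is how
        // ECMAScript treats $ when the m flag is not set.
        System.out.println(found("^[a-z]{1,10}\\z", input)); // false
    }
}
```

So an input like "ab\n" can legitimately produce different results under Java syntax and ECMAScript syntax even when multiline is off in both.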