-
Notifications
You must be signed in to change notification settings - Fork 143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevents Locale Code build failures #2967 #2980
Prevents Locale Code build failures #2967 #2980
Conversation
Depending on the GitHub CI workflow executes it will invoke ./gradlew build which will run tests on random parameters everytime, there are local-code's that will break a build one of them being az-Cyrl. The solution here was preventing (explicitly writing the local to be set to en-US) Github actions building on these random parameters that will interupt a build and lose time. Manual testing was done to prove that using this flag breaks a build, you can check the issue opensearch-project#2967 or you can run ./gradlew build -Dtests.local=az-Cyrl to see it breaks Signed-off-by: Brian Flores <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like the wrong fix for the issue.
There's clearly a locale-based bug somewhere but instead of fixing it you're ignoring it for the entire build, even for tests unaffected by the issue. This would miss issues such as #1734
All of the failing tests in the linked issue are in the RestMLPredictionActionTests
class, so if those tests require en-US
locale that should be scoped only to those tests.
However, locale based bugs are very frequently using upper- or lower-case string transformations. This line has a toUpperCase()
transformation without a locale and is very likely the bug that needs fixing by adding Locale.ROOT
. (There may be others.)
ml-commons/plugin/src/main/java/org/opensearch/ml/rest/RestMLPredictionAction.java
Line 132 in 6a6cac1
} else if (FunctionName.isDLModel(FunctionName.from(algorithm.toUpperCase())) && !mlFeatureEnabledSetting.isLocalModelEnabled()) { |
A brief search of the entire repo shows several similar potential locale issues, you should probably fix them all!
- https://github.com/search?q=repo%3Aopensearch-project%2Fml-commons%20touppercase()&type=code
- https://github.com/search?q=repo%3Aopensearch-project%2Fml-commons+tolowercase%28%29&type=code
If you want to go a step further, converting a string by upper/lower casing just for a comparison is inefficient as it creates a whole new string object which is immediately discarded. If done frequently this adds to the heap/GC load. Use .equalsIgnoreCase()
when you can to avoid this extra object creation.
Update: this is the specific bug location, but as noted above you should fix all of them. ml-commons/common/src/main/java/org/opensearch/ml/common/connector/ConnectorAction.java Line 211 in 6a6cac1
|
Hey Dan! Thanks for the insight, I agree with you I did feel skeptical making the change Ill get to work with updating the string methods to use |
The previous commit made a hard change on the build while ignoring the root problem, which was making sure that our codebase currently supports string operations regardless of the locale code. In this new commit String operations like toUpperCase have a extra argument of Locale.Root making the codebase agnostic to the rules of other langugages such as Spanish or Turkish. Manual testing was done like raised in the GitHub issue opensearch-project#2967 also ./gradlew build -Dtests.Local=az-Cyrl passes Signed-off-by: Brian Flores <[email protected]>
Thanks for making the changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Someone should probably do an org-wide search for these. :)
(Edit: just did 172 of them! Holy cow!)
* Prevents future UT build issues addresses #2967 Depending on the GitHub CI workflow executes it will invoke ./gradlew build which will run tests on random parameters everytime, there are local-code's that will break a build one of them being az-Cyrl. The solution here was preventing (explicitly writing the local to be set to en-US) Github actions building on these random parameters that will interupt a build and lose time. Manual testing was done to prove that using this flag breaks a build, you can check the issue #2967 or you can run ./gradlew build -Dtests.local=az-Cyrl to see it breaks Signed-off-by: Brian Flores <[email protected]> * Updates current codebase with Local.Root to string operations The previous commit made a hard change on the build while ignoring the root problem, which was making sure that our codebase currently supports string operations regardless of the locale code. In this new commit String operations like toUpperCase have a extra argument of Locale.Root making the codebase agnostic to the rules of other langugages such as Spanish or Turkish. Manual testing was done like raised in the GitHub issue #2967 also ./gradlew build -Dtests.Local=az-Cyrl passes Signed-off-by: Brian Flores <[email protected]> --------- Signed-off-by: Brian Flores <[email protected]> (cherry picked from commit cd83590)
Thanks Dan, can you create issues for the impacted repos ? |
Yep, sometime next week. Filtering to just Java files it's only 66. |
* Prevents future UT build issues addresses #2967 Depending on the GitHub CI workflow executes it will invoke ./gradlew build which will run tests on random parameters everytime, there are local-code's that will break a build one of them being az-Cyrl. The solution here was preventing (explicitly writing the local to be set to en-US) Github actions building on these random parameters that will interupt a build and lose time. Manual testing was done to prove that using this flag breaks a build, you can check the issue #2967 or you can run ./gradlew build -Dtests.local=az-Cyrl to see it breaks Signed-off-by: Brian Flores <[email protected]> * Updates current codebase with Local.Root to string operations The previous commit made a hard change on the build while ignoring the root problem, which was making sure that our codebase currently supports string operations regardless of the locale code. In this new commit String operations like toUpperCase have a extra argument of Locale.Root making the codebase agnostic to the rules of other langugages such as Spanish or Turkish. Manual testing was done like raised in the GitHub issue #2967 also ./gradlew build -Dtests.Local=az-Cyrl passes Signed-off-by: Brian Flores <[email protected]> --------- Signed-off-by: Brian Flores <[email protected]> (cherry picked from commit cd83590) Co-authored-by: Brian Flores <[email protected]>
Description
Depending on the GitHub CI workflow executes it will invoke ./gradlew build which will run tests on random parameters every time, there are locale-code's that will break a build one of them being az-Cyrl. The solution was to update the codebase that involved string operations such as
toUpperCase(), toLowerCase()
totoUpperCase(Locale.Root), toLowerCase(Locale.Root)
In doing this we make the string operations language agnostic meaning it wont be affected on a computer that is set to a different language like Spanish or Turkish.Testing
./gradlew build -DTests.Locale=az-Cyrl
which previously failed now works./gradlew build
builds successfully too.TLDR:
Locale.Root
as a argument to avoid default Locale (which may depend on one that is not supported) you can see this here .Related Issues
Resolves #2967
Check List
--signoff
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.