Use Composer/Pcre Part 1 of Many #4323
Merged
+384
−188
The native preg functions (preg_match, preg_replace, etc.) often force us to add useless boilerplate code just to satisfy PHPStan, Scrutinizer, etc. Composer/Pcre lets us remove that boilerplate, giving us a cleaner codebase. I tried it on a few modules, and the result demonstrated a benefit beyond the cleaner code: it exposed a previously undetected error.
Sample 22_Reader_issue1767 reads an Xlsx spreadsheet with complex sheet names used in defined names, and writes it to Xlsx and Xls output files. When I changed Writer/Xls/Parser to use Composer/Pcre, this sample failed in many different places while writing the Xls file. It turns out that some regexes were failing not because the string didn't match, but because the regex hit "catastrophic backtracking". Composer/Pcre throws an exception when this happens; the native preg_match returns false, but we were not checking for that. The regexes in question have been changed to ones that work, and formal unit tests have been added for them. Finding this previously undetected error is a strong argument for proceeding with this change.
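To illustrate how such a failure can go unnoticed, here is a minimal sketch using only the native functions. The pattern and subject are illustrative, not the regexes from Writer/Xls/Parser, and the two ini_set calls (JIT disabled, a deliberately tiny backtrack limit) are assumptions made only to keep the demonstration reproducible. It also assumes PHP 8.0 or later for preg_last_error_msg.

```php
<?php
// Assumptions for a reproducible demonstration: disable the PCRE JIT and use a
// deliberately tiny backtrack limit. Neither setting comes from the PR.
ini_set('pcre.jit', '0');
ini_set('pcre.backtrack_limit', '100');

// Illustrative pattern with nested quantifiers (not one of the Parser regexes).
$pattern = '/(?:\D+|<\d+>)*[!?]/';
$subject = 'foobar foobar foobar';

$result = preg_match($pattern, $subject);

// Unchecked style: false is falsy, so an engine error looks like "no match".
$looksLikeNoMatch = !$result;

// Checked style: false means the engine failed, 0 means a genuine non-match.
if ($result === false) {
    $outcome = 'error: ' . preg_last_error_msg();
} elseif ($result === 0) {
    $outcome = 'no match';
} else {
    $outcome = 'match';
}

echo $outcome, PHP_EOL;
```

Code that only tests the return value for truthiness takes the same branch for an engine error as for a non-match, which is the situation described above. A wrapper that throws on false makes the distinction impossible to miss.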
An alternative to using Composer/Pcre would be to test for false after every preg call. I have done this in the two samples changed with this PR. That is adequate for a small number of changes, but given the large number of regexps in our code, it would just add more clutter. I think Composer/Pcre is the better choice.
The change isn't entirely transparent. Composer/Pcre forces all regexps to use PREG_UNMATCHED_AS_NULL, so some unmatched fields will now be null instead of an empty string (or instead of being absent, if the unmatched field comes at the end). Our test suite doesn't report any problem (yet) due to this change, although PHPStan is sensitive to it. Several PHPStan annotations were eliminated as a result.
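The difference in match arrays can be seen with the native functions alone by passing the flag explicitly. This is a small sketch with made-up patterns, not code from the PR, and the trailing-group behavior shown assumes PHP 7.4 or later.

```php
<?php
// Illustrative pattern: group 2 is optional and does not participate for 'ac'.
$pattern = '/(a)(b)?(c)/';

preg_match($pattern, 'ac', $default);
preg_match($pattern, 'ac', $asNull, PREG_UNMATCHED_AS_NULL);

var_dump($default[2] === '');   // bool(true): unmatched middle group is an empty string
var_dump($asNull[2] === null);  // bool(true): with the flag it is null

// Trailing case: the optional group is the last one in the pattern.
$trailing = '/(a)(b)?/';

preg_match($trailing, 'a', $defaultTail);
preg_match($trailing, 'a', $asNullTail, PREG_UNMATCHED_AS_NULL);

var_dump(array_key_exists(2, $defaultTail)); // bool(false): key is absent by default
var_dump(array_key_exists(2, $asNullTail));  // bool(true): key is present and null (PHP 7.4+)
```

Code that compares a match field to an empty string, or that uses isset on a trailing field, may therefore behave differently once the flag is always on.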
It is not necessary to do this all at once. This PR addresses all the calls in Writer. I intend to address other components in several tickets.