Invalid array length on "large xml" file #20
Comments
I could take a look, but I'd need to be able to reproduce the error first. Can you post the output of npm version? Are you running on Mac, Windows or Linux, and if on Mac, is it Intel, M1 or M2? This doesn't look like a memory limit issue per se, but you could try raising the max memory limit:
const {validateXML, memoryPages} = require('xmllint-wasm');
validateXML({
  ...
  maxMemoryPages: memoryPages.GiB,
});
Mac Apple M1 Pro
I did that and increased to the maximum memory possible before creating this issue (and it didn't work), so you are right, it's not a memory limit issue.
Thanks, I don't have a Mac available but I'll try to reproduce with the other environment specs.
Happy to provide you anything else you might need to help you reproduce it. I attached the XML I used to test.
This commit aims to fix issue #20.

Use the Emscripten FS.writeFile API for accepting XML input files, instead of the createDataFile and especially the intArrayFromString function. Those were inherited from the parent upstream project, but this writeFile API seems to be simpler to use and performs better.

The bigger fix, though, is in the output side, as pushing one piece of stdout (I guess it was pushing one byte at a time?) caused the stdoutBuffer array to eventually grow so large that it'd throw
> RangeError [Error]: Invalid array length
when the output was very big, like when normalizing a big input XML, as described in #20.

Here, too, we can switch to the print/printErr APIs, which seem to be not only simpler but also more resilient to the input size growing.
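To make the output-side change in the commit message above more concrete, here is a minimal sketch of the two collection strategies. It is not the actual xmllint-wasm source: the function names `collectPerByte` and `collectPerLine` are hypothetical, and the sketch assumes the old code stored one array element per output byte (the commit message itself only guesses at this) and that a print-style callback receives whole lines of text.

```javascript
// Hypothetical illustration only; not the actual xmllint-wasm source.

// Per-byte strategy: every output byte becomes its own array element,
// so the array length equals the output size in bytes.
function collectPerByte(output) {
  const stdoutBuffer = [];
  for (const byte of Buffer.from(output, 'utf8')) {
    stdoutBuffer.push(byte);
  }
  return {
    elements: stdoutBuffer.length,
    text: Buffer.from(stdoutBuffer).toString('utf8'),
  };
}

// Line-based strategy, in the spirit of a print callback that is handed
// one line of text at a time: the array grows with the number of lines.
function collectPerLine(output) {
  const lines = [];
  const print = (line) => {
    lines.push(line);
  };
  for (const line of output.split('\n')) {
    print(line);
  }
  return { elements: lines.length, text: lines.join('\n') };
}

const sample = '<root>\n  <item>ä</item>\n</root>';
const perByte = collectPerByte(sample);
const perLine = collectPerLine(sample);

// Both strategies reconstruct the same text, but the per-byte array
// holds one element per byte while the per-line array holds one per line.
console.log(perByte.elements, perLine.elements);
```

With very large output, the per-byte array is the one whose length tracks the output size, which is consistent with the growth problem described in the commit message.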
Thanks for the test XML, I was able to reproduce the issue. I have a possible (not yet very well tested) fix here: v5.0.0-alpha (PR #21). Could you test with this prerelease version, e.g. by installing it with
npm i xmllint-wasm@https://github.com/noppa/xmllint-wasm/releases/download/v5.0.0-alpha/xmllint-wasm.tgz
The solution from #21 works. I have tested with several large XMLs (all UTF-8 encoded) and everything is flawless 👌
Sweet. I'm a bit busy for a few days but will try to test and craft a proper release soonish. Meanwhile, the GitHub alpha release should work fine.
This is also now in npm, with version 5.0.0-rc.0. I'll also test this in prod a bit before I dare to tag that as the latest release.
There seems to be a limit on the size of the XML that the tool can validate. The following error is thrown for a "large XML file":
RangeError [Error]: Invalid array length
I was able to reproduce the issue on an XML file that surpasses 134_217_724 bytes, or in hex, 0x7FFFFFC. Seems like a weird limit though, but pretty sure this is not an xmllint issue, as I'm able to validate the file via command line.
Is it possible to take a look at this issue?
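As a quick sanity check of the numbers quoted above, the snippet below confirms the decimal/hex conversion and shows how far the reported size is from 128 MiB. The variable names are illustrative, and the comparison with 128 MiB is only an arithmetic observation; whether that proximity is related to the error is not established in this thread.

```javascript
// Reported size threshold from the issue text, in bytes.
const reported = 134_217_724;

// Hex representation, to compare against the 0x7FFFFFC quoted in the issue.
const hex = '0x' + reported.toString(16).toUpperCase();

// Distance from 128 MiB (2^27 bytes). This is an observation only.
const distanceTo128MiB = 128 * 1024 * 1024 - reported;

console.log(hex, distanceTo128MiB);
```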