Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error: EBADF: bad file descriptor, read when parsing over Open.file() #104

Closed
ThePlenkov opened this issue Jan 29, 2019 · 15 comments
Closed

Comments

@ThePlenkov
Copy link

Dear @ZJONSSON

Thank you for Open.file() which saves actually a lot of time . However most of times I have a very strange error. The problem is - it's very random, always raising after different files:

events.js:167
      throw er; // Unhandled 'error' event
      ^

Error: EBADF: bad file descriptor, read
Emitted 'error' event at:
    at lazyFs.read (internal/fs/streams.js:165:12)
    at FSReqWrap.wrapper [as oncomplete] (fs.js:463:17)

Here is the sample code how I use it:

/* eslint-disable no-console */
const unzipper = require("unzipper");
const path = require("path");
const fs = require("fs");

(async () => {
  let oZip = await unzipper.Open.file("./sapui5-rt-1.44.42.zip");

  // extract theme source files
  let aFilesToExtract = oZip.files.filter(oFile =>
    /(?<=\/sap_belize.+)\.less$/.test(oFile.path)
  );

  while (aFilesToExtract.length) {
    let oFile = aFilesToExtract.shift();
    console.log(`Extracting: ${oFile.path}`);

    //keeping the reference
    try {
      await new Promise((resolve, reject) => {
        let sDir = path.dirname(oFile.path);
        if (!fs.existsSync(sDir)) {
          fs.mkdirSync(sDir, { recursive: true });
        }

        let oFileNew = fs.createWriteStream(oFile.path);

        oFile
          .stream()
          .pipe(oFileNew)
          .on("finish", resolve)
          .on("error", reject);
      });
    } catch (error) {
      console.error(error);
      //do something
    }
  }
})();

this is the reference to the file: https://tools.hana.ondemand.com/additional/sapui5-rt-1.44.42.zip

Do you have any ideas what it can be?

THank you!

@ThePlenkov
Copy link
Author

it looks like the problem is in using promise inside the while. I just tested with the simple forEach and it works fine. I wanted to prevent memory leakage. But it seems that's the reason.

@DavinAhn
Copy link

Dear @ThePlenkov

I had same the problem.

The way I solved was through comment in Node source (based on 10.15.0).
With reference to the comment, I kept stream reference for a period of time, and the problem no longer occurred.

Try modifying your code like this:

        const stream = oFile
          .stream()
          .pipe(oFileNew)
          .on("finish", () => {
            setTimeout(() => stream, 100); // Below 70, the problem still occurred.
            resolve();
          })
          .on("error", reject);

But, I do not know if the magic number 100 of the timeout is valid in all environments.

Thank you for reading my comment.

@ThePlenkov
Copy link
Author

@DavinAhn haha, nice!))

@gfx
Copy link

gfx commented Apr 25, 2019

I have the same issue and I believe the issue should be open until solved.

@danielweck
Copy link

I am able to reproduce this random unzipper crash too. I execute the same routines with yauzl and node-zip-stream as part of a benchmark suite, and only unzipper exhibits this behaviour.

@michalczukm
Copy link

Same story here. @danielweck can you share example with yauzl?

@danielweck
Copy link

Stacktrace / debug info:

events.js:174
      throw er; // Unhandled 'error' event
      ^

Error: EBADF: bad file descriptor, read
Emitted 'error' event at:
    at lazyFs.read (internal/fs/streams.js:165:12)
    at FSReqWrap.wrapper [as oncomplete] (fs.js:467:17)

node --version
=>
v10.16.3

https://github.com/nodejs/node/blob/v10.16.3/lib/fs.js#L467

  function wrapper(err, bytesRead) {
    // Retain a reference to buffer so that it can't be GC'ed too soon.
    callback && callback(err, bytesRead || 0, buffer);
  }

https://github.com/nodejs/node/blob/v10.16.3/lib/internal/fs/streams.js#L165

  // the actual read.
  lazyFs().read(this.fd, pool, pool.used, toRead, this.pos, (er, bytesRead) => {
    if (er) {
      if (this.autoClose) {
        this.destroy();
      }
      this.emit('error', er);
    }

@ZJONSSON
Copy link
Owner

Thanks @danielweck - can you give me more detail on how I can also replicate the problem on my side, i.e. your test suite, do you have a branch or a commit hash I can run tests that fail? Would be really valuable to get to the bottom of this and fix it.

It's worth noting that the unzipper.Parse is the legacy implementation within unzipper. The Open methods are far more robust, stream only the files of interest and properly use the information in the central directory.

@danielweck
Copy link

See #157

@danielweck
Copy link

Same story here. @danielweck can you share example with yauzl?

Here is a repro script which can be used to compare unzipper (the only lib that crashes in my tests), yauzl and node-zip-stream:
#157

I should point out that I tried various process.nextTick() and setTimeout() techniques in order to prevent premature garbage collection of unzipper streams (in fact, I even tried storing the stream references in a global array ... to no avail):

file: function(filename, options) {
var source = {
stream: function(offset,length) {
return fs.createReadStream(filename,{start: offset, end: length && offset+length});
},

The bug does not occur consistently, and is therefore hard to reproduce. Which is why a repro script comes-in handy.

I suspect that devs who tried the setTimeout() technique "successfully" (as mentioned in this discussion thread), didn't stress-test hard enough, resulting in a false sense of success. Using a script to automate the opening/closing of ZIP entry streams, I am able to reproduce the bug quite regularly.

On the other hand, I have never managed to make yauzl and node-zip-stream crash ... and, I am sorry to say ( :( ), these libs deliver better performance (faster processing) than unzipper. I use a benchmark script very similar to the one in my Pull Request in order to collect performance metrics, across a variety of ZIP file (i.e. many small ZIP entries, vs. few large ZIP entries). My ZIP files are EPUB publications, by the way.
The processing tasks I test for are:

  • ZIP directory parse (so I can read and compare the CRC fields supplied by the ZIP headers)
  • ZIP entry streaming (typically, to extract entire resources into full-length buffers, but also for "streaming" audio and video files, e.g. to respond to HTTP partial range requests)

@danielweck
Copy link

danielweck commented Oct 17, 2019

Just a thought: unzipper uses graceful-fs instead of Node's built-in fs module.

return fs.createReadStream(filename,{start: offset, end: length && offset+length});

Conversely, yauzl and node-stream-zip use regular fs:

https://github.com/antelle/node-stream-zip/blob/master/node_stream_zip.js#L165

https://github.com/thejoshwolfe/yauzl/blob/02a5ca69c7713f6d2897cc02f2acc1df21093e3d/index.js#L33

graceful-fs functions as a drop-in replacement for the fs module, making various improvements.
The improvements are meant to normalize behavior across different platforms and environments, and to make filesystem access more resilient to errors.

https://github.com/isaacs/node-graceful-fs

@danielweck
Copy link

Quick follow-up: I've replace var fs = require('graceful-fs'); with var fs = require('fs'); in lib/Open/index.js and I was able to reproduce the crash pretty much immediately, so this change makes no difference.

node "issue-104.js" "./testData/accessible_epub_3.epub" "500" "1"
process.cwd():
/xx/node-unzipper
__dirname:
/xx/node-unzipper
args:
[ './testData/accessible_epub_3.epub', '500', '1' ]
./testData/accessible_epub_3.epub
=====================================
./testData/accessible_epub_3.epub
=====================================
1/500 ......................................
2/500 ......................................
...
...
111/500 ......................................
112/500 ..................events.js:174
      throw er; // Unhandled 'error' event
      ^

Error: EBADF: bad file descriptor, read
Emitted 'error' event at:
    at lazyFs.read (internal/fs/streams.js:165:12)
    at FSReqWrap.wrapper [as oncomplete] (fs.js:467:17)

@danielweck
Copy link

I updated my pull request with a possible fix / workaround:
#157 (comment)
The key is to avoid using fs.createReadStream() (graceful-fs or regular fs).

@jdmarshall
Copy link

jdmarshall commented Oct 25, 2019

Any workarounds without changes to the library?
I'm seeing this issue with Open.file() as well.

@danielweck
Copy link

Based on my experience ; i.e. empirical results from stress-testing the unzipper lib using an automation script, see #157 ; the problem seems to be related to how/when garbage collection disposes of fs.createReadStream() ... which is why I experimented in my PR with an alternative method that requires manually closing the zip stream. This seems to solve the problem, but it is a breaking API / lifecycle change (albeit, one that is aligned with the other two well-known zip "streaming" libs: yauzl and node-stream-zip).

seebees added a commit to seebees/aws-encryption-sdk-javascript that referenced this issue Feb 27, 2020
`unzipper` would throw an FD error occationaly.
Relevant issue: ZJONSSON/node-unzipper#104
There is a pending PR,
but it seemed simpler to simply move to `yauzl`.

The error:

```
events.js:174
      throw er; // Unhandled 'error' event
      ^

Error: EBADF: bad file descriptor, read
Emitted 'error' event at:
    at lazyFs.read (internal/fs/streams.js:165:12)
    at FSReqWrap.wrapper [as oncomplete] (fs.js:467:17)
```
seebees added a commit to seebees/aws-encryption-sdk-javascript that referenced this issue Feb 27, 2020
`unzipper` would throw an FD error occasionally.
Relevant issue: ZJONSSON/node-unzipper#104
There is a pending PR,
but it seemed simpler to simply move to `yauzl`.

The error:

```
events.js:174
      throw er; // Unhandled 'error' event
      ^

Error: EBADF: bad file descriptor, read
Emitted 'error' event at:
    at lazyFs.read (internal/fs/streams.js:165:12)
    at FSReqWrap.wrapper [as oncomplete] (fs.js:467:17)
```
seebees added a commit to seebees/aws-encryption-sdk-javascript that referenced this issue Feb 27, 2020
`unzipper` would throw an FD error occasionally.
Relevant issue: ZJONSSON/node-unzipper#104
There is a pending PR,
but it seemed simpler to simply move to `yauzl`.

The error:

```
events.js:174
      throw er; // Unhandled 'error' event
      ^

Error: EBADF: bad file descriptor, read
Emitted 'error' event at:
    at lazyFs.read (internal/fs/streams.js:165:12)
    at FSReqWrap.wrapper [as oncomplete] (fs.js:467:17)
```
seebees added a commit to aws/aws-encryption-sdk-javascript that referenced this issue Feb 27, 2020
`unzipper` would throw an FD error occasionally.
Relevant issue: ZJONSSON/node-unzipper#104
There is a pending PR,
but it seemed simpler to simply move to `yauzl`.

The error:

```
events.js:174
      throw er; // Unhandled 'error' event
      ^

Error: EBADF: bad file descriptor, read
Emitted 'error' event at:
    at lazyFs.read (internal/fs/streams.js:165:12)
    at FSReqWrap.wrapper [as oncomplete] (fs.js:467:17)
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants