This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

major bottleneck in the usage of write-file-atomic #1785

Closed
mcollina opened this issue Dec 18, 2018 · 5 comments
Labels
exp/expert (Having worked on the specific codebase is important), P2 (Medium: Good to have, but can wait until someone steps up)

Comments

@mcollina

  • Version: master
  • Platform: Mac OS X
  • Subsystem: unix-fs

I think I have identified the source of the majority of the latency (and thus GC) issues in js-ipfs. As noted in ipfs/js-ipfs-repo#183, js-ipfs needs to make sure that files are properly written to disk to avoid missing-file issues in case of a crash; this behavior also matches the Go implementation.

The write-file-atomic module on npm is used to implement this behavior. In our latest benchmarks, it is the major bottleneck in both the synchronous and asynchronous interactions of ipfs.
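For context, here is a minimal sketch of the conventional atomic-write pattern that modules like write-file-atomic implement (a hypothetical `atomicWrite` helper under my assumptions, not the module's actual code): resolve the target path, write the data to a temporary file, flush it to disk, then rename it into place.

```js
// Minimal sketch of the conventional atomic-write sequence (hypothetical
// helper, not write-file-atomic's actual code).
const fs = require('fs').promises

async function atomicWrite (target, data) {
  // fs.realpath() resolves symlinks in the target path; this is the call
  // whose internal LOOP shows up in the flamegraphs discussed below.
  // Fall back to the original path if the target does not exist yet.
  const resolved = await fs.realpath(target).catch(() => target)
  const tmp = `${resolved}.${process.pid}.${Date.now()}.tmp`
  const handle = await fs.open(tmp, 'w')
  try {
    await handle.writeFile(data)
    await handle.sync() // flush so a crash cannot leave a missing or partial file
  } finally {
    await handle.close()
  }
  await fs.rename(tmp, resolved) // rename() atomically replaces the destination
}
```

Every step in this sequence is a separate asynchronous file system operation, which is where the per-write overhead described below comes from.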

On the synchronous side, write-file-atomic uses fs.realpath(). Inside fs.realpath(), the internal LOOP function shows up in our flamegraphs because it has to walk down the path and resolve every link (see the attached flamegraph screenshot). This function alone can account for 5-10% of the total execution time.

On the asynchronous side, write-file-atomic schedules at least 5 asynchronous file system operations and 11 microtask queue operations per write. Moreover, it performs an lstat for every component of the path, plus a corresponding process.nextTick for each. For a folder at depth 6, this grows to 10 file system operations, 11 microtask queue operations, and 10 process.nextTick calls.
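To make the dependence on path depth concrete, here is a rough illustration (a hypothetical `walkPath` helper assuming POSIX paths, not Node's actual fs.realpath() implementation) of why component-by-component resolution costs at least one file system operation per directory level:

```js
// Rough illustration of realpath-style resolution: each path component is
// lstat'ed and, if it is a symlink, read and re-resolved, so the number of
// async fs operations grows with the depth of the target path.
const fs = require('fs').promises
const path = require('path')

async function walkPath (p) {
  const parts = path.resolve(p).split(path.sep).filter(Boolean)
  let current = path.sep
  let operations = 0
  for (const part of parts) {
    current = path.join(current, part)
    const stats = await fs.lstat(current) // one async operation per component
    operations++
    if (stats.isSymbolicLink()) {
      current = path.resolve(path.dirname(current), await fs.readlink(current))
      operations++ // readlink adds another operation per symlink
    }
  }
  return { resolved: current, operations }
}
```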

js-ipfs writes a lot of small files to disk, and all of that activity adds up. Note that files are not written in parallel but in sequence, so all of this work sits on the hot path and increases the overall latency of adding and receiving a file. Some of those fs operations can take up to 20ms to complete: when transferring big files, this immediately becomes the major bottleneck of ipfs. As an example, adding a 64MB file with a chunk size of 262144 bytes (i.e. 256 chunks) into a folder five levels below the root will cause 256 * 5 * 5 = 6400 asynchronous operations, plus 256 * 20 = 5120 scheduling (Promise + nextTick) operations.

https://upload.clinicjs.org/public/c9d1a87e0ac4c3e3ea7330016f069d9c346ea12f3c97c673eded56aad54311ae/72590.clinic-bubbleprof.html#x11-c194

(screenshot: Bubbleprof diagram, 2018-12-17)

To solve this issue, I recommend:

  1. clarify the algorithm for “atomic writes” with the Go team, ideally by reviewing their code.
  2. write a new tiny module for doing atomic writes in the context of ipfs, and replace write-file-atomic (a minimal sketch of what this could look like follows this list).
  3. investigate whether the files could be processed in parallel instead of serially, ideally with a sliding-window approach.
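On point 2, here is a minimal sketch of what a tiny, ipfs-specific atomic-write module could look like (a hypothetical `fastAtomicWrite` helper under my assumptions, not the code that eventually shipped): it keeps the temp-file + fsync + rename guarantee but drops the fs.realpath() call and the per-component lstat walk, since the datastore controls its own paths and does not need symlink resolution.

```js
// Hypothetical sketch for recommendation 2: same crash-safety guarantee as
// the sequence shown earlier, minus the fs.realpath() step and the
// per-component lstat walk.
const fs = require('fs').promises
const crypto = require('crypto')

async function fastAtomicWrite (target, data) {
  const tmp = `${target}.${crypto.randomBytes(6).toString('hex')}.tmp`
  const handle = await fs.open(tmp, 'w')
  try {
    await handle.writeFile(data)
    await handle.sync() // keep the flush so a crash cannot leave a missing file
  } finally {
    await handle.close()
  }
  await fs.rename(tmp, target) // atomic replace of the destination
}
```

In this sketch each write costs a fixed five file system operations (open, write, fsync, close, rename), independent of the depth of the target path.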
@mcollina changed the title from "major bottleneck in write-file-atomic" to "major bottleneck in the usage of write-file-atomic" on Dec 18, 2018
@mcollina
Author

This bubbleprof diagram shows another example of the same issue on the local-transfer benchmark:

https://upload.clinicjs.org/public/1c86eeec6c2c4ee47e0d26d06cec62ee1fe74c745c610d5ba0a11bbe985649aa/31518.clinic-bubbleprof.html

@mcollina
Author

Question asked: ipfs/kubo#5863.

@mcollina
Author

Here is my proof of concept for a faster atomic-write logic: ipfs/js-datastore-fs#21.

@mcollina
Author

I've done some more analysis with Bubbleprof on ipfs/js-datastore-fs#21, ipfs-inactive/js-ipfs-unixfs-importer#9, and ipfs-inactive/js-ipfs-unixfs-importer#10 combined. You can see that the amount of scheduled asynchronous activity drops from 34 thousand asynchronous operations to 22 thousand.

without write-file-atomic

(screenshot: Bubbleprof diagram, 2018-12-21)

with fast-write-atomic

(screenshot: Bubbleprof diagram, 2018-12-21)

@alanshaw added the status/in-progress (In progress), exp/expert (Having worked on the specific codebase is important), and P2 (Medium: Good to have, but can wait until someone steps up) labels on Jan 18, 2019
@alanshaw
Member

Closing as this issue has been resolved by ipfs/js-datastore-fs#21.

@ghost removed the status/in-progress (In progress) label on Jan 31, 2019