fs: system: optimize reflink #234

Merged
merged 1 commit into from
Dec 6, 2023

efiop (Contributor) commented Dec 6, 2023

10,000 reflinks now take about 1 s instead of 2.7 s.

Related https://github.com/iterative/dql/pull/1007
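
A speedup like the one described above is what you would expect when per-call setup is replaced by a cached lookup. The following is only a rough sketch of that effect, not the benchmark used in this PR: `_setup`, `uncached`, `cached` and `compare` are hypothetical names, and `_setup` is a stand-in for expensive work such as loading a library.

```python
import functools
import timeit


def _setup():
    # Hypothetical stand-in for costly per-call setup (e.g. loading a library).
    return sum(range(1000))


def uncached():
    # Repeats the setup on every call.
    return _setup()


@functools.lru_cache(maxsize=1)
def cached():
    # Runs the setup on the first call only; later calls hit the cache.
    return _setup()


def compare(n=10_000):
    """Return (uncached_seconds, cached_seconds) for n calls each."""
    slow = timeit.timeit(uncached, number=n)
    fast = timeit.timeit(cached, number=n)
    return slow, fast


if __name__ == "__main__":
    slow, fast = compare()
    print(f"uncached: {slow:.4f}s  cached: {fast:.4f}s")
```

Absolute timings depend on the machine, so the sketch prints them rather than asserting specific values.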

@efiop efiop added the performance improvement over resource / time consuming tasks label Dec 6, 2023
@efiop efiop self-assigned this Dec 6, 2023
@efiop efiop force-pushed the reflink branch 3 times, most recently from adc9a34 to b7d8a2a Compare December 6, 2023 00:41
codecov-commenter commented Dec 6, 2023

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (8740236) 64.39% compared to head (ca71102) 64.53%.

| Files | Patch % | Lines |
|---|---|---|
| tests/benchmarks/test_fs.py | 60.00% | 2 Missing ⚠️ |
| src/dvc_objects/fs/system.py | 88.88% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #234      +/-   ##
==========================================
+ Coverage   64.39%   64.53%   +0.13%     
==========================================
  Files          27       27              
  Lines        2039     2047       +8     
  Branches      323      324       +1     
==========================================
+ Hits         1313     1321       +8     
  Misses        666      666              
  Partials       60       60              


@efiop efiop merged commit dc95f2e into iterative:main Dec 6, 2023
13 checks passed
def _reflink_darwin(src: "AnyFSPath", dst: "AnyFSPath") -> None:
    import ctypes

    clonefile = _clonefile()
Member

Can we move argtypes and restype up too?

Contributor Author

totally, my oversight



def _reflink_darwin(src: "AnyFSPath", dst: "AnyFSPath") -> None:
    import ctypes
Member

Consider moving this outside.

@@ -36,7 +37,8 @@ def symlink(source: "AnyFSPath", link_name: "AnyFSPath") -> None:
     os.symlink(source, link_name)


-def _reflink_darwin(src: "AnyFSPath", dst: "AnyFSPath") -> None:
+@functools.lru_cache(maxsize=1)
+def _clonefile():
Member

Is it worth running a logger.debug? Is that expected?

Contributor Author

I would rather know that we are falling back there, so yeah. This runs once in the worst case, so there's not much reason to remove it.

Member

yeah, that's true.

Labels
performance improvement over resource / time consuming tasks
3 participants