Feat/file #54

bozhang-hpc · 2024-09-12T19:58:24Z

This PR adds a swap space that dspaces uses when staging memory is about to run out.

Users can specify the staging memory quota in the dspaces_conf.toml file. When put_rpc() on the server side is about to allocate the buffer for a bulk data transfer, it checks whether the memory quota has been reached and, if so, starts writing staged data objects to an HDF5 file.
The policy for choosing which staged data object is popped can also be specified in the dspaces_conf.toml file. Currently, we support FIFO (default), LIFO, and LRU.
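
As a rough illustration of how the three policies differ, here is a minimal sketch in C. The names (swap_policy, staged_obj, pick_victim) and the bookkeeping fields are assumptions made for this example and are not taken from the PR's code; the actual implementation may track ordering differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not the PR's actual types. */
enum swap_policy { SWAP_FIFO, SWAP_LIFO, SWAP_LRU };

struct staged_obj {
    uint64_t insert_seq;  /* assumed: increases with every put */
    uint64_t last_access; /* assumed: updated on every get */
    size_t size;          /* bytes held in staging memory */
};

/* Return the index of the object to swap out, or -1 if there is none. */
int pick_victim(const struct staged_obj *objs, int n, enum swap_policy policy)
{
    assert(objs != NULL || n <= 0);
    if(n <= 0) {
        return -1;
    }

    int victim = 0;
    for(int i = 1; i < n; i++) {
        switch(policy) {
        case SWAP_FIFO: /* oldest insertion goes first */
            if(objs[i].insert_seq < objs[victim].insert_seq) victim = i;
            break;
        case SWAP_LIFO: /* newest insertion goes first */
            if(objs[i].insert_seq > objs[victim].insert_seq) victim = i;
            break;
        case SWAP_LRU: /* least recently accessed goes first */
            if(objs[i].last_access < objs[victim].last_access) victim = i;
            break;
        }
    }
    return victim;
}
```

A linear scan keeps the sketch short; a real server would more likely keep the objects in a list ordered for the chosen policy so that the victim is always at one end.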

When reading data, that is, when get_rpc() is called on the server side, the server first checks whether the queried data object is in staging memory. If it is not found, the server tries to read the data back from the HDF5 file into staging memory. If the newly allocated read buffer would itself exceed the staging memory quota, the server first pops objects from staging memory according to the user-specified policy until it has enough memory to read the data back.
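
The "pop until the read fits" step can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: the staging struct, its fields, and make_room() are hypothetical, and the resident objects are assumed to be already ordered by the configured policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for illustration only. */
struct staging {
    uint64_t quota;        /* staging memory quota in bytes */
    uint64_t used;         /* bytes currently resident */
    const uint64_t *sizes; /* resident object sizes, in eviction order */
    size_t count;          /* number of entries in sizes */
    size_t next;           /* index of the next object to evict */
};

/*
 * Evict objects until `need` more bytes fit under the quota.
 * Returns the number of objects evicted, or -1 if the request can never fit.
 */
int make_room(struct staging *s, uint64_t need)
{
    assert(s != NULL);
    if(need > s->quota) {
        return -1;
    }

    int evicted = 0;
    while(s->used + need > s->quota) {
        if(s->next >= s->count) {
            return -1; /* nothing left to evict */
        }
        /* A real server would write the object to the swap file here. */
        s->used -= s->sizes[s->next++];
        evicted++;
    }
    return evicted;
}
```

The caller would then allocate the read buffer and add `need` to `used`; that step is left out to keep the sketch focused on the eviction loop.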


philip-davis · 2024-09-19T12:03:24Z

@bozhang-hpc Thanks for creating this feature. I think most of this makes good sense. I left review comments. A summary and some additional items:

There are conflicts that prevent merging with the main repo.
Tests are failing, possibly due to issues that a merge would resolve.
Could you please add tests so we know swapping still works when we modify the code in the future?
Make this feature conditionally enabled, depending on the availability of HDF5. I don't think we should be unconditionally dependent on HDF5.
There should be a mode for a virtual quota, incremented and decremented by local storage operations. It's great that we look at /proc/meminfo, but there are reasons for the user to do otherwise.
I identified a couple of concurrency issues. The critical section for swapping should be bigger. Also, there may be an issue with concurrent HDF5 file access (not sure).
A couple of code cleanup items are noted in the review.
Make sure to validate and regularize code formatting with clang-format. Oddly, the CI job on the website passed, but it fails when I run it locally with act.


philip-davis · 2024-09-19T10:40:52Z

src/file_storage/policy.c

+{
+    meminfo_t meminfo = parse_meminfo();
+
+    if(meminfo.MemAvailableMiB  < (size_MB + meminfo.MemTotalMiB - threshold))


There should be a mode to tally this synthetically (i.e. increment in ls_add and decrement in ls_free and act when that tally passes some threshold). A user might (and probably will) want to ration server ranks in order to maintain a certain headroom per process.

bozhang-hpc · 2024-09-26

Yes, we can definitely add a memory usage tracker for ls_add() and ls_remove(). The problem, and the reason I didn't implement it this way, is that we don't actually allocate a new memory region when we add an od to ls (although ls_remove() really does release the memory). Instead, we allocate the memory for the od before the bulk transfer, assign od->data as the bulk transfer destination, and then attach the od's pointer to the ls. So if we want to track memory usage to prevent overflow, we might already have hit a segfault before the check in ls_add().

philip-davis · 2024-09-26

We can address the order of operations by keeping a memory tally for local storage and providing setters and getters for it. In reality, running out of physical memory won't segfault; the system will start swapping into virtual memory.
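
One way to read the tally suggestion is sketched below. It is a minimal sketch under assumptions: the ls_mem_* names are hypothetical and do not exist in the PR, and C11 atomics are used here only so that concurrent handlers cannot over-commit the quota. The reservation is made before od->data is allocated, which sidesteps the ordering concern raised above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tally for local storage; names are illustrative only. */
static _Atomic uint64_t ls_mem_used; /* bytes currently charged */
static uint64_t ls_mem_quota;        /* set once at server start-up */

void ls_mem_set_quota(uint64_t bytes) { ls_mem_quota = bytes; }

uint64_t ls_mem_get_used(void) { return atomic_load(&ls_mem_used); }

/*
 * Charge `bytes` against the quota before allocating the buffer.
 * Returns false when the caller must swap objects out first.
 */
bool ls_mem_reserve(uint64_t bytes)
{
    uint64_t cur = atomic_load(&ls_mem_used);
    do {
        /* Written to avoid overflow in cur + bytes. */
        if(bytes > ls_mem_quota || cur > ls_mem_quota - bytes) {
            return false;
        }
        /* On failure, cur is refreshed and the check is repeated. */
    } while(!atomic_compare_exchange_weak(&ls_mem_used, &cur, cur + bytes));
    return true;
}

/* Return `bytes` to the tally when an object is freed or swapped out. */
void ls_mem_release(uint64_t bytes)
{
    uint64_t prev = atomic_fetch_sub(&ls_mem_used, bytes);
    assert(prev >= bytes); /* releasing more than was reserved is a bug */
    (void)prev;
}
```

With this shape, the put path would call ls_mem_reserve() before allocating, swap objects out and retry when it returns false, and call ls_mem_release() wherever an object's buffer is freed.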


philip-davis · 2024-09-28T04:29:29Z

CMakeLists.txt

+endif()
+
+if(HDF5_FOUND OR NetCDF_FOUND)
+    option(FILEBACKEND_FOUND "Option to Enable File Swap on DataSpaces Server" ON)


You might want set() here instead of option().

https://cmake.org/cmake/help/latest/command/option.html


bozhang-hpc added 14 commits August 30, 2024 17:55

add hdf5 backend for data swap

ecbd7a1

add swap policy and swap logic to put() & get()

fd4e8fa

fix comments

a151dd9

add toml default for swap space; add simple file debug logic

f05bf48

Merge branch 'master' into feat/file

2ae4152

fix conf header

3e3c3a5

add swap config

90ac132

add directory operations

e66c849

change default swap dir path

21cc604

fix creating hdf5 files in a non-existing directory; fix no dataset writing for who creates the hdf5 file

44295a2

fix write_od arguments; add rm swap dir at server finalization

33aeacd

fix bbox init in read_od; fix hdf5 dataset name

002675b

fix dspaces DATATYPE translation in get()

8f2312b

add error info for empty swap list; rm hdf5 debug codes

3921641

philip-davis requested changes Sep 19, 2024


bozhang-hpc added 9 commits September 27, 2024 14:10

move all the bbox functions for swap to bbox.h & bbox.c

26cb0c6

fix the curl find_package(); conditionally required HDF5 & NetCDF; add FILEBACKEND option for swap

ca59ef7

conditionally enable file swap & HDF5

68dac37

refactor the swap conf settings

af34945

add an abstraction for file swap backend; conditionally include file swap headers; add find_package for NetCDF

6748f5a

switch memory value type to an enum

3a959c8

fix the potential fatal exit(); use uint64_t for memory values

e79e1b2

add a default file backend setting into the swap_conf and adpat all other functions to it

ed4e692

remove swap dir cleanup

d84a86e

philip-davis reviewed Sep 28, 2024


bozhang-hpc added 4 commits September 29, 2024 16:00

fix conditional compilation for od swap in put()/get(); make the file backend not user-specified

407a6b9

fix od allocation when there is no file backend

e72bc53

more fix on conditional compilation

caf0eee

add set_default_swap() to non toml conf setup

ab2df4a

fix the declaration order

d8a8703
