-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
In memory mode #4012
In memory mode #4012
Conversation
abdccb1
to
ea4bd38
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #4012 +/- ##
==========================================
- Coverage 85.16% 85.11% -0.05%
==========================================
Files 1297 1297
Lines 50536 50597 +61
Branches 6956 6977 +21
==========================================
+ Hits 43037 43067 +30
- Misses 7368 7388 +20
- Partials 131 142 +11 ☔ View full report in Codecov by Sentry. |
Benchmark ResultMaster commit hash:
|
Benchmark ResultMaster commit hash:
|
Benchmark ResultMaster commit hash:
|
@@ -33,6 +34,11 @@ class DBFileUtils { | |||
public: | |||
constexpr static common::page_idx_t NULL_PAGE_IDX = common::INVALID_PAGE_IDX; | |||
|
|||
static uint8_t* pinPage(BMFileHandle& fileHandle, common::page_idx_t pageIdx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little concerned that this adds new ways of doing pinning/optimistic reads without making it obvious what the differences are at a glance.
Maybe BufferManager::pin
/BufferManager::optimisticRead
should be private, or maybe we shouldn't even be passing around the BufferManager at all for these file operations. The BMFileHandle already stores a pointer to the BufferManager, so we could have a BMFileHandle::pin(page_idx_t, PageReadPolicy)
function instead that handles the memory mode. While that doesn't really make the differences any clearer, it at least would make the BMFileHandle functions the more obvious choice, and we could remove the BufferManager arguments from DBFileUtils and avoid passing it around as much and make the BufferManager more of an internal detail that you shouldn't need to interact with when working with file handles (though #3743 will still mean that the MemoryManager needs to get passed around, but with this removed maybe we can even remove MemoryManager::getBufferManager
).
const auto frame = dataFH->getBM()->getFrame(*dataFH, startPageIdx); | ||
memcpy(frame, buffer, bufferSize); | ||
} else { | ||
dataFH->getFileInfo()->writeFile(buffer, bufferSize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should move away from exposing the FileInfo through the file handle and add a write function that more or less is equivalent to this block (i.e. memcpy in in-memory-mode and use the FileInfo otherwise).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Separating this to another PR for refactoring.
Benchmark ResultMaster commit hash:
|
@@ -32,63 +32,59 @@ class PageState { | |||
|
|||
PageState() { stateAndVersion.store(EVICTED << NUM_BITS_TO_SHIFT_FOR_STATE); } | |||
|
|||
inline uint64_t getState() const { return getState(stateAndVersion.load()); } | |||
inline static uint64_t getState(uint64_t stateAndVersion) { | |||
uint64_t getState() const { return getState(stateAndVersion.load()); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably align this uint64_t
with a typedef
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do this in a separate refactoring PR on BMFileHandle along with this comment #4012 (comment).
class WALReplayer; | ||
class WAL { | ||
friend class WALReplayer; | ||
|
||
public: | ||
WAL(const std::string& directory, bool readOnly, BufferManager& bufferManager, | ||
common::VirtualFileSystem* vfs, main::ClientContext* context); | ||
WAL(const std::string& directory, bool readOnly, common::VirtualFileSystem* vfs, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if u are passing around clientContext, u probably don't need to pass around vfs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is something I need to discuss with @acquamarin regarding to how vfs should make use of ClientContext. We are passing both in several different places, which should refactored altogether if possible.
@@ -118,7 +119,7 @@ void Database::addExtensionOption(std::string name, LogicalTypeID type, Value de | |||
extensionOptions->addExtensionOption(name, type, std::move(defaultValue)); | |||
} | |||
|
|||
ExtensionOption* Database::getExtensionOption(std::string name) { | |||
ExtensionOption* Database::getExtensionOption(std::string name) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
* draft: in memory mode * tools for in mem * fix * wip: adapt tests for in mem * skip reloaddb for in mem db; fixes; add SKIP_IN_MEM to testing framework * clean tests * add clang in mem ci workflow * Run clang-format * update tests; rework DBConfig::isDBPathInMemory * update api tests * update tests * bump extension version * add kuzu api * add ku_destroy in c api test * update tests * rework BMFileHandle interfaces * fix clang tidy; skip force checkpoint test for in mem mode * Run clang-format * fix clang tidy --------- Co-authored-by: CI Bot <[email protected]> (cherry picked from commit 6ab3e4b)
Description
Add the support for in-memory mode. #1816.
Feature
When the database path is omitted, set to empty or set to
:memory:
(follow duck's convention here), the database will be open under in-memory mode.The main differences between on-disk and in-memory mode are:
Under on-disk mode, all data will be persistent on disk. All transactions are logged in WAL, in which changes will be merged into database files during checkpoint.
While under the in-memory mode, there is no writes to WAL, no data is persistent to disk and
CHECKPOINT
will do nothing. All data are lost when the process finishes.Restrictions on in-memory mode:
Implementation
After MVCC changes, supporting in-mem mode becomes much easier.
New pages in BMFileHandle are pinned in BM and only unpinned when BMFileHandle gets destructed. Accesses to these pages directly grab frames from BM.
Benchmark
Did a few simple benchmarks on ldbc-100.
COPY
Read only queries
Insertions
1M node insertions.
TODO