implement databox serialization and de-serialization
jonahm-LANL committed Aug 12, 2024
1 parent 3519016 commit 06cf9d1
Showing 3 changed files with 198 additions and 16 deletions.
74 changes: 69 additions & 5 deletions doc/sphinx/src/databox.rst
@@ -107,7 +107,7 @@ yourself. For example:
You can also resize a ``DataBox``, which you can use to modify a
``DataBox`` in-place. For example:

.. code-block::
.. code-block:: cpp
Spiner::DataBox<double> db; // empty
// clears old memory, resizes the underlying array,
@@ -124,7 +124,7 @@ If you want to change the stride without changing the underlying data,
you can use ``reshape``, which modifies the dimensions of the
array, without modifying the underlying memory. For example:

.. code-block::
.. code-block:: cpp
// allocate a 1D databox
Spiner::DataBox<double> db(nx3*nx2*nx1);
@@ -170,7 +170,7 @@ Semantics and Memory Management
``DataBox`` has reference semantics---meaning that copying a
``DataBox`` does not copy the underlying data. In other words,

.. code-block::
.. code-block:: cpp
Spiner::DataBox<double> db1(size);
Spiner::DataBox<double> db2 = db1;
@@ -230,7 +230,7 @@ call ``free`` for you, so long as you use them with a custom
deleter. Spiner provides the following deleter for use in this
scenario:

.. code-block::
.. code-block:: cpp
struct DBDeleter {
template <typename T>
@@ -242,7 +242,7 @@ scenario:
It can be used, for example, with a ``std::unique_ptr`` via:

.. code-block::
.. code-block:: cpp
// needed for smart pointers
#include <memory>
@@ -259,6 +259,70 @@ It can be used, for example, with a ``std::unique_ptr`` via:
// when you leave scope, the data box will be freed.
Serialization and de-serialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Shared memory models, such as `MPI Windows`_, require memory to be
allocated through an external API call (e.g.,
``MPI_Win_allocate_shared``), and tabulated data must then be written
into that memory. ``Spiner`` supports this model through
**serialization** and **de-serialization**. The relevant methods are
as follows. The function

.. cpp:function:: std::size_t DataBox::serializedSizeInBytes() const;

reports how many bytes must be externally allocated to hold a
serialized ``DataBox`` object. The function

.. cpp:function:: std::size_t DataBox::serialize(char *dst) const;

takes a ``char*`` pointer, assumed to point to at least
``serializedSizeInBytes()`` bytes, and stores all information needed
for the ``DataBox`` to reconstruct itself. The return value is the
number of bytes written to ``dst``. This method is non-destructive;
the original ``DataBox`` is unchanged. Serialization is only supported
for data in host memory. The function

.. cpp:function:: std::size_t DataBox::deSerialize(char *src);

initializes a ``DataBox`` to match the serialized ``DataBox`` stored
at the ``src`` pointer and returns the number of bytes read.

.. note::

  The de-serialized ``DataBox`` has **unmanaged** memory: its data
  points into the ``src`` buffer, which is assumed to manage that
  memory. Therefore, you **cannot** ``free`` the ``src`` pointer
  until you are finished with the de-serialized ``DataBox``.
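
The ownership rule in the note can be illustrated without Spiner. The following is a minimal sketch, assuming a hypothetical ``ToyBox`` type that mimics the header-plus-payload layout; ``ToyBox`` and ``viewAliasesBuffer`` are illustrative names, not part of Spiner. It shows that a de-serialized object reads its data directly from the buffer, which is why the buffer must outlive it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a DataBox: a small header plus a payload pointer.
struct ToyBox {
  std::size_t n = 0;      // number of elements in the payload
  double *data = nullptr; // payload; borrowed, never freed by ToyBox

  std::size_t sizeBytes() const { return n * sizeof(double); }
  std::size_t serializedSizeInBytes() const {
    return sizeof(*this) + sizeBytes();
  }

  // Copy the header, then the payload, into dst. Returns bytes written.
  std::size_t serialize(char *dst) const {
    std::memcpy(dst, this, sizeof(*this));
    std::size_t offst = sizeof(*this);
    if (sizeBytes() > 0) {
      std::memcpy(dst + offst, data, sizeBytes());
      offst += sizeBytes();
    }
    return offst;
  }

  // Copy the header out of src, then point data INTO src (no payload copy).
  std::size_t deSerialize(char *src) {
    std::memcpy(this, src, sizeof(*this));
    std::size_t offst = sizeof(*this);
    if (sizeBytes() > 0) {
      data = reinterpret_cast<double *>(src + offst);
      offst += sizeBytes();
    }
    return offst;
  }
};

// Returns true if the de-serialized box observes a change made to the buffer
// while the original box does not.
bool viewAliasesBuffer() {
  std::vector<double> values = {1.0, 2.0, 3.0};
  ToyBox original;
  original.n = values.size();
  original.data = values.data();

  // Stand-in for externally allocated memory.
  std::vector<char> buffer(original.serializedSizeInBytes());
  original.serialize(buffer.data());

  ToyBox view;
  view.deSerialize(buffer.data());

  // Overwrite payload element 1 directly in the buffer.
  const double replacement = 42.0;
  std::memcpy(buffer.data() + sizeof(ToyBox) + sizeof(double), &replacement,
              sizeof(double));

  return view.data[1] == 42.0 && original.data[1] == 2.0;
}
```

Because ``view.data`` points into ``buffer``, releasing the buffer first would leave ``view`` dangling. The sketch also assumes the buffer is suitably aligned for ``double`` past the header.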

Putting this all together, an application of
serialization/de-serialization might look like this:

.. code-block:: cpp

  // load a databox from, e.g., file
  Spiner::DataBox<double> db;
  db.loadHDF(filename);
  // get size of databox
  std::size_t allocate_size = db.serializedSizeInBytes();
  // Allocate the memory for the new databox.
  // In practice this would be an API call for, e.g., shared memory
  char *memory = (char*)malloc(allocate_size);
  // serialize the old databox
  std::size_t write_size = db.serialize(memory);
  // make a new databox and de-serialize it
  Spiner::DataBox<double> db2;
  std::size_t read_size = db2.deSerialize(memory);
  // read_size, write_size, and allocate_size should all be the same.
  assert((read_size == write_size) && (write_size == allocate_size));

.. _`MPI Windows`: https://www.mpi-forum.org/docs/mpi-4.1/mpi41-report/node311.htm
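
Since ``serialize`` and ``deSerialize`` both return a byte count, one possible use (not stated in the source, so treat it as an assumption) is to pack several objects back to back in a single allocation. The sketch below uses a hypothetical ``ToyBox`` stand-in with the same three methods. ``ToyBox``, ``packAll``, and ``unpackAll`` are illustrative names, not Spiner API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in exposing the same three methods as DataBox.
struct ToyBox {
  std::size_t n; // number of elements
  double *data;  // borrowed payload pointer

  std::size_t sizeBytes() const { return n * sizeof(double); }
  std::size_t serializedSizeInBytes() const {
    return sizeof(*this) + sizeBytes();
  }
  std::size_t serialize(char *dst) const {
    std::memcpy(dst, this, sizeof(*this));
    if (sizeBytes() > 0) std::memcpy(dst + sizeof(*this), data, sizeBytes());
    return sizeof(*this) + sizeBytes();
  }
  std::size_t deSerialize(char *src) {
    std::memcpy(this, src, sizeof(*this));
    if (sizeBytes() > 0) data = reinterpret_cast<double *>(src + sizeof(*this));
    return sizeof(*this) + sizeBytes();
  }
};

// Serialize every box back to back into one buffer whose size is the sum
// of the individual serialized sizes.
std::vector<char> packAll(const std::vector<ToyBox> &boxes) {
  std::size_t total = 0;
  for (const ToyBox &b : boxes) total += b.serializedSizeInBytes();
  std::vector<char> buffer(total);
  std::size_t offst = 0;
  for (const ToyBox &b : boxes) offst += b.serialize(buffer.data() + offst);
  assert(offst == total);
  return buffer;
}

// Rebuild `count` boxes as views into `buffer`, advancing by the number of
// bytes each deSerialize call reports.
std::vector<ToyBox> unpackAll(char *buffer, std::size_t count) {
  std::vector<ToyBox> views(count);
  std::size_t offst = 0;
  for (ToyBox &v : views) offst += v.deSerialize(buffer + offst);
  return views;
}
```

The sketch relies on the header and payload sizes keeping every payload aligned for ``double``. A real implementation would need to confirm that for its own types, and the views remain valid only while the packed buffer is alive.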

Accessing Elements of a ``DataBox``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

40 changes: 40 additions & 0 deletions spiner/databox.hpp
@@ -305,6 +305,46 @@ class DataBox {
return indices_[i];
}

  // serialization routines
  // ------------------------------------
  // this one reports size for serialize/deserialize
  std::size_t serializedSizeInBytes() const {
    return sizeBytes() + sizeof(*this);
  }
  // this one takes the pointer `dst`, which is assumed to have
  // sufficient memory allocated, and fills it with the
  // databox. Return value is the number of bytes written.
  std::size_t serialize(char *dst) const {
    PORTABLE_REQUIRE(status_ != DataStatus::AllocatedDevice,
                     "Serialization cannot be performed on device memory");
    memcpy(dst, this, sizeof(*this));
    std::size_t offst = sizeof(*this);
    if (sizeBytes() > 0) { // could also do data_ != nullptr
      memcpy(dst + offst, data_, sizeBytes());
      offst += sizeBytes();
    }
    return offst;
  }
  // This one takes a src pointer, which is assumed to contain a
  // databox and initializes the current databox. Note that the
  // databox becomes unmanaged, as the contents of the box still live
  // in the externally managed buffer.
  std::size_t deSerialize(char *src) {
    PORTABLE_REQUIRE(
        (status_ == DataStatus::Empty || status_ == DataStatus::Unmanaged),
        "Must not de-serialize into an active databox.");
    memcpy(this, src, sizeof(*this));
    std::size_t offst = sizeof(*this);
    // now sizeBytes is well defined after copying the "header" of the source.
    if (sizeBytes() > 0) { // could also do data_ != nullptr
      // cast to T, not double: DataBox is templated on its value type
      data_ = (T *)(src + offst);
      status_ = DataStatus::Unmanaged;
      offst += sizeBytes();
    }
    return offst;
  }
  // ------------------------------------

DataBox<T, Grid_t, Concept>
getOnDevice() const { // getOnDevice is always a deep copy
if (size() == 0 ||
100 changes: 89 additions & 11 deletions test/test.cpp
@@ -53,8 +53,8 @@ PORTABLE_INLINE_FUNCTION Real linearFunction(Real b, Real a, Real z, Real y,
return x + y + z + a + b;
}

TEST_CASE("PortableMDArrays can be allocated from a pointer",
"[PortableMDArray]") {
SCENARIO("PortableMDArrays can be allocated from a pointer",
"[PortableMDArray]") {
constexpr int N = 2;
constexpr int M = 3;
std::vector<int> data(N * M);
@@ -529,19 +529,22 @@ TEST_CASE("DataBox Interpolation with piecewise grids",

WHEN("We construct and fill a 3D DataBox based on this grid") {
constexpr int RANK = 3;
PiecewiseDB<NGRIDS> db(Spiner::AllocationTarget::Device, NCOARSE, NCOARSE,
NCOARSE);
PiecewiseDB<NGRIDS> dbh(Spiner::AllocationTarget::Host, NCOARSE, NCOARSE,
NCOARSE);
for (int i = 0; i < RANK; ++i) {
db.setRange(i, g);
dbh.setRange(i, g);
}
portableFor(
"Fill 3D Databox", 0, NCOARSE, 0, NCOARSE, 0, NCOARSE,
PORTABLE_LAMBDA(const int iz, const int iy, const int ix) {
for (int iz = 0; iz < NCOARSE; ++iz) {
for (int iy = 0; iy < NCOARSE; ++iy) {
for (int ix = 0; ix < NCOARSE; ++ix) {
Real x = g.x(ix);
Real y = g.x(iy);
Real z = g.x(iz);
db(iz, iy, ix) = linearFunction(z, y, x);
});
dbh(iz, iy, ix) = linearFunction(z, y, x);
}
}
}
auto db = dbh.getOnDevice();

THEN("We can interpolate it to a finer grid and get the right answer") {
Real error = 0;
@@ -561,8 +564,83 @@ TEST_CASE("DataBox Interpolation with piecewise grids",
error);
REQUIRE(error <= EPSTEST);
}

// cleanup
free(db);
free(dbh);
}
}
}

SCENARIO("Serializing and deserializing a DataBox",
         "[DataBox][PiecewiseGrid1D][Serialize]") {
  GIVEN("A piecewise grid") {
    constexpr int NGRIDS = 2;
    constexpr Real xmin = 0;
    constexpr Real xmax = 1;

    RegularGrid1D g1(xmin, 0.35 * (xmax - xmin), 3);
    RegularGrid1D g2(0.35 * (xmax - xmin), xmax, 4);
    PiecewiseGrid1D<NGRIDS> g = {{g1, g2}};

    const int NCOARSE = g.nPoints();

    THEN("The piecewise grid contains a number of points equal the sum of "
         "the points of the individual grids") {
      REQUIRE(g.nPoints() == g1.nPoints() + g2.nPoints());
    }

    WHEN("We construct and fill a 3D DataBox based on this grid") {
      constexpr int RANK = 3;
      PiecewiseDB<NGRIDS> dbh(Spiner::AllocationTarget::Host, NCOARSE, NCOARSE,
                              NCOARSE);
      for (int i = 0; i < RANK; ++i) {
        dbh.setRange(i, g);
      }
      for (int iz = 0; iz < NCOARSE; ++iz) {
        for (int iy = 0; iy < NCOARSE; ++iy) {
          for (int ix = 0; ix < NCOARSE; ++ix) {
            Real x = g.x(ix);
            Real y = g.x(iy);
            Real z = g.x(iz);
            dbh(iz, iy, ix) = linearFunction(z, y, x);
          }
        }
      }
      WHEN("We serialize the DataBox") {
        std::size_t serial_size = dbh.serializedSizeInBytes();
        REQUIRE(serial_size == (sizeof(dbh) + dbh.sizeBytes()));

        char *db_serial = (char *)malloc(serial_size * sizeof(char));
        std::size_t write_offst = dbh.serialize(db_serial);
        REQUIRE(write_offst == serial_size);

        THEN("We can initialize a new databox based on the serialized one") {
          PiecewiseDB<NGRIDS> dbh2;
          std::size_t read_offst = dbh2.deSerialize(db_serial);
          REQUIRE(read_offst == write_offst);

          AND_THEN("The shape is correct") {
            REQUIRE(dbh2.rank() == dbh.rank());
            REQUIRE(dbh2.size() == dbh.size());
            for (int d = 1; d <= 3; ++d) {
              REQUIRE(dbh2.dim(d) == dbh.dim(d));
            }
          }

          AND_THEN("The contents are correct") {
            for (int i = 0; i < dbh.size(); ++i) {
              REQUIRE(dbh(i) == dbh2(i));
            }
          }
        }

        // cleanup
        free(db_serial);
      }

      // cleanup
      free(dbh);
    }
  }
}
@@ -702,7 +780,7 @@ SCENARIO("Using unique pointers to garbage collect DataBox",
}

#if SPINER_USE_HDF
TEST_CASE("PiecewiseGrid HDF5", "[PiecewiseGrid1D][HDF5]") {
SCENARIO("PiecewiseGrid HDF5", "[PiecewiseGrid1D][HDF5]") {
GIVEN("A piecewise grid") {
RegularGrid1D g1(0, 0.25, 3);
RegularGrid1D g2(0.25, 0.75, 11);