Keyspace Object Mapper (KOM) #147
Comments
The kind parameter allows different serialization/deserialization types. This can be implemented by creating a _dumps method, e.g. self._dumps = json.dumps or the like.
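A minimal sketch of how a kind parameter might bind the serializer pair at construction time. The class name Mapper, the kind strings, and the _loads counterpart are assumptions for illustration, not the keripy API. Only stdlib codecs are shown; MGPK or CBOR would need third-party packages and would slot into the same table.

```python
import json
import pickle


class Mapper:
    """Hypothetical mapper that binds its (dumps, loads) pair from kind."""

    def __init__(self, kind: str = "JSON"):
        # Table of supported kinds; each entry is (serialize, deserialize).
        codecs = {
            "JSON": (self._json_dumps, self._json_loads),
            "PICKLE": (pickle.dumps, pickle.loads),
        }
        if kind not in codecs:
            raise ValueError(f"Unsupported kind {kind!r}")
        self.kind = kind
        self._dumps, self._loads = codecs[kind]

    @staticmethod
    def _json_dumps(d: dict) -> bytes:
        # Encode to bytes because database values are stored as bytes.
        return json.dumps(d).encode("utf-8")

    @staticmethod
    def _json_loads(raw: bytes) -> dict:
        return json.loads(raw.decode("utf-8"))


jm = Mapper(kind="JSON")
pm = Mapper(kind="PICKLE")
sample = {"a": 1, "b": [2, 3]}
json_round = jm._loads(jm._dumps(sample))
pickle_round = pm._loads(pm._dumps(sample))
```

Callers never branch on kind after construction; they only call self._dumps and self._loads.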
The datify function in keri.help.helping converts a dict back into a dataclass instance:

def datify(cls, d):
    """
    Returns instance of dataclass cls converted from dict d

    Parameters:
        cls is dataclass class
        d is dict
    """
    try:
        fieldtypes = {f.name: f.type for f in dataclasses.fields(cls)}
        return cls(**{f: datify(fieldtypes[f], d[f]) for f in d})  # recursive
    except:
        return d  # Not a dataclass field

For example:

ps = helping.datify(PreSit, json.loads(bytes(rawsit).decode("utf-8")))

This provides the deserialization side that would be used by the Komer.get method.
The work item is to flesh out the .get and .del methods and to support JSON and MGPK, and maybe CBOR and Pickle.
I've started this.
I just pushed the skeleton code above to keripy. There were some subtleties in the Python code that I fixed that might have misled you. Anyway, the incomplete Komer object is in keri.base.basing and a basic test skeleton is in tests.base.test_basing. What is there passes the tests, so it shouldn't be misleading any more. What I wrote above was just a sketch.
Sophy does a couple of other things, like support for slicing, which could be added to the Komer object. Using Python dataclasses is a more powerful and expressive approach to schema than the approach Sophy used, which requires defining custom classes whose schema is not apparent anywhere, as it is with a dataclass definition. Dataclasses give us a lot of power down the road. For example, with dataclasses.make_dataclass() we could create a dataclass definition from a string that is parsed into tuples and then passed into make_dataclass(). We may never need to do that for our internal apps, but it would allow some declarative coding for customization down the road.
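A rough sketch of the make_dataclass() idea mentioned above. The spec string format ("name: type, ..."), the TYPES lookup table, and the schema_from_string helper are all hypothetical; only dataclasses.make_dataclass() and dataclasses.asdict() are real library calls.

```python
from dataclasses import asdict, make_dataclass

# Hypothetical whitelist mapping type names in the spec string to Python types.
TYPES = {"int": int, "str": str, "list": list, "bool": bool, "float": float}


def schema_from_string(name: str, spec: str) -> type:
    """Build a dataclass from a spec such as 'ridx: int, dt: str'."""
    pairs = []
    for part in spec.split(","):
        fname, tname = (s.strip() for s in part.split(":"))
        pairs.append((fname, TYPES[tname]))  # (field name, field type) tuple
    return make_dataclass(name, pairs)


Lot = schema_from_string("Lot", "ridx: int, kidx: int, dt: str")
lot = Lot(ridx=1, kidx=3, dt="2020-11-16")
lot_dict = asdict(lot)
```

Looking types up in a fixed table, rather than evaluating the string, keeps a declarative spec from executing arbitrary code.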
The JSON serializer could be made more compact by getting rid of white space and using UTF-8 instead of escaping non-ASCII characters.

>>> import json
>>> d = dict(a=1, b=2, c=3)
>>> json.dumps(d)
'{"a": 1, "b": 2, "c": 3}'
>>> json.dumps(d, separators=(",", ":"), ensure_ascii=False)
'{"a":1,"b":2,"c":3}'
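The effect of ensure_ascii=False only shows up with non-ASCII data. A small sketch with a made-up record, encoding the result to UTF-8 bytes as it would be stored:

```python
import json

d = {"name": "Zoë", "n": 1}  # hypothetical record with one non-ASCII character

default = json.dumps(d)  # escapes ë and pads separators with spaces
compact = json.dumps(d, separators=(",", ":"), ensure_ascii=False)

raw = compact.encode("utf-8")  # bytes as they would be written to the database
restored = json.loads(raw.decode("utf-8"))
```

Both forms decode to the same dict; the compact form is simply fewer bytes on disk.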
LMDB Keyspace Python Object Mapper (KOM)
The current keripy use of LMDB has been mostly limited to KELs for the KERI Core. However, in developing applications that sit on top of the KERI core, it is useful to have a more generic CRUD-like database interface. LMDB is a lexicographically ordered key-value store (arguably the most performant of this class). The Python wrapper over the LMDB C language implementation is very low level and does not provide any syntactic sugar to make it easy to map Python object instances to values in the database. This is a proposed design for a Python factory class that maps Python dataclass instances as serialized values to entries in the key space of a given LMDB database. Hence, for short, in this proposal we are calling it a Keyspace Object Mapper (KOM), or Key-value-store Object Mapper.
Python dataclasses as combined schema and instances for serializable data
Support for Python data classes was added in Python 3.7. Essentially a dataclass provides a convenient way of creating instances according to a data schema. The data schema uses the Python type hints in the class definition. This provides a very convenient and compact initialization method that does not require any external schema to declare the attribute types in the subclass. The subclass definition serves as both schema definition and attribute declaration.
For example keripy currently uses data classes in the keri.base.keeping module to define schema for serializing information about (public, private) key pairs for managing those keys in an LMDB key store. Here is the PubLot class definition:
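A sketch of what such a definition looks like. The field names, types, and defaults below are assumptions for illustration; consult keri.base.keeping for the actual PubLot definition.

```python
from dataclasses import dataclass, field


@dataclass
class PubLot:
    """Sketch of a lot of public keys; fields are assumed, not authoritative."""
    pubs: list = field(default_factory=list)  # public keys in this lot
    ridx: int = 0  # rotation index
    kidx: int = 0  # key index
    dt: str = ""   # ISO-8601 creation datetime


empty = PubLot()  # every field has a default, so no arguments are required
```

The type hints and defaults in the class body are the whole schema; no separate schema file is needed.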
The @dataclass decorator converts the class definition into a compliant class with an __init__ method, etc. It provides syntactic sugar that translates the field declarations in the class definition into a Python class with an __init__ method that creates an instance attribute for each declared field. The dataclass definition syntax is essentially a schema definition for members of that class.

To instantiate a PubLot just call the class with arguments corresponding to the defined fields. For example:
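A minimal instantiation sketch. The PubLot fields here are assumed for illustration and the key strings are placeholders, not real public keys.

```python
from dataclasses import dataclass, field


@dataclass
class PubLot:  # assumed fields; see keri.base.keeping for the real class
    pubs: list = field(default_factory=list)
    ridx: int = 0
    kidx: int = 0
    dt: str = ""


# Keyword arguments map one-to-one onto the declared fields.
lot = PubLot(pubs=["pubkey0", "pubkey1"], ridx=1, kidx=3,
             dt="2020-11-16T22:30:34+00:00")

# Omitted fields fall back to their declared defaults.
partial = PubLot(ridx=2)
```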
To serialize, use the dataclasses module function dataclasses.asdict() to convert the instance to a dictionary that may be serialized with JSON, MsgPack, or CBOR.
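A short sketch of that serialization path using JSON from the standard library. The PubLot fields are assumed for illustration; MsgPack or CBOR would take the same dict through a third-party encoder instead.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PubLot:  # assumed fields; see keri.base.keeping for the real class
    pubs: list = field(default_factory=list)
    ridx: int = 0
    kidx: int = 0
    dt: str = ""


lot = PubLot(pubs=["pubkey0"], ridx=1, kidx=3, dt="2020-11-16")

as_dict = asdict(lot)  # plain dict keyed by field name
raw = json.dumps(as_dict).encode("utf-8")  # bytes ready to store as a value
```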
Python data classes may have nested schema definitions. For example, the keri.base.keeping.PreSit dataclass has fields that are PubLot dataclass instances:
A PreSit instance may be initialized and serialized the same way.
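A sketch of the nested case, with a round trip back to dataclass instances. The PubLot and PreSit fields are assumed for illustration, and datify below is a local copy of the helper discussed in the comments, not an import from keripy.

```python
import dataclasses
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PubLot:  # assumed fields
    pubs: list = field(default_factory=list)
    ridx: int = 0
    kidx: int = 0
    dt: str = ""


@dataclass
class PreSit:  # assumed fields: each one is itself a PubLot
    old: PubLot = field(default_factory=PubLot)
    new: PubLot = field(default_factory=PubLot)
    nxt: PubLot = field(default_factory=PubLot)


def datify(cls, d):
    """Local copy: rebuild dataclass cls from dict d, recursing into fields."""
    try:
        fieldtypes = {f.name: f.type for f in dataclasses.fields(cls)}
        return cls(**{f: datify(fieldtypes[f], d[f]) for f in d})
    except Exception:
        return d  # cls is not a dataclass, so d is already a plain value


ps = PreSit(new=PubLot(pubs=["pubkey0"], ridx=1, kidx=3))
raw = json.dumps(asdict(ps)).encode("utf-8")  # asdict recurses into nested lots
back = datify(PreSit, json.loads(raw.decode("utf-8")))
```

Note that asdict() flattens the nested PubLot instances into nested dicts, which is why the recursive datify step is needed on the way back.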
KOM
A notional KOM class is a database factory that creates instances of CRUD database mappers. Each instance has a defined database schema for the entries in that database. The schema is expressed as a Python dataclass. The database mapper handles the serialization and deserialization of database instances and hides those details behind its CRUD-like protocol of methods, namely `.get`, `.put`, and `.del`.
Each KOM instance with a unique schema dataclass has its own LMDB sub-database, or subdb. This is a feature of LMDB environments that allows partitioning of the key space. Each LMDB environment defines a master database that includes its whole key space. Sub-databases are essentially prefixes to keys in the key space that partition the master database key space. This enables each sub-database to be treated like a table of a unique type.
For example:
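A minimal sketch of the mapper idea, under several stated assumptions. To stay runnable with only the standard library, a dict of dicts stands in for the LMDB environment and its named sub-databases; a real implementation would use the lmdb package instead. The Record schema, the subkey value, and the method signatures are hypothetical. The delete method is named rem here because del is a reserved word in Python and cannot be used as a method name.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Record:  # hypothetical flat schema for illustration
    name: str = ""
    count: int = 0


class Komer:
    """Sketch of a keyspace object mapper bound to one sub-database."""

    def __init__(self, env: dict, subkey: bytes, schema: type):
        self.schema = schema
        # Stand-in for opening a named LMDB sub-database: one dict per subkey.
        self.sdb = env.setdefault(subkey, {})

    def put(self, key: bytes, val) -> None:
        if not isinstance(val, self.schema):
            raise ValueError(f"Expected {self.schema.__name__} instance")
        text = json.dumps(asdict(val), separators=(",", ":"), ensure_ascii=False)
        self.sdb[key] = text.encode("utf-8")

    def get(self, key: bytes):
        raw = self.sdb.get(key)
        if raw is None:
            return None
        # Flat schemas only; nested dataclasses would need a recursive rebuild.
        return self.schema(**json.loads(raw.decode("utf-8")))

    def rem(self, key: bytes) -> bool:
        return self.sdb.pop(key, None) is not None


env = {}  # stand-in for an LMDB environment
records = Komer(env, subkey=b"records.", schema=Record)
records.put(b"alpha", Record(name="first", count=2))
fetched = records.get(b"alpha")
removed = records.rem(b"alpha")
after = records.get(b"alpha")
```

Each Komer holds exactly one schema and one sub-database, so callers work with typed instances and never see the serialized bytes.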