-
Notifications
You must be signed in to change notification settings - Fork 629
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Supporting protobuf preserving unknown fields #2655
Comments
@xiaozhikang0916 XML has this option. Both for tags and attributes (but they need to be stored separately). What you need is:
|
It is required for this feature, but I will keep it an internal interface, because I believe using protobuf in active code should be restricted in the scope of serializable class, and map-like structure ( with integer key of proto id ) is not dev-friendly. |
@xiaozhikang0916 You probably want some way to specify which numbers could be omitted (deprecated fields), or whether to write these values at all (changes in known attributes may change the validity of the opaque unsupported attributes). |
@pdvrieze I got your point, but IMO an option to omit, or ignore any fields ( deprecated or not ) is also a breach of the protobuf spec. If a deprecated field is not intented to be used by developers, just omit it in the declaration or keep the property internal/private. In anyway, this field should present in the wire date, and the wire data is not accessible to the user. |
@xiaozhikang0916 What I'm considering is the case where the data is read, processed and serialized. In some cases it is semantically valid to retain the (extra) data, in others it is not. It may be possible to make a copy of the container without the "unknown field" data to avoid serializing invalid data, but this may be challenging for deeply nested hierarchies. As a protocol the protobuf specification can not say anything of program semantics. The semantics are per definition program specific, only the developers of the program can decide what the correct way is to handle unknown data. The protobuf specification only states that fields are preserved during parsing and serialization, this is not about reading in data, doing some application specific stuff and then saving it again, rather it is to do with transformation of data through different paths. As to the deprecated fields, an example there is that you have a deprecated localtime field, as well as a current utcTime field. In the example the localTime is no longer present in the data type. When then the utcTime is updated the localTime is no longer valid (the time changed), but because the application no longer knows about localtime it cannot update this "unknown field". The only correct behaviour in that case is to not emit unknown (but possibly incorrect) fields. In other cases, say you want to add security headers to a message, it is perfectly valid to just encapsulate an opaque message and preserve all fields, it depends on the context. What I'm suggesting is a way (either using annotations or as a format option) to configure the behaviour. |
This is kind of critical, and should also work for JSON. Most JSON libraries, have first-class support for this (e.g. in .NET) |
@Flavien The challenge here is that the serialization library (and its formats) is designed to support serialization and deserialization (like most serialization libraries there is a bias towards the code side, not the wire format, although the bias is small). |
It is pretty good to have such feature in more formats, but it needs efforts on every formats. And if it will be implemented in multiple formats e.g. xml (already have), protobuf (planned in this issue) and json (requested in #2698 and #2701), it is worthy to have some support in the base interface defined in core lib. |
@xiaozhikang0916 In principle library support is good, but in this case the only element I can really see is to provide an annotation. The problem is that the actual data type that needs to be used is format dependent, so I don't think this can be supported well in a format/agnostic way (if a generic supertype is used then it can't be encoded - especially by a different format) |
I would like to support preserving unknown fields for ProtoBuf, here is an issue for discussion beforehand.
Motivation
Basically it is required in protobuf spec, so it needs to be implemented to make
kotlinx.serialization
a valid protobuf parser.Basic design
A new helper class will be introduced
Users can add to class property according to their needs:
Behaviours
UnknownFieldsHolder
when deserializing;UnknownFieldsHolder
should be written back to wire as is during serializing;oneof
field, the new value will be stored inUnknownFieldsHolder
, leaving the @ProtoOneOf property to be null;Further more
Enumeration
Enums in protobuf can also have
unknown values
in old version of code, but it is much more easier to store it in sealed interface / objects styles:Other formats
Do other formats need similar functionality?
The text was updated successfully, but these errors were encountered: