Jackson databind changes for draft v4 schema generator #838
protected String idFromClass(Class<?> clazz)
{
if(clazz==null){
Would this actually work? The caller assumes value is not null, since it calls value.getClass(). Is this method called directly with null from some place?
Alternatively, the call above could pass null for a null instance.
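One way to read the alternative suggested here is a null-safe delegation from the value-based method to the class-based one. The following is only a minimal sketch with a hypothetical stand-in class (ClassIdSketch is not Jackson's actual resolver), assuming the id is simply the class name:

```java
// Hypothetical stand-in for a class-name based id resolver; not Jackson's API.
final class ClassIdSketch {

    // Value-based entry point: passes null through instead of calling
    // value.getClass() on a null reference.
    static String idFromValue(Object value) {
        return idFromClass(value == null ? null : value.getClass());
    }

    // Class-based entry point, usable when only the type is known
    // (for example during schema generation, where no instance exists).
    static String idFromClass(Class<?> clazz) {
        if (clazz == null) {
            return null;
        }
        return clazz.getName();
    }
}
```

With this shape, a null value yields a null id rather than a NullPointerException, and callers that only have a Class can skip the value-based method entirely.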
Fair point albeit it is backward compatible :-) Previous impl:
@Override
public String idFromValue(Object value)
{
Class<?> cls = _typeFactory.constructType(value.getClass()).getRawClass();
Please let me know if you want this changed.
Correct, I am just trying to understand what is the intended change -- not saying it breaks anything that wasn't already broken. But I guess the answer is seen below... so never mind.
I am ok without changes.
Apologies if I was not clear enough; let me try to explain.
As you are well aware, when Jackson generates JSON documents from Java objects, the generated JSON document itself might not carry enough type information to be turned back into a Java object. In that case an optional typer can be provided, which enriches the JSON document with additional type information so the deserializer knows which class to instantiate.
Now, from the schema generator's point of view: if the generator does not take this additional metadata into consideration, then the JSON document will no longer be valid against the generated schema. To solve this, the schema generator needs to know exactly how Jackson will generate JSON. To avoid duplication, I allow the same typer that the original mapper uses to be provided to the schema mapper. In schema generation mode I do not have access to any instances of the class for which the schema is being generated; I have to operate purely on the class itself. Without this change I cannot extract the typeId used. The added test case shows how the generator uses this to decide what type metadata is added and, if the typeId is known, to restrict the allowed values to it.
Please let me know if you need further information.
Zoltan
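To illustrate the last point, here is a rough sketch of how a known type id could restrict the allowed values of the type property in a generated schema. The class and method names are hypothetical, and the schema is modelled as a plain Map rather than the generator's real schema classes:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the real generator works on Jackson's schema types.
final class TypeIdPropertySketch {

    // Builds the schema fragment for the property that carries the type id.
    // When the id is known, only that single value is allowed ("enum").
    static Map<String, Object> typePropertySchema(String typeId) {
        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("type", "string");
        if (typeId != null) {
            schema.put("enum", Collections.singletonList(typeId));
        }
        return schema;
    }
}
```

If the id cannot be determined from the class alone, the fragment simply stays an unrestricted string.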
I understand the need, from the perspective that while traversal is based on serialization, in many aspects it really relates more to deserialization, because there is no access to instances, just types.
I am just trying to see how the pieces fit together: looking at the code I can see how it relates to JsonSerializer, which is the entry point. I didn't think visitors exposed information directly.
Hi,
it looks like my follow-up commits are also part of this pull request (GitHub newbie here...). These are all needed for the v4 generator; let me explain what they are:
I) Support for array typed JSON typeIDs
{
"type": [
"string",
"number",
"integer",
"boolean",
"object",
"array",
"null"
]
}
This is causing issues with the schema parser. The schema parser uses the type property as typeId, but the current Jackson parser assumes the typeId is a String and can't deal with arrays. To support it I had to change the AsPropertyTypeDeserializer to allow for String arrays. My change also exposed a bug in the TokenBuffer. Unit tests for both features are attached.
II) Expose SerializerFactory on SerializerProvider
III) Unit test to ensure custom KeySerializer is indeed invoked when seria…
Regards,
I'm sorry but I don't think I want to change type handling to support arrays of types, just to support JSON Schema. Jackson itself wouldn't have use for union types, as Java has no way of expressing or using such. JSON Schema does define those, which I think is a mistake, but it is what it is. On I could check in some of the fixes separately; esp. one for
Looking at KeySerializer test: testing wrt
Turns out your test managed to reproduce #848, which I just fixed. Good catch!
I) Support for arrays of type
@JsonTypeName("Number")
@JsonSubTypes({@JsonSubTypes.Type(Integer.class),
@JsonSubTypes.Type(Double.class)})
interface NumberType{
}
@JsonTypeName("Double")
interface DoubleType{
}
@JsonTypeName("Integer")
interface IntegerType{
}
With the following mappings
ObjectMapper mapper = new ObjectMapper();
mapper.addMixIn(Number.class,UnionType.class);
mapper.addMixIn(Double.class,DoubleType.class);
mapper.addMixIn(Integer.class,IntegerType.class);
... would basically define any Number instance to be either a number type or an integer type.
{
"type": [
"number",
"integer"
],
"definitions": {
"Integer": {
"type": "integer"
},
"Double": {
"type": "number"
}
},
"anyOf": [
{
"$ref": "#/definitions/Double"
},
{
"$ref": "#/definitions/Integer"
}
]
}
(I understand that number includes integer, so the above schema could be much simpler, but the point I am trying to make is that usage of marker interfaces, topped with JsonSubTypes, will indeed yield JSON documents where the type of the document, and of the corresponding Java object, depends on the actual message.)
II) On SerializerFactory
/**
* Method called to create a type information serializer for given base type,
* if one is needed. If not needed (no polymorphic handling configured), should
* return null.
*
* @param baseType Declared type to use as the base type for type information serializer
*
* @return Type serializer to use for the base type, if one is needed; null if not.
*/
public abstract TypeSerializer createTypeSerializer(SerializationConfig config,
JavaType baseType)
throws JsonMappingException;
Given these requirements, what would be your recommendation?
III) Key Serializer
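For reference, the schema shown under item I could be assembled from a list of named subtypes roughly as follows. This is only a sketch under stated assumptions: the class name is hypothetical, the schema is a plain Map, and the subtype-to-JSON-type mapping is supplied by the caller rather than discovered from annotations:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that builds a "type" array plus "definitions"/"anyOf".
final class AnyOfSchemaSketch {

    // subtypes maps a type name (e.g. "Integer") to its JSON Schema
    // primitive type (e.g. "integer").
    static Map<String, Object> unionSchema(Map<String, String> subtypes) {
        List<String> types = new ArrayList<>();
        Map<String, Object> definitions = new LinkedHashMap<>();
        List<Map<String, Object>> anyOf = new ArrayList<>();

        for (Map.Entry<String, String> entry : subtypes.entrySet()) {
            // Each distinct primitive type appears once in the "type" array.
            if (!types.contains(entry.getValue())) {
                types.add(entry.getValue());
            }
            Map<String, Object> definition = new LinkedHashMap<>();
            definition.put("type", entry.getValue());
            definitions.put(entry.getKey(), definition);

            Map<String, Object> ref = new LinkedHashMap<>();
            ref.put("$ref", "#/definitions/" + entry.getKey());
            anyOf.add(ref);
        }

        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("type", types);
        schema.put("definitions", definitions);
        schema.put("anyOf", anyOf);
        return schema;
    }
}
```

Passing Double mapped to number and Integer mapped to integer produces the same three top-level keys as the schema above.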
I don't think that creates union type, at least to my understanding of union types, although I am actually not sure I fully follow the example (I am guessing you are actually thinking Jackson does more work than it really does here, wrt mix-in annotations). But maybe that is a bit irrelevant... what I am trying to figure out is just the question of what actual use for multiple type ids would be, for Java/JSON databinding, not including types that JSON Schema can express (and for which JSON may exist) but that do not have translation to Java. Like that "String or Array of Strings", which in Java world would just have to be more general "any java.lang.Object". But as to So.... it may be that more access needs to be given. Interestingly enough Long story short: I am fine with more access to get
FWIW, I have a feeling that I do not quite understand the part about "typeId as JSON array".
Hi,
I) SerializerFactory
II) typeId as JSON Array
@JsonTypeName("UnionType")
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME,include= JsonTypeInfo.As.WRAPPER_OBJECT)
@JsonSubTypes({@JsonSubTypes.Type(StringType.class),
@JsonSubTypes.Type(StringArrayType.class)})
public static abstract class UnionType {
@JsonCreator
public static StringType createStringType(String string){
return new StringType(string);
}
@JsonCreator
public static StringArrayType createStringArrayType(String[] stringArray){
return new StringArrayType(stringArray);
}
}
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME,include= JsonTypeInfo.As.WRAPPER_OBJECT)
public static class StringType extends UnionType {
@JsonIgnore
private String myString;
@JsonCreator
public StringType(String myString) {
this.myString = myString;
}
@JsonValue
public String getMyString() {
return myString;
}
}
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME,include= JsonTypeInfo.As.WRAPPER_OBJECT)
public static class StringArrayType extends UnionType {
@JsonCreator
public StringArrayType(String[] myStringArray) {
this.myStringArray = myStringArray;
}
@JsonIgnore
private String[] myStringArray;
@JsonValue
public String[] myStringArray() {
return myStringArray;
}
}
@org.junit.Test
public void testUnionType() throws IOException {
ObjectMapper mapper = new ObjectMapper();
UnionType myArray = new StringArrayType(new String[]{"5", "6"});
UnionType stringType = new StringType("4");
System.out.println(mapper.writerFor(UnionType.class).writeValueAsString(myArray));
System.out.println(mapper.writerFor(UnionType.class).writeValueAsString(stringType));
UnionType array = mapper.readValue("{\"UnionType\" : [\"5\",\"6,\"]}", UnionType.class);
UnionType string = mapper.readValue("{\"UnionType\" : \"5\"}", UnionType.class);
System.out.println("Array is instance of StringArrayType " + (array instanceof StringArrayType));
System.out.println("string is instance of StringType " + (string instanceof StringType));
}
which yields
Going through this example I seem to have uncovered a bug. Assuming the example printed what I think is the correct result:
{"UnionType":["5","6"]}
{"UnionType" : "4"}
then the following schema would apply:
{
"type": "object",
"properties":{
"UnionType" : {
"type" : ["string","array"]
}
}
}
If you take away the encoded type information the case would be even more obvious, although that would be a one-way street: deserialization would fail ...
I do not want to argue for too long :) but this type id as array is a real blocker for me. If not for the above, which we can argue whether it needs to be supported or not, the other use case I have (as I think I mentioned before) is the support for "any". The "any" keyword has been replaced by an array of types in v4, and my generator is basically stuck if it encounters Object or an interface for which no type information is provided. If supporting arrays of types in core Jackson is not an option, what would you recommend here instead? Can we do something like a custom typeId parser for which I can supply my own implementation? (Unfortunately my understanding of Jackson internals is somewhat limited, so any guidance here would be highly appreciated!)
Hope you understand the need here; once we have an agreement I'll create a new pull request with the changes discussed.
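To make the "any" case concrete: a generator that meets Object, or an interface without type information, could fall back to the full array of types listed earlier in this thread. A minimal sketch, with a hypothetical class name and the schema modelled as a plain Map:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical fallback for untyped values in a draft v4 style schema.
final class AnyTypeSketch {

    // The seven type names from the "type" array quoted earlier in the thread.
    static final List<String> ALL_TYPES = Collections.unmodifiableList(Arrays.asList(
            "string", "number", "integer", "boolean", "object", "array", "null"));

    // Schema fragment for a value about which nothing is known.
    static Map<String, Object> schemaForUntyped() {
        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("type", ALL_TYPES);
        return schema;
    }
}
```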
Ok, sorry, I think we have a mix-up here: my earlier comments referred to Type Ids, for which only JSON String values are supported, and not arrays of Strings. Maybe I misread some part of the examples, but it seemed like the type id handler was expecting actual JSON Array values, and this would be problematic.
As to array types, Jackson supports them although there are some limitations due to the fact that there is no matching explicit Type id of "String;" is indeed peculiar, and since no explicit type name was specified (either in
Now: as to how to find all the type names; I think this is not something that should be bolted onto an API that translates an individual type id to/from a class, but rather an accessor that would give out the mapping, if known. Whether that should be accessible via
Still: there may be an additional problem here. Whereas Jackson (and typical JSON representation) is more interested in how JSON data binds to polymorphic types, JSON Schema seems to focus more on the JSON representation. The problem is that it is possible to have various relationships between a Java type and a JSON structural type: for example, some date/time types may be serialized as either JSON Strings, JSON (integer) Numbers, or as JSON Arrays; but still be represented by a single Java type.
"Ok, sorry, I think we have mix up here: my earlier comments referred to Type Ids, for which only JSON String values are supported, and not arrays of Strings. Maybe I misread some part of examples, but it seemed like type id handler was expecting actual JSON Array values, and this would be problematic."
No mix-up here, this is indeed about supporting arrays of String Type IDs. To create the Java representation of a JSON schema, the implementation uses the "type" property to map between JSON documents and Java classes:
{"type":"string"} => StringSchema
{"type":"integer"} => IntegerSchema
{"type":"object"} => ObjectSchema
The problem comes when the resulting JSON document can have more than one type, hence the resulting JSON schema needs to have an array of types:
{"type":["string","integer","object"]} => PolymorphicObjectSchema
To achieve the above mapping I am using a custom TypeId resolver (the same as what came with the v3 schema). The problem here is that the JSON parser assumes that the type id is a simple string, hence my type id resolver will receive "[" to resolve against, with the actual content of the "type" property lost. Without changing AsPropertyTypeDeserializer to parse the entire "type" property I cannot deserialize my PolymorphicObjectSchema. One solution could be to use a different property than "type" for the mapping, but in that case the v4 deserializer would not be able to parse legitimate schemas created by other applications.
There are a few use cases for creating JSON schemas like that:
{
"$ref": "#/definitions/___NUMBER_WTIH_NON_NUMERIC_VALUES___",
"definitions": {
"___NUMBER___": {
"id": "urn:jsonschema:___NUMBER___",
"type": "number"
},
"___NUMBER_WTIH_NON_NUMERIC_VALUES___": {
"id": "urn:jsonschema:___NUMBER_WTIH_NON_NUMERIC_VALUES___",
"type": [
"string",
"number"
],
"oneOf": [
{
"$ref": "#/definitions/___NUMBER___"
},
{
"$ref": "#/definitions/__ALLOWED_NON_NUMERIC_VALUES___"
}
]
},
"__ALLOWED_NON_NUMERIC_VALUES___": {
"id": "urn:jsonschema:__ALLOWED_NON_NUMERIC_VALUES___",
"type": "string",
"enum": [
"INF",
"-INF",
"-Infinity",
"Infinity",
"NaN"
]
}
}
}
III) Java objects with multiple representations (think Joda time classes, which can be serialized as strings or as longs)
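The mapping from the "type" property to a schema class described in this comment can be sketched as a simple dispatch. The schema class names come from the comment itself; everything else (the helper class, returning names instead of instances, treating the parsed value as String or List) is an assumption made for illustration:

```java
import java.util.List;

// Hypothetical dispatch on an already parsed "type" value.
final class SchemaKindSketch {

    // Returns the name of the schema class a "type" value would map to.
    // A String selects a single schema class, a List selects the
    // polymorphic one.
    static String schemaClassFor(Object typeValue) {
        if (typeValue instanceof List) {
            return "PolymorphicObjectSchema";
        }
        if (typeValue instanceof String) {
            switch ((String) typeValue) {
                case "string":
                    return "StringSchema";
                case "integer":
                    return "IntegerSchema";
                case "object":
                    return "ObjectSchema";
                default:
                    throw new IllegalArgumentException("Unmapped type: " + typeValue);
            }
        }
        throw new IllegalArgumentException("Unsupported \"type\" value: " + typeValue);
    }
}
```

The dispatch only works if the whole value of "type" reaches the resolver, which is exactly what is lost when the resolver is handed just the "[" token.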
One more thing... I have the v4 generator working and tested on our quite large and complex configuration code base (JSON-based). From a contribution point of view this typeId resolution is the last missing piece (together with the agreed change on SerializerFactory), and once this issue is resolved I can contribute it back ;-)
Jackson's type identifier is not intended to enumerate the possible types that a property might have, but rather the specific type that an instance document has. So it sounds like use of the type id is not the right way to go about JSON Schema generation: the type id included in JSON, passed by type id resolvers, is the specific type that the value is created from, and is intended to be bound back as. It is not meant to be the set of possible types that the property might have.
The question then is how to access this information. This should be accessible by resolvers, but not by trying to create virtual instances and asking for resolutions. Direct accessors are acceptable.
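As a rough picture of what a "direct accessor" could look like, the sketch below keeps the per-instance translation (one id for one class) separate from an accessor that hands out the whole known mapping. The class and method names are hypothetical stand-ins, not Jackson's resolver API:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical name-based registry; a stand-in for a type id resolver.
final class TypeNameRegistrySketch {

    private final Map<String, Class<?>> idToType = new LinkedHashMap<>();
    private final Map<Class<?>, String> typeToId = new LinkedHashMap<>();

    void register(String id, Class<?> type) {
        idToType.put(id, type);
        typeToId.put(type, id);
    }

    // Per-instance translation: the one id a given class is written with.
    String idFromClass(Class<?> type) {
        return typeToId.get(type);
    }

    // Per-instance translation: the one class a given id binds back to.
    Class<?> typeFromId(String id) {
        return idToType.get(id);
    }

    // Direct accessor: read-only view of every known id, so a schema
    // generator never has to fabricate instances to discover them.
    Map<String, Class<?>> knownTypeIds() {
        return Collections.unmodifiableMap(idToType);
    }
}
```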
Aha! Your comment made me find @JsonTypeResolver, where I can pass in my custom TypeResolverBuilder, which in turn can return my custom TypeDeserializer :-). The only problem is that AsPropertyTypeDeserializer defines the following function as final:
@SuppressWarnings("resource")
protected final Object _deserializeTypedForId(JsonParser jp, DeserializationContext ctxt, TokenBuffer tb) throws IOException
If you agree on relaxing that I should have everything I need :) Please let me know.
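Why the final modifier matters can be shown with two simplified stand-in classes (hypothetical, and much smaller than the real deserializers): the base class accepts only String ids, and a subclass can add array support only because the hook it overrides is not final.

```java
import java.util.List;

// Hypothetical stand-in for a type deserializer that only accepts String ids.
class StringIdDeserializerSketch {

    // Fixed overall flow: resolve the id, then bind the payload to it.
    final String deserialize(Object rawTypeId, Object payload) {
        return resolveTypeId(rawTypeId) + ":" + payload;
    }

    // Overridable hook. If this were declared final, the subclass below
    // would not compile.
    protected String resolveTypeId(Object rawTypeId) {
        if (rawTypeId instanceof String) {
            return (String) rawTypeId;
        }
        throw new IllegalArgumentException("Type id must be a String: " + rawTypeId);
    }
}

// Custom subclass that also understands a list of ids.
class ArrayAwareDeserializerSketch extends StringIdDeserializerSketch {

    @Override
    protected String resolveTypeId(Object rawTypeId) {
        if (rawTypeId instanceof List) {
            StringBuilder joined = new StringBuilder();
            for (Object id : (List<?>) rawTypeId) {
                if (joined.length() > 0) {
                    joined.append('|');
                }
                // Each element must still be a plain String id.
                joined.append(super.resolveTypeId(id));
            }
            return joined.toString();
        }
        return super.resolveTypeId(rawTypeId);
    }
}
```

The core class keeps its String-only behaviour, and the array handling lives entirely in the custom subclass.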
Sure, I'll go ahead and make it non-final right away.
Done.
Thank you!
I) Re Exposing SerializerFactory
TypeSerializer typeSerializer = null;
try {
typeSerializer = this.provider.getSerializerFactory().createTypeSerializer(this.provider.getConfig(), originalType);
} catch (JsonMappingException e) {
//Can't get serializer, just return...
return TypeInfo.NOT_AVAILABLE;
}
Are you ok with adding a method to SerializerProvider which just delegates to the underlying SerializerFactory? If yes, let me know and I will create a separate pull request.
II) V4 generator
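The delegation proposed under item I might look roughly like the sketch below. All types here are hypothetical stand-ins (a String plays the role of the type serializer, and findTypeSerializer is an assumed name, not a confirmed Jackson method); the point is only that callers would no longer need to reach the factory themselves:

```java
// Hypothetical checked exception standing in for a mapping failure.
final class TypeSerializerLookupException extends Exception {
    TypeSerializerLookupException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for the factory that creates type serializers.
interface SerializerFactorySketch {
    String createTypeSerializer(Class<?> baseType) throws TypeSerializerLookupException;
}

// Hypothetical stand-in for the provider, exposing a delegating method.
final class SerializerProviderSketch {
    private final SerializerFactorySketch factory;

    SerializerProviderSketch(SerializerFactorySketch factory) {
        this.factory = factory;
    }

    // Pure delegation: no logic of its own, just forwards to the factory.
    String findTypeSerializer(Class<?> baseType) throws TypeSerializerLookupException {
        return factory.createTypeSerializer(baseType);
    }
}

// Caller side, mirroring the try/catch fallback in the snippet above.
final class TypeInfoLookupSketch {
    static final String NOT_AVAILABLE = "NOT_AVAILABLE";

    static String typeInfoFor(SerializerProviderSketch provider, Class<?> type) {
        try {
            String serializer = provider.findTypeSerializer(type);
            return serializer == null ? NOT_AVAILABLE : serializer;
        } catch (TypeSerializerLookupException e) {
            // Can't get a serializer: report that no type info is available.
            return NOT_AVAILABLE;
        }
    }
}
```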
@zoliszel Yes, I think ability to do, say, On v4; I think there are two main approaches that would make sense to me:
I do think that it's better to have separate jars for the two, as I assume that users would typically migrate from one to the other. But also make sure that the Java packages used are differently named, to allow use of both concurrently, if necessary. As the very first step, maybe you could just create a GitHub repo for v4 and work on that; then we can see about merging projects. This way you would have full access from the beginning and could release versions, regardless of how busy I am. Once projects are regrouped I will of course give you full access where necessary as well.
Pull request for TypeNameIdResolver, please let me know if you need anything more.