Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to get subschemas of CombinedSchema in insertion order #519

Closed
wants to merge 1 commit into from

Conversation

rayokota
Copy link
Contributor

@rayokota rayokota commented Feb 1, 2025

Add the ability to get subschemas of CombinedSchema in insertion order.

For context, the code in Confluent Schema Registry is currently using version 1.14.3 of this library. We found that when we upgraded to 1.14.4 some of our code no longer worked because we were relying on the order of schemas returned from CombinedSchema.getSubschemas(). That order changed in 1.14.4 due to #498. It turns out the order was not deterministic anyway for the leading subschemas, but we hadn't yet hit this problem. But now while the order is deterministic for any one specific JVM execution, it may be different between different executions and different JVMs because it uses the hashCode method to sort the subschemas.

Our code works best with the insertion order, which is deterministic (across executions and JVMs) for a given CombinedSchema. Therefore a modest workaround that works for us is to add a CombinedSchema.getOrderedSubschemas method and for the Confluent code to rely on that method instead of CombinedSchema.getSubschemas.

@erosb
Copy link
Contributor

erosb commented Feb 1, 2025

Hello @rayokota I thought you already migrated to erosb/json-sKema, didn't you?

@rayokota
Copy link
Contributor Author

rayokota commented Feb 1, 2025

Hi @erosb , we still use this library for the older drafts, and also for the object model it provides.

@erosb
Copy link
Contributor

erosb commented Feb 2, 2025

Hello @rayokota does this issue have an originator issue in the schema registry GH repo? Or can you point me to any source where the current behavior causes a problem?

@rayokota
Copy link
Contributor Author

rayokota commented Feb 2, 2025

Thanks @erosb , I've filed confluentinc/schema-registry#3536 to track. I'm happy to answer any further questions you might have.

@rayokota
Copy link
Contributor Author

rayokota commented Feb 4, 2025

Hi @erosb , do you have any more feedback/questions on this PR?

@erosb
Copy link
Contributor

erosb commented Feb 5, 2025

Hello @rayokota I'm thinking about if it is possible to fix this issue without introducing an other public method on CombinedSchema. I pushed a commit into a separate branch, trying to implement a comparator which remembers the original ordering of subschemas, but so far it doesn't work (the CombinedSchemaTest#subschemasSort() fails on this change).

@rayokota
Copy link
Contributor Author

rayokota commented Feb 5, 2025

Thanks for looking into this @erosb ! I really appreciate it.

Anyway, I have an alternative PR that does not introduce another public method, but simply restores the insertion order (which used to be the original behavior): #522

erosb pushed a commit that referenced this pull request Feb 6, 2025
Maintain insertion order for subschemas of CombinedSchema.

This is an alternative to #519 that maintains the subschemas in insertion order.

The subschemas were originally maintained in insertion order, but it was changed due to #405, which ensured that subschemas that were of type CombinedSchema were visited first during validation, and by #498, which tries to maintain a stable order for "equivalent" instances.

This PR simply restores the insertion order, but maintains an internally sorted collection for the two purposes listed above, namely visiting subschemas during validation and equivalence checks during equals/hashCode.
@erosb erosb closed this Feb 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants