Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Order of meta JSON attributes changed on persisting #2621

Closed
jonatanschneider opened this issue Jan 18, 2023 · 5 comments
Closed

Order of meta JSON attributes changed on persisting #2621

jonatanschneider opened this issue Jan 18, 2023 · 5 comments

Comments

@jonatanschneider
Copy link

Hi,
I'm currently using npgsql with the entity framework core adapter to persist data into a JSONB column in a postgres database.
For my use-case I need polymorphic (de-)serialization. As of .NET 7 System.Text.Json introduced type discriminators for this purpose.

The problem:
The persisted object in the database is different from the output System.Text.Json produces when directly calling the Serializer. In more detail, meta-elements are not the first attributes anymore, which leads to an exception on deserialization.

Consider the following example:

Calling the JsonSerializer directly with Serialize produces the following output:

{
  "$id": "1",
  "$type": "base",
  "items": {
    "$id": "2",
    "$values": [
      {
        "$id": "3",
        "$type": "child1",
        "value1": "Test",
        "items": null,
        "test": 0
      }
    ]
  },
  "test": 0
}

Take note that $id and $type are always the first elements in an object.
However after calling SaveChanges, the following data is stored in the database:

{
  "$id": "1",
  "$values": [
    {
      "$id": "2",
      "test": 0,
      "$type": "child1",
      "items": null,
      "value1": "Test"
    }
  ]
}

As you can see the inner object of "Child1" has a different order of the meta-elements. This leads (of course) to an error on deserialization:

Unhandled exception. System.Text.Json.JsonException: The metadata property is either not supported by the type or is not the first property in the deserialized JSON object. Path: $[0].$type

Here is my code, used for this example:

The model classes:

[JsonDerivedType(typeof(Parent), typeDiscriminator: "base")]
[JsonDerivedType(typeof(Child1), typeDiscriminator: "child1")]

[Table("parent")]
public class Parent
{
    [Column(TypeName = "jsonb")]
    public List<Parent> items { get; set; }
    
    [Key] public int test { get; set; }
}

public class Child1: Parent
{
    public string value1 { get; set; }
}

Database setup

public class Database: DbContext
{
    public DbSet<Parent> Parent { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder.UseNpgsql(
            @"Host=[...]");
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Parent>()
            .Property(g => g.items)
            .HasColumnType("jsonb");
    }
}

For additional needed configuration of the System.Text.Json serializer I use the workaround as described in #1107 (comment)

public static void Main(string[] args)
    {
        var options = new JsonSerializerOptions
         {
             ReferenceHandler = ReferenceHandler.Preserve,
         };
        NpgsqlConnection.GlobalTypeMapper.AddTypeResolverFactory(new JsonOverrideTypeHandlerResolverFactory(options));
       

        var parent = new Parent();
        var child1 = new Child1();
        child1.value1 = "Test";
        
        parent.items = new List<Parent> { child1 };

        Console.WriteLine(System.Text.Json.JsonSerializer.Serialize(parent, options));
        
        using var ctx = new Database();
        ctx.Parent.Add(parent);
        ctx.SaveChanges();
        
        var result = Task.Run(async () =>
        {
            await using var ctx = new Database();
            return await ctx.Parent.ToListAsync();
        }).GetAwaiter().GetResult();
    }

Is this a known bug or did I configure something wrong?

@roji
Copy link
Member

roji commented Jan 18, 2023

@jonatanschneider that's expected, and is how the PostgreSQL jsonb type works; it's an efficient binary representation that explicitly doesn't conserve dictionary ordering. This is in contract to the json type, which persists the JSON document as a string, so ordering (and IIRC even whitespace) is preserved, but it less efficient in various ways. See the PG docs on this.

It's generally not recommended to rely on the ordering of object keys in JSON.

@jonatanschneider
Copy link
Author

@roji Thanks for the quick answer.

However, System.Text.Json relies on the specific order: https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json/polymorphism?pivots=dotnet-7-0
"The type discriminator must be placed at the start of the JSON object, grouped together with other metadata properties like $id and $ref."

In my short tests with Newtonsoft, I could observe the same behaviour, only that they don't crash but rather ignore all type values in such a case.

Therefore my only option is to use a JSON (de-)serializer which provides a read-ahead logic (as Newtonsoft does with an extra setting), am I correct?

@roji
Copy link
Member

roji commented Jan 18, 2023

Thanks for pointing that out - I wasn't aware this was added for 7.0.

This problem seems to be tracked on the System.Text.Json side in dotnet/runtime#72604 - I there are some workarounds listed there which may be workable for you. I hope you they do fix this properly though.

Note that you can still use the json type (as opposed to jsonb), though be aware of the perf implications.

I'm going to go ahead and close this as I don't think there's anything Npgsql- or EF- related, but I'd be happy to continue the conversation so don't hesitate to post back here.

@roji roji closed this as not planned Won't fix, can't repro, duplicate, stale Jan 18, 2023
@springy76
Copy link

Since retrieving data using npgsql already is entirely buffered (per row) you can switch easily to an intermediary JsonNode and fix the ordering there first, before deserializing the node to final objects.

public static JsonNode? EnsureTypeDiscriminatorIsFirstProperty(JsonNode? node, string discriminatorName = "$type")
{
	switch ( node )
	{
		case JsonArray array:
		{
			foreach ( var subNode in array )
				EnsureTypeDiscriminatorIsFirstProperty(subNode, discriminatorName);
			break;
		}
		case JsonObject obj:
		{
			if ( obj.Count == 0 || obj.First().Key == discriminatorName )
				break;

			var sorted = obj.OrderBy(kv => kv.Key != discriminatorName).ToList();
			obj.Clear();

			foreach ( var (propName, propNode) in sorted )
			{
				if ( propNode != null )
					EnsureTypeDiscriminatorIsFirstProperty(propNode, discriminatorName);
				obj.Add(propName, propNode);
			}

			break;
		}
	}

	return node;
}

@roji
Copy link
Member

roji commented Jan 27, 2023

Good point. That does introduce an additional perf overhead, but it's certainly a workaround.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants