Untyped serializer and complex types (class or struct) #586

Open
bellons91 opened this issue Jan 9, 2025 · 0 comments

Issue description

I'm experimenting with this library to work with dynamic columns.

Using the untyped serializer, I'm trying to work with complex types.

I want one of the fields to hold a dummy structure, User. It can be either a struct or a class; that doesn't matter as long as I can access its fields.

User as a Class

I first defined User as a class:

public class User {
     public string Name { get; set; }
}

I tried defining the ParquetSchema like this:

private readonly ParquetSchema schema = new (
    new DataField<string>("Customer"),
    new DataField<User>("User")
);

and like this:

private readonly ParquetSchema schema = new (
    new DataField<string>("Customer"),
    new DataField("User", typeof(User))
);

But in both cases, when I run the application, I get the following exception:

System.NotSupportedException: 'type Scripts.General.ParquetReadWriteWithDynamicFields+User is not supported'

User as a Struct

I then tried defining it as a struct:

public struct User {
    public string Name { get; set; }
}

Again, I get the same exception:

System.NotSupportedException: 'type Scripts.General.ParquetReadWriteWithDynamicFields+User is not supported'

I then tried another approach: using a StructField, as described here.

private readonly ParquetSchema schema = new (
  new DataField<string>("Customer")
  , new StructField("User", new DataField<string>("Name"))
);

Now the schema is accepted, but when I serialize the items this way:

using (var fs = File.Create(FilePath))
{
    await ParquetSerializer.SerializeAsync(
        schema: schema,
        data: items,
        destination: fs,
        options: options);
}

I get the following exception:

System.ApplicationException: 'failed to serialise data column 'User/Name
InvalidCastException: Unable to cast object of type 'User' to type 'System.Collections.Generic.IDictionary`2[System.String,System.Object]'.

Am I defining the schema incorrectly? Or, now that I'm serializing nested types, do I have to use the ParquetRowGroupWriter described in the documentation?
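
One possible reading of the InvalidCastException above is that, for a StructField, the untyped serializer expects the nested value to be an IDictionary<string, object> keyed by the child field names, not a User instance. Below is a minimal sketch of what such a row might look like. It is an assumption drawn only from the exception message, not confirmed library behavior; ToRow and UntypedRowSketch are hypothetical names used for illustration, and the sketch only builds the row without calling the library.

```csharp
using System;
using System.Collections.Generic;

public class User
{
    public string Name { get; set; } = "";
}

public static class UntypedRowSketch
{
    // Hypothetical helper: flattens a User into the nested-dictionary shape
    // that the InvalidCastException hints at (assumption, not verified
    // against the library).
    public static Dictionary<string, object> ToRow(string customer, User user) =>
        new Dictionary<string, object>
        {
            ["Customer"] = customer,
            // The "User" struct column becomes a dictionary keyed by its child field names.
            ["User"] = new Dictionary<string, object> { ["Name"] = user.Name },
        };

    public static void Main()
    {
        var row = ToRow("Contoso", new User { Name = "Alice" });

        // This is the cast that failed in the exception when the value was a User object.
        var nested = (IDictionary<string, object>)row["User"];
        Console.WriteLine(nested["Name"]); // prints "Alice"
    }
}
```

If that reading is right, items would need to be a collection of such rows instead of objects that contain a User.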
