[fn][compute] Add support for Duration data type #28

Open
aymkhalil opened this issue Oct 5, 2022 · 8 comments

@aymkhalil
Collaborator

aymkhalil commented Oct 5, 2022

Support ISO 8601 Duration type: https://en.m.wikipedia.org/wiki/ISO_8601#Durations

In Avro, this is supported as a built-in logical type: https://avro.apache.org/docs/1.10.2/spec.html#Duration

A few notes:

  • Avro's duration accepts months, days and milliseconds.
  • java.time.Duration accepts the ISO format PnDTnHnMn.nS (notice no months or years)
  • java.time.Period accepts the ISO format PnYnMnD (notice no hours, minutes, seconds, etc.)
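
To make the split between the two JDK types concrete, here is a minimal sketch using only `java.time`. The class and helper names (`IsoParseDemo`, `parsesAsDuration`, `parsesAsPeriod`) are made up for illustration and are not part of this project.

```java
import java.time.Duration;
import java.time.Period;
import java.time.format.DateTimeParseException;

class IsoParseDemo {
    /** Returns true if java.time.Duration accepts the given ISO 8601 text. */
    static boolean parsesAsDuration(String text) {
        try {
            Duration.parse(text);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    /** Returns true if java.time.Period accepts the given ISO 8601 text. */
    static boolean parsesAsPeriod(String text) {
        try {
            Period.parse(text);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Time-based amount: days, hours, minutes, seconds only.
        System.out.println(parsesAsDuration("P2DT3H4M5.5S")); // accepted
        System.out.println(parsesAsDuration("P1Y2M3D"));      // rejected: years/months

        // Date-based amount: years, months, days only.
        System.out.println(parsesAsPeriod("P1Y2M3D"));        // accepted
        System.out.println(parsesAsPeriod("PT4H5M"));         // rejected: time part
    }
}
```

Neither type accepts a combined string such as P1Y2M3DT4H5M6S on its own.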

So if we wanna support a full ISO8601 duration format (like PnYnMnDTnHnMnS) we can use:

Also, Avro doesn't have the Duration logical type in its API implementation (although it has been documented since Avro 1.8)
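
For reference, the Avro spec describes the duration logical type as a `fixed` of size 12 holding three little-endian unsigned 32-bit integers: months, days, and milliseconds. If the API does not provide the logical type, that layout would have to be handled by hand. The sketch below shows one way to do it with `ByteBuffer`; the class `AvroDurationBytes` and its methods are hypothetical names, not Avro API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class AvroDurationBytes {
    /** Packs months, days and millis into the 12-byte layout from the Avro spec. */
    static byte[] encode(long months, long days, long millis) {
        ByteBuffer buf = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(toUnsigned32(months));
        buf.putInt(toUnsigned32(days));
        buf.putInt(toUnsigned32(millis));
        return buf.array();
    }

    /** Unpacks a 12-byte fixed value into {months, days, millis}. */
    static long[] decode(byte[] fixed12) {
        if (fixed12.length != 12) {
            throw new IllegalArgumentException("expected 12 bytes, got " + fixed12.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(fixed12).order(ByteOrder.LITTLE_ENDIAN);
        return new long[] {
            Integer.toUnsignedLong(buf.getInt()),
            Integer.toUnsignedLong(buf.getInt()),
            Integer.toUnsignedLong(buf.getInt())
        };
    }

    /** Each component is an unsigned 32-bit value, so negatives are rejected. */
    private static int toUnsigned32(long value) {
        if (value < 0 || value > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("out of unsigned 32-bit range: " + value);
        }
        return (int) value;
    }

    public static void main(String[] args) {
        long[] parts = decode(encode(14, 3, 1500));
        System.out.println(parts[0] + " months, " + parts[1] + " days, " + parts[2] + " ms");
    }
}
```

Since the components are unsigned, a negative amount cannot be represented in this layout.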

@aymkhalil
Collaborator Author

Drafting duration logical type: 6b43217

@cbornet
Collaborator

cbornet commented Oct 11, 2022

> Drafting duration logical type: 6b43217

With that, we can add a field which is an AVRO duration to an AVRO datetime, is that correct?

@cbornet
Collaborator

cbornet commented Oct 11, 2022

> java.util.Duration accepts the ISO format PnYnMnD (notice no hours, min, seconds, etc)

Do you mean java.time.Period?

@cbornet
Collaborator

cbornet commented Oct 11, 2022

In the JDK, a Duration is an exact amount of time, so it can't be expressed in months or years because these have variable lengths.
For this reason, it's not possible to map an AVRO duration to a java.time.Duration.
A Period is a variable amount of time that takes into account leap days, daylight saving time and so on. It would be a more appropriate mapping for the AVRO duration logical type. I don't know why Period doesn't have hours, minutes and seconds.
Instead of using Joda, which is in maintenance mode, we could probably use the ThreeTen-Extra lib's PeriodDuration?
Note that the difference between Duration and Period only shows up on ZonedDateTime, which we don't support at the moment, so there shouldn't be much trouble.
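
A small `java.time` sketch of the ZonedDateTime point: adding "one day" as a Period versus as a Duration across a daylight saving change. The zone and date (Europe/Paris, 29 Oct 2022, the day before clocks went back) are picked only for illustration, and `PeriodVsDurationDemo` is a made-up class name.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.Period;
import java.time.ZoneId;
import java.time.ZonedDateTime;

class PeriodVsDurationDemo {
    // Noon in Paris on the day before the 2022 autumn clock change.
    static final LocalDateTime LOCAL_START = LocalDateTime.of(2022, 10, 29, 12, 0);
    static final ZonedDateTime ZONED_START =
            ZonedDateTime.of(LOCAL_START, ZoneId.of("Europe/Paris"));

    /** Adds one calendar day: the local time of day is preserved. */
    static ZonedDateTime plusOneDayPeriod() {
        return ZONED_START.plus(Period.ofDays(1));
    }

    /** Adds exactly 24 hours on the instant time-line. */
    static ZonedDateTime plusOneDayDuration() {
        return ZONED_START.plus(Duration.ofDays(1));
    }

    public static void main(String[] args) {
        // The two results differ by one hour because that night lasts 25 hours.
        System.out.println("Period:   " + plusOneDayPeriod());
        System.out.println("Duration: " + plusOneDayDuration());

        // Without a zone there is no clock change, so both give the same answer.
        System.out.println(LOCAL_START.plus(Period.ofDays(1))
                .equals(LOCAL_START.plus(Duration.ofDays(1))));
    }
}
```

With a zone-less LocalDateTime the two additions agree for days; months and years remain expressible only as a Period.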

@aymkhalil
Collaborator Author

Yes, in general we'll need a Period-based implementation. I'll take a look at the lib you proposed.

@aymkhalil
Collaborator Author

> Drafting duration logical type: 6b43217
>
> With that we can add a field which is an AVRO duration to an AVRO datetime, is that correct?

The draft PR only proposes a new type called DURATION that maps to AVRO's duration logical type. The date utility methods for adding still need to support it (for example, the plus method that the dateadd methods rely on understands absolute units of time but not periods).
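
As a rough idea of what period-aware addition could look like, the sketch below splits a full ISO 8601 string at the `T` into a `java.time.Period` and a `java.time.Duration`, then adds both to a `LocalDateTime`. It is a JDK-only stand-in for illustration: `FullIsoDuration` is a hypothetical class, not the draft PR's code and not ThreeTen-Extra's PeriodDuration API, and it assumes an uppercase `T` and no leading sign.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.Period;

class FullIsoDuration {
    final Period period;     // years, months, days
    final Duration duration; // hours, minutes, seconds

    FullIsoDuration(Period period, Duration duration) {
        this.period = period;
        this.duration = duration;
    }

    /** Parses text such as "P1Y2M3DT4H5M6S", "P1M" or "PT90M". */
    static FullIsoDuration parse(String text) {
        int t = text.indexOf('T');
        if (t < 0) {
            // Date part only.
            return new FullIsoDuration(Period.parse(text), Duration.ZERO);
        }
        String datePart = text.substring(0, t);     // e.g. "P1Y2M3D", or just "P"
        String timePart = "P" + text.substring(t);  // e.g. "PT4H5M6S"
        // Period.parse rejects a bare "P", so treat it as an empty period.
        Period p = datePart.equals("P") ? Period.ZERO : Period.parse(datePart);
        return new FullIsoDuration(p, Duration.parse(timePart));
    }

    /** Adds the calendar part first, then the exact time part. */
    LocalDateTime addTo(LocalDateTime dateTime) {
        return dateTime.plus(period).plus(duration);
    }

    public static void main(String[] args) {
        FullIsoDuration d = FullIsoDuration.parse("P1Y2M3DT4H5M6S");
        System.out.println(d.addTo(LocalDateTime.of(2022, 1, 1, 0, 0)));
    }
}
```

Adding the Period before the Duration is a design choice in this sketch; the order can matter near month ends, so a real implementation would need to pick and document one.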

@aymkhalil
Collaborator Author

I'm starting to realize that the Avro logical type called Duration is actually a Period in Java terms...

@cbornet
Collaborator

cbornet commented Oct 12, 2022

Yes it is.
