-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CALCITE-6551] Add DATE_FORMAT function (enabled in MySQL library) #3936
base: main
Are you sure you want to change the base?
Conversation
f.checkString("date_format(timestamp '1997-10-04 22:23:00.000000', '%c %D %e %f %h')", | ||
"10 4th 4 000000 10", | ||
"VARCHAR NOT NULL"); | ||
f.checkString("date_format(timestamp '1997-01-04 22:23:00', '%w %U %u %V %v %X %x')", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are 4 cases testing mysql's weekForYear in two modes.
https://stackoverflow.com/questions/30364141/mysql-convert-yearweek-to-date
// f.checkString("date_format('1999-13-01', '%X %V')", | ||
// null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MySQL supports this writing and will return null if the timestamp type is not declared and the parameter does not satisfy the timestamp format, which seems to be difficult to do in calcite
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what you need is to use an implicit cast to timestamp if the input is a CHAR data type.
you should be able to support that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, implicit cast to timestamp this can be done in calcite, but if this cast fails, calcite will throw an exception, but for mysql, it will return null.
This is something that I don't think is easy to implement. unless we add a Config that adds a Config in the checksum phase and returns NULL if the implicit cast fails. WDYT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, the date is incorrect.
In this case indeed you can probably not match exactly the behavior of mysql.
But I think you should still make it a negative test and leave the comment in.
@@ -4919,7 +4920,8 @@ void testBitGetFunc(SqlOperatorFixture f, String functionName) { | |||
} | |||
|
|||
@Test void testToCharPg() { | |||
final SqlOperatorFixture f = fixture().withLibrary(SqlLibrary.POSTGRESQL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that the default tester doesn't check the results, so I added this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We check the results by CalciteSqlOperatorTest. In SqlOperatorTest we check the return type.
a9118e5
to
3ee4432
Compare
The local time zone causes incorrect results and I need to check how to fix it. |
If you want deterministic tests I think you should not use the local timezone. |
@mihaibudiu is right, tests should be timezone agnostic, see Cassandra tests for an example. |
@mihaibudiu @asolimando I found this problem with all of this test classes, if the time zone is changed arbitrarily. For Calender's test, we need to set the time zone at this time like SqlOperatorTest. I'm not sure if this should be fixed in this PR . |
You can see a solution in e.g., #3805, where the timezone is set before the tests and reset afterwards. It's sufficient to make your own tests pass. |
I've fixed my test by keeping the data consistent with other tests. But I think in some time zones, some tests may not pass if we don't specify the time zone. |
@Override public void format(StringBuilder sb, Date date) { | ||
final Calendar calendar = Work.get().calendar; | ||
calendar.setTime(date); | ||
// It exceeds the precision of Calendar, we fill it with 0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Milliseconds are usually between 0 and 999.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an incorrect function description, it means microsecond here, I'll fix it
@@ -221,6 +245,33 @@ public enum FormatElementEnum implements FormatElement { | |||
sb.append(String.format(Locale.ROOT, "%02d", calendar.get(Calendar.WEEK_OF_YEAR))); | |||
} | |||
}, | |||
V("", "The number of the year (Monday as the first day of the week)," |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the comment seems wrong, it is about the week number
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, i will fix it
@@ -399,6 +480,27 @@ public enum FormatElementEnum implements FormatElement { | |||
sb.append(work.yyyyFormat.format(date)); | |||
} | |||
}, | |||
WFY("", "The year for the week where Sunday is the first day of the week") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should say "the week of the year"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The result returned by this symbol is the number of years, "the week of the year" might not be appropriate
sb.append(String.format(Locale.ROOT, "%04d", calendar.getWeekYear())); | ||
} | ||
}, | ||
WFY0("", "The year for the week where Monday is the first day of the week") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same problem
final Locale originalLocale = Locale.getDefault(); | ||
|
||
try { | ||
Locale.setDefault(Locale.US); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should not need to set the locale in the test if these functions are supposed to be deterministic.
Perhaps the function implementation should set the locale.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For datetime functions, the implementation should be based on the user's time zone, just as other arbitrary databases support setting time zones. And other test do it.
And other tests have done the same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are three distinct data types in Calcite:
TIMESTAMP
TIMESTAMP WITH LOCAL TIME ZONE
TIMESTAMP WITH TIME ZONE
You have to specify the type these functions operate on. I think you probably want functions for each data type. The ones for TIMESTAMP and TIMESTAMP WITH TIMEZONE should be deterministic. And the ones for LOCAL TIME ZONE should be implemented in terms of the other ones by setting the timezone first. But you should not rely on the Java system timezone for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this require more testing, including for BC dates.
// f.checkString("date_format('1999-13-01', '%X %V')", | ||
// null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what you need is to use an implicit cast to timestamp if the input is a CHAR data type.
you should be able to support that.
"VARCHAR NOT NULL"); | ||
f.checkString("date_format(timestamp '1997-10-04 22:23:00', '%V %v %X %x %%')", | ||
"39 40 1997 1997 %", | ||
"VARCHAR NOT NULL"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add some tests with dates before the Gregorian calendar?
I found that some of these fail.
Also, can you say how you validated the outputs?
Are these tested in MySQL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It throws an exception:Illegal TIMESTAMP literal, In the timestamp scenario, this is consistent with mysql's result.
Yes, I do these tests in mysql . In fact, I write these test cases in mysql and get the results, and then add them to test to validate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Gregorian Calendar starts in 1582: https://en.wikipedia.org/wiki/Gregorian_calendar
I don't expect that mysql fails in this case.
Perhaps it fails for dates BC.
|
||
try { | ||
Locale.setDefault(Locale.US); | ||
f.checkString("date_format(timestamp '2009-10-04 22:23:00', '%W %M %Y')", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does 0000-01-01 or BC date support format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this example in mysql.
SELECT DATE_FORMAT('0000-01-01', '%X %V');
return
0000-01
but
SELECT DATE_FORMAT(timestamp '0000-01-01', '%X %V');
return error: Incorrect DATETIME value: '0000-01-01'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It throws an exception:Illegal TIMESTAMP literal, In the timestamp scenario, this is consistent with mysql's result.
In the Calcite test, this throws an exception because it can't be converted to a timestamp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, Calcite has its own date standard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SELECT DATE_FORMAT(timestamp '0000-01-01', '%X %V');
This failed because '0000-01-01' is not a timestamp. You can try:
SELECT DATE_FORMAT(data '0000-01-01', '%X %V');
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SELECT DATE_FORMAT(timestamp '0000-01-01', '%X %V');
This failed because '0000-01-01' is not a timestamp. You can try:
SELECT DATE_FORMAT(data '0000-01-01', '%X %V');
0000-01-01 is not even a valid date according to the SQL standard. 0000-01-01 is not even a valid date according to the SQL standard. It is difficult to implement the date_format standard in all dialects. Maybe we need to make some decisions
Quality Gate passedIssues Measures |
@@ -2944,6 +2944,7 @@ In the following: | |||
| b | TIME_SUB(time, interval) | Returns the TIME value that is *interval* before *time* | |||
| b | TIME_TRUNC(time, timeUnit) | Truncates *time* to the granularity of *timeUnit*, rounding to the beginning of the unit | |||
| m o p r | TO_CHAR(timestamp, format) | Converts *timestamp* to a string using the format *format* | |||
| m | DATE_FORMAT(timestamp, format) | Converts *timestamp* to a string using the format *format* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do DATE-FORMAT and TO_CHAR have the same functional description? Are their functions confirmed to be consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -4919,7 +4920,8 @@ void testBitGetFunc(SqlOperatorFixture f, String functionName) { | |||
} | |||
|
|||
@Test void testToCharPg() { | |||
final SqlOperatorFixture f = fixture().withLibrary(SqlLibrary.POSTGRESQL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We check the results by CalciteSqlOperatorTest. In SqlOperatorTest we check the return type.
|
||
try { | ||
Locale.setDefault(Locale.US); | ||
f.checkString("date_format(timestamp '2009-10-04 22:23:00', '%W %M %Y')", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SELECT DATE_FORMAT(timestamp '0000-01-01', '%X %V');
This failed because '0000-01-01' is not a timestamp. You can try:
SELECT DATE_FORMAT(data '0000-01-01', '%X %V');
f.checkString("date_format(timestamp '1997-10-04 22:23:00', '%H %k %I %r %T %S %w')", | ||
"22 22 10 10:23:00 PM 22:23:00 00 6", | ||
"VARCHAR NOT NULL"); | ||
f.checkString("date_format(timestamp '1999-01-01', '%X %V')", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In MySQL,This will throw exception ERROR 1525 (HY000) at line 1: Incorrect DATETIME value: '1994-01-01'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I removed time to simplify testing Case, and in fact, Mysql doesn't support doing that. I'll add time to this test, but for this current test, I'm not sure we can be consistent with Mysql in Calcite
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We check the results by CalciteSqlOperatorTest. In SqlOperatorTest we check the return type.
It looks like CalciteSqlOperatorTest is better suited for this test, do I need to modify this test into CalciteSqlOperatorTest as this looks like it needs result checking too
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 90 days if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions. |
Document: https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_date-format