isOrdered results differ for NA_integer_ and NA_real_ #221

joshuaulrich · 2017-12-20T17:12:00Z

The isOrdered() function returns different results for integer and double input when the input contains an NA:

run.isOrdered <- function(x) {
  c(isOrdered(x, increasing =  TRUE, strictly =  TRUE),
    isOrdered(x, increasing =  TRUE, strictly = FALSE),
    isOrdered(x, increasing = FALSE, strictly = FALSE),
    isOrdered(x, increasing = FALSE, strictly =  TRUE))
}
### Why are these different?
# Integers are always FALSE
run.isOrdered(c(1L, NA_integer_, 1L))
# [1] FALSE FALSE FALSE FALSE
run.isOrdered(c(0L, NA_integer_, 1L))
# [1] FALSE FALSE FALSE FALSE
run.isOrdered(c(1L, NA_integer_, 0L))
# [1] FALSE FALSE FALSE FALSE

run.isOrdered(c(0L, 1L, NA_integer_, 2L))
# [1] FALSE FALSE FALSE FALSE
run.isOrdered(c(0L, 1L, NA_integer_, 1L))
# [1] FALSE FALSE FALSE FALSE
run.isOrdered(c(0L, 0L, NA_integer_, 1L))
# [1] FALSE FALSE FALSE FALSE
run.isOrdered(c(0L, 0L, NA_integer_, 0L))
# [1] FALSE FALSE FALSE FALSE

run.isOrdered(c(2L, NA_integer_, 1L, 0L))
# [1] FALSE FALSE FALSE FALSE
run.isOrdered(c(1L, NA_integer_, 1L, 0L))
# [1] FALSE FALSE FALSE FALSE
run.isOrdered(c(1L, NA_integer_, 0L, 0L))
# [1] FALSE FALSE FALSE FALSE
run.isOrdered(c(0L, NA_integer_, 0L, 0L))
# [1] FALSE FALSE FALSE FALSE


# Doubles are all over the place
run.isOrdered(c(1.0, NA_real_, 1.0))
# [1] TRUE TRUE TRUE TRUE
run.isOrdered(c(0.0, NA_real_, 1.0))
# [1] TRUE TRUE TRUE TRUE
run.isOrdered(c(1.0, NA_real_, 0.0))
# [1] TRUE TRUE TRUE TRUE

run.isOrdered(c(0.0, 1.0, NA_real_, 2.0))
# [1]  TRUE  TRUE FALSE FALSE
run.isOrdered(c(0.0, 1.0, NA_real_, 1.0))
# [1]  TRUE  TRUE FALSE FALSE
run.isOrdered(c(0.0, 0.0, NA_real_, 1.0))
# [1] FALSE  TRUE  TRUE FALSE
run.isOrdered(c(0.0, 0.0, NA_real_, 0.0))
# [1] FALSE  TRUE  TRUE FALSE

run.isOrdered(c(2.0, NA_real_, 1.0, 0.0))
# [1] FALSE FALSE  TRUE  TRUE
run.isOrdered(c(1.0, NA_real_, 1.0, 0.0))
# [1] FALSE FALSE  TRUE  TRUE
run.isOrdered(c(1.0, NA_real_, 0.0, 0.0))
# [1] FALSE  TRUE  TRUE FALSE
run.isOrdered(c(0.0, NA_real_, 0.0, 0.0))
# [1] FALSE  TRUE  TRUE FALSE


joshuaulrich · 2018-04-18T21:21:45Z

This behavior may affect user code in two places. Subsetting in [.xts calls isOrdered() on i and/or on the output of binsearch(), and period.apply() calls it on its INDEX argument. These calls are in xts.methods.R and periodicity.R, respectively.

The other calls in the grep output below are on the index attribute (or a vector that will be used as an index), so they shouldn't be allowed to contain NA anyway.
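Until the two types behave the same, user code that wants to avoid depending on either result could reject NA before the value reaches [.xts or period.apply(). A minimal base-R sketch follows; assert_no_na is a hypothetical helper name and is not part of xts.

```r
# Hypothetical helper (not an xts function): fail early if a vector that will
# be used as a subset index or as INDEX contains NA.
assert_no_na <- function(x, what = "index") {
  if (anyNA(x)) {
    stop(sprintf("%s contains NA; ordering checks on it are unreliable", what))
  }
  invisible(x)
}

assert_no_na(c(0L, 5L, 10L))            # passes silently
# assert_no_na(c(0L, NA, 10L))          # would signal an error
```

This only sidesteps the inconsistency on the caller's side; it says nothing about what isOrdered() itself should return for NA.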

> grep isOrdered *
align.time.R:  isOrdered(.index(x), strictly=TRUE)
align.time.R:  isOrdered(index(x), strictly=TRUE)
index.R:  if(!isOrdered(.index(x), strictly=FALSE))
isOrdered.R:`isOrdered` <- function(x, increasing=TRUE, strictly=TRUE) {
periodicity.R:    if(!isOrdered(INDEX)) {
periodicity.R:      # isOrdered returns FALSE if there are duplicates
xts.methods.R:          if(isOrdered(firstlast, strictly=FALSE)) # fixed non-match within range bug
xts.methods.R:    if(!isOrdered(i,strictly=FALSE)) {
xts.R:  if(!isOrdered(order.by, strictly=!unique)) {
xts.R:    if( !isOrdered(index, increasing=TRUE, strictly=unique) )
xts.R:    if( !isOrdered(index, increasing=TRUE, strictly=unique) )

TomAndrews · 2018-04-19T09:46:10Z

I think the reason that all the integer cases are FALSE is that NA_INTEGER = INT_MIN:
https://github.com/wch/r-source/blob/8a55192af9a65291afffb64c22b29801ea9151a6/src/include/R_ext/Arith.h#L49
So all those cases must be false since there's a very large negative number in the middle of all of them.

For doubles R is using IEEE NaN values:
https://github.com/wch/r-source/blob/8a55192af9a65291afffb64c22b29801ea9151a6/src/main/arithmetic.c#L112

According to the standard, any ordering comparison with NaN returns false:
https://en.wikipedia.org/wiki/NaN

This explains the double cases:

With three values, every comparison includes an NA, so the function falls through to the final return TRUE.
With four values, the two comparisons that include an NA don't trigger a return, so the results are the same as if you used only the two adjacent real values:

run.isOrdered(c(0.0, 1.0))
# [1]  TRUE  TRUE FALSE FALSE
run.isOrdered(c(0.0, 1.0))
# [1]  TRUE  TRUE FALSE FALSE
run.isOrdered(c(0.0, 0.0))
# [1] FALSE  TRUE  TRUE FALSE
run.isOrdered(c(0.0, 0.0))
# [1] FALSE  TRUE  TRUE FALSE

run.isOrdered(c(1.0, 0.0))
# [1] FALSE FALSE  TRUE  TRUE
run.isOrdered(c(1.0, 0.0))
# [1] FALSE FALSE  TRUE  TRUE
run.isOrdered(c(0.0, 0.0))
# [1] FALSE  TRUE  TRUE FALSE
run.isOrdered(c(0.0, 0.0))
# [1] FALSE  TRUE  TRUE FALSE

Therefore you get results like this:

isOrdered(c(0.0, 1.0, NA_real_, 0.0, 1.0))
# [1] TRUE
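The explanation above can be checked without reading the C code. Below is a rough pure-R emulation, assuming the C loop returns FALSE at the first adjacent pair that violates the requested order and TRUE otherwise. is_ordered_sketch is a hypothetical name, not an xts function, and replacing NA_integer_ with -2^31 is only a stand-in for how INT_MIN compares at the C level.

```r
# Hypothetical emulation of the C-level behavior described above.
# This is NOT the xts source; it only mimics the comparison semantics.
is_ordered_sketch <- function(x, increasing = TRUE, strictly = TRUE) {
  if (is.integer(x)) {
    # Assumption: in C, NA_integer_ is INT_MIN, so it compares as a very
    # large negative number. R integers cannot hold INT_MIN, so use a double.
    x <- ifelse(is.na(x), -2^31, as.double(x))
  }
  n <- length(x)
  if (n < 2L) return(TRUE)

  prev <- x[-n]   # x[i]
  nxt  <- x[-1L]  # x[i + 1]

  # A "violation" is an adjacent pair that breaks the requested order.
  violation <- if (increasing) {
    if (strictly) nxt <= prev else nxt < prev
  } else {
    if (strictly) nxt >= prev else nxt > prev
  }

  # In C, an ordering comparison involving NaN is false. R returns NA for the
  # same comparison, so drop NA here: "no violation detected".
  !any(violation, na.rm = TRUE)
}

is_ordered_sketch(c(0.0, 1.0, NA_real_, 0.0, 1.0))  # TRUE
is_ordered_sketch(c(0L, 1L, NA_integer_, 2L))       # FALSE
```

Under these assumptions the sketch gives the same answers as the cases reported at the top of the issue: NA_real_ hides the pairs it takes part in, while NA_integer_ in the middle always produces both a decrease and an increase.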

joshuaulrich · 2018-08-06T13:24:08Z

Note that zoo orders the index using its ORDER() function, where the default na.last = TRUE means that NAs are always at the end of the index for zoo objects, regardless of the atomic type of the index.
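The effect of na.last = TRUE can be seen with base R's order(), which I am assuming zoo's default ORDER() method delegates to. The example uses only base R, so it illustrates the ordering rule rather than zoo itself.

```r
# Base R only. Assumption: zoo's ORDER() default behaves like order() here.
xi <- c(2L, NA_integer_, 1L)
xd <- c(2.0, NA_real_, 1.0)

# With na.last = TRUE the NA is moved to the end for both atomic types.
xi[order(xi, na.last = TRUE)]  # 1 2 NA
xd[order(xd, na.last = TRUE)]  # 1 2 NA
```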

joshuaulrich mentioned this issue Apr 18, 2018

Illegal read in do_is_ordered #236

Closed
