R: filtering a data.table with dplyr fails

Question

I am on Windows, using R 4.0.2, data.table 1.13.0, and dplyr 1.0.0.

This is such a weird bug that I can't make a reproducible example.

library(data.table)
library(dplyr)  # needed for %>% and filter() below
df2 = structure(list(total_amount = 9.39999961853027, tip_amount = 0, 
               total_amount = 9.39999961853027, passenger_count = 1L), row.names = c(NA, 
        -1L), class = c("data.table", "data.frame"))

# this works
df2[total_amount > 10, ] 

# this works
df2 %>% 
  data.frame %>%
  filter(total_amount > 10)

# this doesn't work!!!
df2 %>% 
  filter(total_amount > 10)

and gives the error: Error in .subset2(chunks, self$get_current_group()): attempt to select less than one element in integerOneIndex

This is so perplexing. What is going on?

Answer 1

The issue seems to be that two columns have the same name (total_amount appears twice), and dplyr errors when column names are duplicated.
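
The duplicate-name diagnosis can be checked directly. The following is a minimal sketch, assuming a plain data.frame built with check.names = FALSE as a stand-in for the data.table in the question; the variable names df, dup_names and fixed are illustrative only.

```r
# Build a one-row table with a duplicated column name, as in the question.
# check.names = FALSE stops data.frame() from repairing the names itself.
df <- data.frame(total_amount = 9.39999961853027, tip_amount = 0,
                 total_amount = 9.39999961853027, passenger_count = 1L,
                 check.names = FALSE)

# Which column names occur more than once?
dup_names <- unique(names(df)[duplicated(names(df))])
print(dup_names)

# Make the names unique; make.unique() appends ".1", ".2", ... to repeats.
fixed <- df
names(fixed) <- make.unique(names(fixed))
print(names(fixed))

# With unique names, dplyr::filter() has an unambiguous column to use.
# (Guarded so the sketch still runs if dplyr is not installed.)
if (requireNamespace("dplyr", quietly = TRUE)) {
  res <- dplyr::filter(fixed, total_amount > 10)
  print(nrow(res))
}
```

Renaming keeps both columns, which is the safer choice when it is not yet clear whether the repeated column holds the same values.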

Answer 2

The reason for this is that your data.table is badly designed: it has two columns called total_amount. How is dplyr supposed to know what to do when filtering in this case? It looks at your filter condition and then looks for total_amount in the table. It finds two columns with that name and rightly throws an error, because there is no way of knowing which column to use. dplyr is doing what it should be doing. Essentially, your data is not tidy, and dplyr expects tidy data.
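
If the repeated column is an exact copy of the first, as it is in the question's data, one way to remove the ambiguity is to drop the repeat. This is only a sketch, assuming a plain data.frame stand-in for the question's table; df, same and deduped are hypothetical names.

```r
# Stand-in for the question's table, with total_amount appearing twice.
df <- data.frame(total_amount = 9.39999961853027, tip_amount = 0,
                 total_amount = 9.39999961853027, passenger_count = 1L,
                 check.names = FALSE)

# Confirm the two same-named columns (positions 1 and 3) hold identical
# values before discarding one of them.
same <- identical(df[[1]], df[[3]])
print(same)

# Keep only the first occurrence of each column name.
deduped <- df[, !duplicated(names(df)), drop = FALSE]
print(names(deduped))
```

If the two columns differed, dropping one would silently lose data, so renaming would be the better option in that case.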
