
R data.table NA type consistency

library(data.table)
dt = data.table(x = c(1,1,2,2,2,2,3,3,3,3))
dt[, y := if(.N > 2) .N else NA, by = x]          # fails: NA is logical, .N is integer
dt[, y := if(.N > 2) .N else NA_integer_, by = x] # works

The first grouping fails because NA has a type, and that type is not integer. Is there a way to tell data.table to ignore that and coerce all NAs to whatever type keeps the column consistent?
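For reference, a minimal base R sketch of the typed NA constants involved. The variable name `na_types` is only for illustration:

```r
# A bare NA is logical; the other atomic types have their own NA constants.
na_types <- sapply(list(NA, NA_integer_, NA_real_, NA_character_), typeof)
print(na_types)

# Within a single if/else only one branch is returned, so no coercion happens:
# the result is logical when the NA branch is taken, integer otherwise.
typeof(if (FALSE) 4L else NA)
typeof(if (TRUE) 4L else NA)
```

This is why one group can come back as logical while the next comes back as integer.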

I can manually set NA_integer_ here, but if I have lots of columns of different types, it's hard to get the NA type right for every one of them.

BTW, what NA type should I use for Date/IDate/ITime?

OP's first question: Is there a way to tell data.table to ignore that and coerce all NAs to whatever type keeps the column consistent?

No. You'll see a similar error without the assignment:

dt[, if(.N > 2) .N else NA, by = x]
#  Error in `[.data.table`(dt, , if (.N > 2) .N else NA, by = x) : 
# Column 1 of result for group 2 is type 'integer' but expecting type 'logical'. Column types must be consistent for each group.

In my opinion, this "Column types must be consistent for each group." message should be shown for your case as well.
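One way to sidestep the typed-NA problem, offered as a sketch rather than as part of the original answer: compute the value for every group first, then blank out the unwanted rows in a second step. This relies on the assumption that assigning a bare NA into an existing column with `:=` is coerced to that column's type, which is the usual data.table behaviour for a length-one NA.

```r
library(data.table)

dt <- data.table(x = c(1, 1, 2, 2, 2, 2, 3, 3, 3, 3))

# Step 1: every group returns an integer, so the column type is consistent.
dt[, y := .N, by = x]

# Step 2: overwrite the small groups; the bare NA takes the column's type.
dt[y <= 2L, y := NA]

print(dt)
```

The column type is fixed by the first step, so the NA never has to be spelled out per type.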


OP's second question: BTW, what NA type should I use for Date/IDate/ITime?

For IDate et al, I always subset by NA_integer_, which seems to give a length-one NA slice, e.g., as.IDate(Sys.Date())[NA_integer_]. I don't know if that's what one should do, but I don't know of a better idea. An illustration:

z = IDateTime(factor(Sys.time()))
#         idate    itime
# 1: 2016-08-01 16:05:25

str( lapply(z, function(x) x[NA_integer_]) )
# List of 2
#  $ idate: IDate[1:1], format: NA
#  $ itime:Class 'ITime'  int NA
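If there are many columns of different types, the subsetting idiom above can be wrapped in a small helper. This is only a sketch: `na_like` is a hypothetical name, not part of data.table or base R, and the column names `d` and `first_d` are made up for the example.

```r
library(data.table)

# Hypothetical helper: a length-one NA taken from x itself,
# so it carries whatever type x has.
na_like <- function(x) x[NA_integer_]

typeof(na_like(1:3))        # integer
typeof(na_like(c("a", "b"))) # character
class(na_like(Sys.Date()))  # Date

# Used inside a grouped assignment, both branches now come from the same column.
dt <- data.table(x = c(1, 1, 2, 2, 2, 2),
                 d = as.IDate("2016-08-01") + 0:5)
dt[, first_d := if (.N > 2) d[1L] else na_like(d), by = x]
print(dt)
```

The NA is derived from the column rather than typed by hand, so there is no NA constant to look up per type.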


 