
In R, using the data.table package, how to subset on a column whose name contains a space

I am reading a txt file with the data.table package.

library(data.table)
df <- fread("df.txt")
head(df)
Number Region Type Car ...
     1       1       1
     2       1       2
     3       1       1  
     4       1       1
     5       2       2
     6       2       3

I would like to take a subset of df where Type Car is equal to 1 or 3. When I write something like this

>class(df)
"data.table" "data.frame"
>subset(df, Type Car %in% c(1,3))

This does not work. Is there a solution?

You've got a data table from fread() (unless you used data.table = FALSE), so you can use data.table row subsetting instead of subset(). Since the column name contains a space, you will need to wrap it in back-ticks.

df[`Type Car` %in% c(1, 3)]
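As a self-contained sketch of the above, the table below is a hypothetical stand-in for the contents of df.txt, built from the rows shown in the question; the variable names res and res2 are illustrative only.

```r
library(data.table)

# Hypothetical stand-in for the data read from df.txt
df <- data.table(
  Number = 1:6,
  Region = c(1, 1, 1, 1, 2, 2),
  `Type Car` = c(1, 2, 1, 1, 2, 3)
)

# Back-ticks let the spaced name be used as a bare symbol inside [ ]
res <- df[`Type Car` %in% c(1, 3)]
print(res)

# Alternative that refers to the column by a character string instead
res2 <- df[df[["Type Car"]] %in% c(1, 3)]
```

Both forms should select the same rows; the second avoids back-ticks by looking the column up by its string name.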

The same goes for subset() if you choose to use it. In fact, back-ticks are necessary whenever you reference a name containing spaces as a bare symbol. It would be better to use syntactically valid R names. You can reset the names with

setnames(df, make.names(names(df), unique = TRUE))

so you can avoid the back-ticks. Then you could do

df[Type.Car %in% c(1, 3)]
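Put together, the renaming approach might look like the following sketch. The small table is a hypothetical example, not the asker's file, and new_names and res are illustrative names.

```r
library(data.table)

# Hypothetical table with a column name containing a space
df <- data.table(Number = 1:3, `Type Car` = c(1, 2, 3))

# make.names() converts "Type Car" into the syntactically valid "Type.Car"
new_names <- make.names(names(df), unique = TRUE)

# setnames() renames by reference, so df does not need to be reassigned
setnames(df, new_names)

# The column can now be referenced without back-ticks
res <- df[Type.Car %in% c(1, 3)]
```

Because setnames() modifies df in place, any later code should use the new name Type.Car rather than the original.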

Note: As of data.table version 1.9.6, you can name the columns directly in fread() with the col.names argument. As Michael Chirico has mentioned, it's best to get this problem out of the way immediately.
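A rough sketch of naming the columns at read time follows. It assumes a comma-separated input and a data.table version recent enough to support the text argument of fread(); the inline string raw and the name TypeCar are hypothetical, chosen only to keep the example self-contained.

```r
library(data.table)

# Hypothetical file contents, passed inline so the example needs no file on disk
raw <- "Number,Region,Type Car\n1,1,1\n2,1,2\n6,2,3\n"

# col.names replaces the header names as the data is read
df <- fread(text = raw, col.names = c("Number", "Region", "TypeCar"))

# No back-ticks are needed, since the spaced name never enters the table
res <- df[TypeCar %in% c(1, 3)]
```

With a real file, the first argument would be the file path (for example "df.txt") instead of text = raw.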
