In R, using the data.table package, how do I subset on a column whose name contains a space?
I am reading a text file with the data.table package.
library(data.table)
df <- fread("df.txt")
head(df)
Number  Region  Type Car  ...
1       1       1
2       1       2
3       1       1
4       1       1
5       2       2
6       2       3
I would like to subset df to the rows where Type Car equals 1 or 3. When I write something like this:
>class(df)
"data.table" "data.frame"
>subset(df, Type Car %in% c(1,3))
This does not work. Is there a solution?
You've got a data.table from fread() (unless you used data.table = FALSE), so you can use data.table row subsetting instead of subset(). Since you have a multi-word column name, you will need to wrap it in back-ticks.
df[`Type Car` %in% c(1, 3)]
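As a self-contained sketch of the line above: the table below is a hypothetical stand-in for the contents of df.txt, built by hand from the head(df) output in the question, so the values and column types are assumptions.

```r
library(data.table)

# Hypothetical stand-in for the table read from df.txt;
# the values mirror the head(df) output shown in the question.
df <- data.table(
  Number = 1:6,
  Region = c(1, 1, 1, 1, 2, 2),
  `Type Car` = c(1, 2, 1, 1, 2, 3)
)

# Back-ticks make R parse the two-word name as a single symbol.
res <- df[`Type Car` %in% c(1, 3)]
print(res)
```

Without the back-ticks, R's parser sees two separate symbols, Type and Car, and raises a syntax error before data.table ever gets to evaluate the expression.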
The same goes for subset() if you choose to use it. In fact, back-ticks are always necessary when you refer to a name containing spaces as a bare symbol in R code. It would be better to use syntactically valid R names. You can reset the names with
setnames(df, make.names(names(df), unique = TRUE))
so you can avoid the back-ticks. Then you could do
df[Type.Car %in% c(1, 3)]
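The renaming step can be sketched end to end as follows. The table is again a hypothetical stand-in for df.txt, and the variable names old_names, new_names, by_dt and by_subset are introduced only for illustration.

```r
library(data.table)

# Hypothetical stand-in for the table read from df.txt.
df <- data.table(
  Number = 1:6,
  Region = c(1, 1, 1, 1, 2, 2),
  `Type Car` = c(1, 2, 1, 1, 2, 3)
)

old_names <- names(df)
# make.names() replaces characters that are invalid in R names with ".",
# so "Type Car" becomes "Type.Car"; unique = TRUE guards against duplicates.
new_names <- make.names(old_names, unique = TRUE)

# setnames() renames the columns by reference, without copying the table.
setnames(df, new_names)

# With valid names, neither form needs back-ticks.
by_dt <- df[Type.Car %in% c(1, 3)]
by_subset <- subset(df, Type.Car %in% c(1, 3))
print(names(df))
```

Both calls should select the same rows; the bracket form is simply the more idiomatic data.table spelling.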
Note: As of data.table version 1.9.6, you can name the columns in fread() with the col.names argument. As Michael Chirico has mentioned, it's best to get this problem out of the way immediately.
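A minimal sketch of naming the columns at read time is shown below. The inline lines stand in for df.txt and are an assumption, as are the comma separator and the replacement name TypeCar. It also assumes a data.table release recent enough to accept fread(text = ...), which is newer than 1.9.6; with a real file you would pass the file name instead.

```r
library(data.table)

# Hypothetical file contents; the header holds the two-word name "Type Car".
raw_lines <- c(
  "Number,Region,Type Car",
  "1,1,1",
  "2,1,2",
  "3,1,1",
  "4,1,1",
  "5,2,2",
  "6,2,3"
)

# col.names replaces the header names as the data is read,
# so the space never reaches the resulting table.
df <- fread(text = raw_lines, sep = ",",
            col.names = c("Number", "Region", "TypeCar"))

res <- df[TypeCar %in% c(1, 3)]
print(res)
```

Renaming at read time means no later code ever sees the awkward name, which is the point of getting the problem out of the way immediately.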