How to overwrite a factor in R
I have a dataset:
> k
EVTYPE FATALITIES INJURIES
198704 HEAT 583 0
862634 WIND 158 1150
68670 WIND 116 785
148852 WIND 114 597
355128 HEAT 99 0
67884 WIND 90 1228
46309 WIND 75 270
371112 HEAT 74 135
230927 HEAT 67 0
78567 WIND 57 504
As per the first answer by joran, unused levels can be dropped with droplevels, so don't worry about the 898 levels. The illustrative k I'm showing is the complete dataset, obtained from k <- d1[1:10, 3:4], where d1 is the original dataset.
The variables are as follows:
> str(k)
'data.frame': 10 obs. of 3 variables:
$ EVTYPE : Factor w/ 898 levels " HIGH SURF ADVISORY",..: 243 NA NA NA 243 NA NA 243 243 NA
$ FATALITIES: num 583 158 116 114 99 90 75 74 67 57
$ INJURIES : num 0 1150 785 597 0 ...
I'm trying to overwrite the WIND factor:
> k[k$EVTYPE==factor("WIND"), ]$EVTYPE <- factor("AFDAF")
> k[k$EVTYPE=="WIND", ]$EVTYPE <- factor("AFDAF")
But both commands give me error messages: "level sets of factors are different" or "invalid factor level, NA generated".
How should I do this?
Try this instead:
k <- droplevels(d1[1:10, 3:5])
Factors (as per the documentation) are simply a vector of integer codes plus a vector of labels, one for each code. These labels are called the "levels". The levels are an attribute, and they persist with your data even when subsetting.
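A small sketch of what that means in practice. The vector f here is made up for illustration and is not taken from the original d1:

```r
# Hypothetical factor with three levels; not from the original data
f <- factor(c("HEAT", "WIND", "HAIL", "WIND"))

levels(f)      # the labels, sorted: "HAIL" "HEAT" "WIND"
as.integer(f)  # the integer codes: 2 3 1 3

# Subsetting keeps every level, even ones no longer present
g <- f[f != "HAIL"]
levels(g)              # still "HAIL" "HEAT" "WIND"

# droplevels() discards the unused ones
levels(droplevels(g))  # "HEAT" "WIND"
```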
This is a feature, since for many statistical procedures it is vital to keep track of all the possible values that a variable could have, even if they don't appear in the actual data.
Some people find this irritating and run R using options(stringsAsFactors = FALSE).
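For comparison, here is a rough sketch of the character-column route. It uses made-up data and passes stringsAsFactors = FALSE to data.frame() directly rather than setting the global option, which is my substitution and not what the sentence above describes:

```r
# Illustrative data; stringsAsFactors = FALSE keeps EVTYPE as plain character
d2 <- data.frame(EVTYPE = c("HEAT", "WIND", "WIND"),
                 FATALITIES = c(583, 158, 116),
                 stringsAsFactors = FALSE)

# With character data there are no levels to manage: just overwrite the strings
d2$EVTYPE[d2$EVTYPE == "WIND"] <- "AFDAF"
d2$EVTYPE  # "HEAT" "AFDAF" "AFDAF"
```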
To simply change the levels, you can do something like this:
d <- read.table(text = " EVTYPE FATALITIES INJURIES
198704 HEAT 583 0
862634 WIND 158 1150
68670 WIND 116 785
148852 WIND 114 597
355128 HEAT 99 0
67884 WIND 90 1228
46309 WIND 75 270
371112 HEAT 74 135
230927 HEAT 67 0
78567 WIND 57 504",header = TRUE,sep = "",stringsAsFactors = TRUE)
> str(d)
'data.frame': 10 obs. of 3 variables:
$ EVTYPE : Factor w/ 2 levels "HEAT","WIND": 1 2 2 2 1 2 2 1 1 2
$ FATALITIES: int 583 158 116 114 99 90 75 74 67 57
$ INJURIES : int 0 1150 785 597 0 1228 270 135 0 504
> levels(d$EVTYPE) <- c('A','B')
> str(d)
'data.frame': 10 obs. of 3 variables:
$ EVTYPE : Factor w/ 2 levels "A","B": 1 2 2 2 1 2 2 1 1 2
$ FATALITIES: int 583 158 116 114 99 90 75 74 67 57
$ INJURIES : int 0 1150 785 597 0 1228 270 135 0 504
Or to just change one:
levels(d$EVTYPE)[2] <- 'C'
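If you'd rather not rely on the position of the level, one possible variation is to select it by name. The sketch below uses small made-up objects (d3 and e) instead of the data above. It also shows that a new level has to be added before it can be assigned to individual elements, which is what the "invalid factor level, NA generated" message is about:

```r
# Illustrative data mirroring the EVTYPE column from the question
d3 <- data.frame(EVTYPE = factor(c("HEAT", "WIND", "WIND", "HEAT")))

# Rename the level by name instead of by position
levels(d3$EVTYPE)[levels(d3$EVTYPE) == "WIND"] <- "AFDAF"
levels(d3$EVTYPE)        # "HEAT" "AFDAF"
as.character(d3$EVTYPE)  # "HEAT" "AFDAF" "AFDAF" "HEAT"

# To change only some elements, the new level must exist before assigning
e <- factor(c("HEAT", "WIND", "WIND"))
levels(e) <- c(levels(e), "AFDAF")
e[2] <- "AFDAF"
as.character(e)  # "HEAT" "AFDAF" "WIND"
```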