For loops using grepl
I am trying to use a for loop to run through a data frame, check whether an observation contains a certain string in a column (i.e., it should contain "no law" in the column Content), and generate values in a different column based on the outcome.
If it does contain the string, which is identified by the grepl function returning TRUE, then the observation should have 'Permissive' in the Effectrp column; otherwise, it should say 'Restrictive'.
I'm not quite sure what I'm doing wrong... Any help would be appreciated!
for (i in 1:nrow(ldb)){
if (grepl('no law', ldb$Content[i], ignore.case = TRUE)) == TRUE {
ldb$Effectrp[i] = 'Permissive'
} else {
lab$EffectTR[i] = 'Restrictive'
}
}
You shouldn't even need a for loop for this, as grepl will return a vector if applied to a vector. You could try something like
ldb$Effectrp <- 'Restrictive'
ldb$Effectrp[grepl('no law', ldb$Content, ignore.case = TRUE)] <- 'Permissive'
(and, as mentioned in the previous answer, be careful about the typos in your data frame and column names.)
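To see why the loop is unnecessary, here is a minimal sketch on a made-up data frame. The three Content values are invented for illustration and are not taken from the question's data; only the names ldb, Content and Effectrp come from the question. grepl checks every element at once and returns a logical vector, which can then be used as an index:

```r
# Hypothetical sample data (assumption: Content is a character column)
ldb <- data.frame(Content = c("There is no law on this",
                              "A law applies",
                              "No Law exists"),
                  stringsAsFactors = FALSE)

# grepl is vectorised: it returns one TRUE/FALSE per row
hits <- grepl("no law", ldb$Content, ignore.case = TRUE)

# Default every row to 'Restrictive', then overwrite the matching rows
ldb$Effectrp <- "Restrictive"
ldb$Effectrp[hits] <- "Permissive"

print(hits)
print(ldb)
```

Setting the default first and overwriting the matches afterwards means every row ends up with a value, without an explicit else branch.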
Using base R:
ldb$EffectRP <- sapply(ldb$Content,
function(x) if (grepl("no law", x, ignore.case = TRUE)) {"Permissive"} else {"Restrictive"} )
Using dplyr and stringr:
ldb %>%
mutate(EffectRP2 = ifelse(str_detect(Content, "no law"), "Permissive", "Restrictive"))
Those options return:
Content EffectRP EffectRP2
1 law Restrictive Restrictive
2 no law Permissive Permissive
3 law Restrictive Restrictive
4 no law Permissive Permissive
5 law Restrictive Restrictive
6 law Restrictive Restrictive
7 no law Permissive Permissive
8 no law Permissive Permissive
9 no law Permissive Permissive
10 no law Permissive Permissive
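As a sketch of a third variant, base R's ifelse can do the same job without sapply or any extra packages. The Content values below are copied from the output above; the column name EffectRP3 is made up for this example:

```r
# Content values copied from the output table above
ldb <- data.frame(Content = c("law", "no law", "law", "no law", "law",
                              "law", "no law", "no law", "no law", "no law"),
                  stringsAsFactors = FALSE)

# ifelse() is vectorised: it picks "Permissive" where grepl is TRUE,
# "Restrictive" everywhere else
ldb$EffectRP3 <- ifelse(grepl("no law", ldb$Content, ignore.case = TRUE),
                        "Permissive", "Restrictive")

print(ldb)
```

One difference worth keeping in mind: str_detect(Content, "no law") is case-sensitive by default, whereas the grepl calls pass ignore.case = TRUE. With stringr, wrapping the pattern as regex("no law", ignore_case = TRUE) should give the same case-insensitive matching.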
Similar to the answer I wrote to this question.
The only potential problem I can see is typos in the else part. You wrote lab$EffectTR when earlier your dataframe was named ldb and the column was named Effectrp. Not sure if this is intentional.
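For reference, a corrected version of the loop might look like the sketch below. It assumes the intended names are ldb and Effectrp throughout, and it uses a two-row made-up data frame in place of the real data. Note that the condition also has to sit entirely inside the parentheses of if; in the question, == TRUE falls outside them, which R will not parse.

```r
# Hypothetical sample data standing in for the real ldb
ldb <- data.frame(Content = c("no law applies", "law in force"),
                  stringsAsFactors = FALSE)
ldb$Effectrp <- NA_character_  # create the target column up front

for (i in seq_len(nrow(ldb))) {
  # the whole condition sits inside one pair of parentheses
  if (grepl("no law", ldb$Content[i], ignore.case = TRUE)) {
    ldb$Effectrp[i] <- "Permissive"
  } else {
    # same data frame and column as in the if branch
    ldb$Effectrp[i] <- "Restrictive"
  }
}

print(ldb)
```

seq_len(nrow(ldb)) is used instead of 1:nrow(ldb) so that the loop body is skipped cleanly if the data frame happens to have zero rows.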
Clarification of the redundancy thing:
You don't need the == TRUE in your if statement. I think of it like this. Right now you have:
if (grepl(check if my pattern is found in Content) == TRUE) {
do something
}
grepl returns TRUE/FALSE, so let's say "no law" is found in Content; then grepl evaluates to TRUE, producing:
if (TRUE == TRUE) {
do something
}
If we continue evaluating the parentheses, we know that indeed TRUE == TRUE, so this reduces to:
if (TRUE) {
do something
}
This is what we want. However, the extra check TRUE == TRUE is unnecessary when you could just use the output from grepl like so:
if (grepl(check if my pattern is found in Content)) {
do something
}
This will evaluate to:
if (TRUE) {
do something
}
This is the same thing as before, but you're skipping the redundant == TRUE step. That is, the if statement will run because the thing inside the parentheses is TRUE in a quite literal sense.
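The reduction can be checked directly in R. The snippet below is only an illustration with a made-up input string; the variable names are hypothetical:

```r
# Hypothetical one-element example
found <- grepl("no law", "There is no law here", ignore.case = TRUE)

# Comparing a logical with TRUE gives back the same logical
same <- identical(found == TRUE, found)

# Both forms of the if statement take the same branch
with_check    <- if (found == TRUE) "Permissive" else "Restrictive"
without_check <- if (found) "Permissive" else "Restrictive"

print(same)
print(c(with_check, without_check))
```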
Hope that makes more sense. It was confusing to me when I first learned it as well.