R for loop generating NAs after 1000 iterations

Question

I have a simple for loop that removes any row from my data frame in which two variables share a similar string. When I run the loop, it iterates 1000 times and then starts generating NAs, which breaks the loop.

expiration    quote_datetime
2021-02-26    2021-02-26 10:00:00
2021-02-26    2021-02-27 10:00:00

for(row in 1:nrow(df)){
  if(grepl(df$expiration[row], df$quote_datetime[row],fixed=TRUE) == TRUE){
    df = df[-row,]
  }
}

I'm getting the error message:

Error in if (grepl(df$expiration[row], df$quote_datetime[row], : missing value where TRUE/FALSE needed

Each time I run it, it eliminates a few more rows until there is nothing left to eliminate, and then it runs without error. I'd appreciate any help.

Answer 1

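The issue arises because the original data 'df' is subset whenever the if condition is TRUE, i.e. it has one row fewer after every TRUE case, while the loop still runs over the row numbers taken from the original nrow(df). Once row exceeds the number of rows that remain, df$expiration[row] is NA and the if condition fails. It can be resolved by testing the condition on the original data and removing rows from a copy: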

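df2 <- df
# Walk the rows from last to first, so that removing a row from df2
# does not shift the positions of the rows that are still to be checked.
for(row in rev(seq_len(nrow(df)))){
   if(grepl(df$expiration[row], df$quote_datetime[row], fixed=TRUE)){
     df2 <- df2[-row,]
   }
}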
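
To see where the NA comes from, here is a minimal sketch on a made-up four-row data frame (the values are illustrative and are not the asker's data). It assumes both columns are stored as character. Indexing a column past its last row returns NA, and grepl on NA returns NA, which is what the if condition rejects. The second half shows one possible variant of the fix: record the matches first, then subset once.

```r
# Made-up sample data; only the column names come from the question.
df <- data.frame(
  expiration     = c("2021-02-26", "2021-02-26", "2021-03-05", "2021-03-05"),
  quote_datetime = c("2021-02-26 10:00:00", "2021-02-27 10:00:00",
                     "2021-03-05 10:00:00", "2021-03-05 11:00:00"),
  stringsAsFactors = FALSE
)

# Suppose two rows have already been removed inside the loop.
shrunk <- df[-c(1, 3), ]

# The loop would still reach row 4, but only 2 rows remain.
past_end  <- shrunk$expiration[4]   # NA: the index is past the last row
na_result <- grepl(past_end, shrunk$quote_datetime[4], fixed = TRUE)  # NA

# Variant of the fix: record the matches first, then subset a single time.
matches <- logical(nrow(df))
for (row in seq_len(nrow(df))) {
  matches[row] <- grepl(df$expiration[row], df$quote_datetime[row], fixed = TRUE)
}
kept <- df[!matches, ]
```

Because df is never modified inside the loop, the row numbers stay valid for every iteration.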

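Also, grepl is vectorized only over 'x' and not over the pattern. To vectorize, the patterns can be pasted together into a single regular expression. Note that this checks each quote_datetime against all expiration values, not only the one in the same row, and that fixed = TRUE has to be dropped, because it would make the "|" separator a literal character:

# "|" must act as regex alternation here, so fixed = TRUE is not used
df <- df[!grepl(paste(unique(df$expiration), collapse="|"), 
              df$quote_datetime), ]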
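
If the comparison has to stay strictly row by row without an explicit loop, one base R option is mapply, which calls grepl once per pair of values. This is only a sketch on made-up data, again assuming character columns; the names same_day and result are illustrative.

```r
# Made-up sample data; only the column names come from the question.
df <- data.frame(
  expiration     = c("2021-02-26", "2021-02-26", "2021-03-05"),
  quote_datetime = c("2021-02-26 10:00:00", "2021-02-27 10:00:00",
                     "2021-02-26 09:30:00"),
  stringsAsFactors = FALSE
)

# One grepl() call per row: pattern and x are paired element by element.
same_day <- mapply(grepl, df$expiration, df$quote_datetime,
                   MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)

# Keep only the rows where quote_datetime does not contain its own expiration.
result <- df[!same_day, ]
```

The third row is kept here because its quote_datetime does not contain its own expiration. A single pattern built from all expiration values would drop it, since that quote_datetime contains the expiration of another row.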

Or use a function that is vectorized over both 'x' and the pattern, e.g. str_detect:

library(dplyr)
library(stringr)
# fixed() makes each expiration a literal pattern, compared row by row
df %>%
   filter(!str_detect(quote_datetime, fixed(expiration)))

Answer 1 (2021-03-14 19:05:43)
