Find consecutive values in a string in R

Question

I am trying to find 3 or more consecutive "a" within the last 10 characters of each string in my data frame. My data frame looks like this:

V1
aaashkjnlkdjfoin
jbfkjdnsnkjaaaas
djshbdkjaaabdfkj
jbdfkjaaajbfjna
ndjksnsjksdnakns
aaaandfjhsnsjna

I have written this code, but it only returns the number of consecutive "a" in the whole string. I want it to look only at the last 10 characters and then print the strings in which the consecutive "a" are found. The code I have written is:

out: [1] 3

I want my output to look like this:

jbfkjdnsnkjaaaas
djshbdkjaaabdfkj
jbdfkjaaajbfjna

Can anyone help?

Answer 1

Using regex, you could do:

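string <- c("aaashkjnlkdjfoin", "jbfkjdnsnkjaaaas", "djshbdkjaaabdfkj", 
            "jbdfkjaaajbfjna", "ndjksnsjksdnakns", "aaaandfjhsnsjna")

# (?=.{10}$) only succeeds at the position where exactly 10 characters remain;
# .*?a{3,} then looks for 3 or more consecutive "a" from that position onward
grep("(?=.{10}$).*?a{3,}", string, perl = TRUE, value = TRUE)
[1] "jbfkjdnsnkjaaaas" "djshbdkjaaabdfkj" "jbdfkjaaajbfjna"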

If you have a data frame and need to subset it:

subset(df, grepl("(?=.{10}$).*?a{3,}", V1, perl = TRUE))
                V1
2 jbfkjdnsnkjaaaas
3 djshbdkjaaabdfkj
4  jbdfkjaaajbfjna
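
For readers who find the lookahead hard to follow, the same filter can be sketched in two explicit steps: cut out the last 10 characters, then test that piece for a run of "a". This is only an illustration, not part of the original answer. The helper name `has_run_in_tail` and its parameters `char`, `k`, and `n` are hypothetical, and the data frame is rebuilt here from the strings shown in the question.

```r
# Example data frame rebuilt from the question (column name V1 as shown there)
df <- data.frame(
  V1 = c("aaashkjnlkdjfoin", "jbfkjdnsnkjaaaas", "djshbdkjaaabdfkj",
         "jbdfkjaaajbfjna", "ndjksnsjksdnakns", "aaaandfjhsnsjna"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from the answer): TRUE where the last `n`
# characters of x contain `k` or more consecutive copies of `char`.
# Assumes `char` is a single literal character with no special regex meaning.
has_run_in_tail <- function(x, char = "a", k = 3, n = 10) {
  len <- nchar(x)
  # Take the last n characters; strings shorter than n are kept whole
  tail_part <- substr(x, pmax(len - n + 1, 1), len)
  grepl(sprintf("%s{%d,}", char, k), tail_part)
}

# Keep only the matching rows; drop = FALSE preserves the data frame shape
result <- df[has_run_in_tail(df$V1), , drop = FALSE]
print(result)
```

One difference to be aware of: this sketch still checks strings shorter than 10 characters in full, whereas the lookahead pattern `(?=.{10}$)` cannot match a string with fewer than 10 characters at all. Which behaviour is wanted depends on the data.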
