Using a regular expression, how can I add elements after I find a match in R?
R code: How can I filter X amount of elements after a string match?
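I have a character vector containing multiple elements extracted from a PDF. I only want to include the 5 elements after a string match. So I have
c("Retail","Channel1","Discount","10/1/2019 20%","10/1/2020 20%","10/1/2021 20%",
"Fee", "Channel1", "10/1/2019 $5","10/1/2020 5%","10/1/2021 5%",
"Supply Chain", "Channel1","Discount", "10/1/2019 80%","10/1/2020 80%","10/1/2021 80%")
I want to detect "Retail" and then include everything up to the first "10/1/2021 20%".
Then I want to detect "Fee" and include everything up to "10/1/2021 5%", then "Supply Chain" and include everything up to "10/1/2021 80%".
"Retail", "Fee" and "Supply Chain" will always be the same, but the dates and percentages keep changing.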
Using tidyverse:
v1 <- c("Retail", "Channel1", "Discount", "10/1/2019 20%", "10/1/2020 20%",
"10/1/2021 20%", "Fee", "Channel1", "10/1/2019 $5", "10/1/2020 5%",
"10/1/2021 5%", "Supply Chain", "Channel1", "Discount", "10/1/2019 80%",
"10/1/2020 80%", "10/1/2021 80%")
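Here we use grepl and cumsum to create a grouping variable that increases by one at each string match. Then we keep at most the first 6 rows of each group: the matched keyword plus up to 5 elements after it.
library(tidyverse)
data.frame(v1) %>%
  # tag increases by 1 at every keyword, so each keyword starts a new group
  mutate(tag = cumsum(grepl("Retail|Fee|Supply Chain", v1))) %>%
  group_by(tag) %>%
  # slice_head() (dplyr >= 1.0.0) keeps rows by position within each group
  slice_head(n = 6)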
# A tibble: 17 x 2
# Groups:   tag [3]
   v1              tag
   <fct>         <int>
 1 Retail            1
 2 Channel1          1
 3 Discount          1
 4 10/1/2019 20%     1
 5 10/1/2020 20%     1
 6 10/1/2021 20%     1
 7 Fee               2
 8 Channel1          2
 9 10/1/2019 $5      2
10 10/1/2020 5%      2
11 10/1/2021 5%      2
12 Supply Chain      3
13 Channel1          3
14 Discount          3
15 10/1/2019 80%     3
16 10/1/2020 80%     3
17 10/1/2021 80%     3
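To see why the grepl and cumsum combination produces one group per keyword, it may help to look at the intermediate vectors. The sketch below uses base R only. The anchored pattern (`^...$`) and the names `is_key`, `tag`, `pos` and `kept` are assumptions made for illustration, not part of the answer above.

```r
v1 <- c("Retail", "Channel1", "Discount", "10/1/2019 20%", "10/1/2020 20%",
        "10/1/2021 20%", "Fee", "Channel1", "10/1/2019 $5", "10/1/2020 5%",
        "10/1/2021 5%", "Supply Chain", "Channel1", "Discount", "10/1/2019 80%",
        "10/1/2020 80%", "10/1/2021 80%")

# TRUE exactly at the keyword positions; anchoring avoids partial matches
is_key <- grepl("^(Retail|Fee|Supply Chain)$", v1)

# Running count of keywords seen so far: every element gets the id
# of the most recent keyword before (or at) its position
tag <- cumsum(is_key)

# Position of each element inside its group (1 = the keyword itself)
pos <- ave(seq_along(v1), tag, FUN = seq_along)

# Keep the keyword plus at most 5 elements after it
kept <- v1[pos <= 6]

print(tag)
print(kept)
```

Because the running count only changes at a keyword, the approach does not depend on how many date/percentage entries follow each keyword.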
Here is an option with base R:
lapply(tapply(v1, cumsum(v1 %in% c("Retail", "Fee", "Supply Chain")),
head, 6), tail, -1)
#$`1`
#[1] "Channel1" "Discount" "10/1/2019 20%" "10/1/2020 20%" "10/1/2021 20%"
#$`2`
#[1] "Channel1" "10/1/2019 $5" "10/1/2020 5%" "10/1/2021 5%"
#$`3`
#[1] "Channel1" "Discount" "10/1/2019 80%" "10/1/2020 80%" "10/1/2021 80%"
If the result also needs to include "Retail", "Fee" and "Supply Chain":
tapply(v1, cumsum(v1 %in% c("Retail", "Fee", "Supply Chain")), head, 6)
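If the number of elements to keep varies, the base R idea can be wrapped in a small helper. This is only a sketch: the function name `take_after` and its arguments `keys`, `n` and `include_key` are hypothetical, and it uses split() instead of tapply() so the groups can be named after their keyword.

```r
v1 <- c("Retail", "Channel1", "Discount", "10/1/2019 20%", "10/1/2020 20%",
        "10/1/2021 20%", "Fee", "Channel1", "10/1/2019 $5", "10/1/2020 5%",
        "10/1/2021 5%", "Supply Chain", "Channel1", "Discount", "10/1/2019 80%",
        "10/1/2020 80%", "10/1/2021 80%")

# Hypothetical helper: for each keyword in `keys`, return up to `n`
# elements that follow it, optionally including the keyword itself.
take_after <- function(x, keys, n = 5, include_key = TRUE) {
  grp <- cumsum(x %in% keys)
  # Elements before the first keyword fall in group 0 and are dropped
  keep <- grp > 0
  parts <- split(x[keep], grp[keep])
  # Name each group after its keyword (the first element of the group)
  names(parts) <- vapply(parts, `[`, character(1), 1, USE.NAMES = FALSE)
  out <- lapply(parts, head, n + 1)
  if (!include_key) out <- lapply(out, tail, -1)
  out
}

res <- take_after(v1, c("Retail", "Fee", "Supply Chain"), n = 2)
print(res$Fee)
```

With `n = 2`, each group is cut to the keyword and the two elements after it, so `res$Fee` holds "Fee", "Channel1" and "10/1/2019 $5".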