How to filter all rows between 2 patterns using dplyr in R

Question

I would like to filter all rows between two patterns that follow a numerical order. For example, how could I filter all rows > 1st.7.1.* & < 1st.13.1.*?

Here is what the dataframe looks like:

Answer 1

We can use parse_number to get the numeric part and then apply the filter:

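library(dplyr)
df1 %>%
    # parse_number() reads only the first number in a string (the 1 in "1st"),
    # so drop the "1st." prefix before parsing
    filter(between(readr::parse_number(sub("^1st\\.", "", ball)), 7.1, 13.1))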

Another option is to extract the numeric substring and filter on it:

library(stringr)
df1 %>% 
   filter(between(as.numeric(str_extract(ball, "\\d+(\\.\\d+)?$")), 7.1, 13.1))

Output:

# A tibble: 61 × 2
   ball    team       
   <chr>   <chr>      
 1 1st.7.1 New Zealand
 2 1st.7.2 New Zealand
 3 1st.7.3 New Zealand
 4 1st.7.4 New Zealand
 5 1st.7.5 New Zealand
 6 1st.7.6 New Zealand
 7 1st.7.7 New Zealand
 8 1st.7.8 New Zealand
 9 1st.7.9 New Zealand
10 1st.8   New Zealand
# … with 51 more rows

Data:

df1 <- tibble(ball = str_c('1st.', seq(0.1, 13.5, by = 0.1)), team = 'New Zealand')
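
Note that between() includes both endpoints, whereas the question asks for rows strictly greater than 1st.7.1 and strictly less than 1st.13.1. The following is a minimal sketch of the strict version, assuming the same df1 as above; the helper column num and the result name strict are only illustrative.

```r
library(dplyr)
library(stringr)

# Same example data as above
df1 <- tibble(ball = str_c('1st.', seq(0.1, 13.5, by = 0.1)), team = 'New Zealand')

strict <- df1 %>%
  # Pull out the trailing number, e.g. "1st.7.1" -> 7.1
  mutate(num = as.numeric(str_extract(ball, "\\d+(\\.\\d+)?$"))) %>%
  # Strict inequalities drop the two boundary rows that between() would keep
  filter(num > 7.1, num < 13.1) %>%
  select(-num)
```

With this version the rows 1st.7.1 and 1st.13.1 themselves are excluded.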

Answer 2

You can extract the numerical part and subset on it:

library(dplyr)
library(stringr)
df %>%
  # Everything after "st." is the numeric part, e.g. "1st.7.9" -> 7.9
  mutate(num = as.numeric(str_extract(ball, "(?<=st\\.).*"))) %>%
  filter(num > 7.1 & num < 13.1) %>%
  select(-num)

Output:

      ball
1  1st.7.9
2 1st.12.7

Data:

df <- data.frame(
  ball = c("1st.7.1","1st.7.9", "1st.12.7", "1st.13.1")
)
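
The approaches so far read the part after the prefix as a decimal number. If it is really an over and a ball within that over (an assumption about this data, not something the question states), a label such as 1st.7.10 would be read as 7.1 and sorted incorrectly. A hedged sketch that compares the two parts as integers instead; the extra row 1st.7.10 and the columns over and n are hypothetical and only there to show the difference.

```r
library(dplyr)
library(stringr)

# Example data with one hypothetical extra row, "1st.7.10"
df <- data.frame(
  ball = c("1st.7.1", "1st.7.9", "1st.7.10", "1st.12.7", "1st.13.1"),
  stringsAsFactors = FALSE
)

res <- df %>%
  mutate(
    # Capture the last two dot-separated integers: over and ball in the over
    over = as.integer(str_match(ball, "\\.(\\d+)\\.(\\d+)$")[, 2]),
    n    = as.integer(str_match(ball, "\\.(\\d+)\\.(\\d+)$")[, 3])
  ) %>%
  filter(
    over > 7  | (over == 7  & n > 1),   # strictly after 7.1
    over < 13 | (over == 13 & n < 1)    # strictly before 13.1
  ) %>%
  select(-over, -n)
```

Labels without a ball part do not match the pattern, produce NA, and are dropped by filter(), so this sketch only suits data where every label has both parts.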

Answer 3

We could remove the constant 1st. and use the numbers. Here I changed the range to show the effect on the provided data.

library(dplyr)
library(stringr)
df %>% 
  # Escape the dot and anchor the pattern so that only the literal "1st." prefix is removed
  filter(between(as.numeric(str_remove(ball, "^1st\\.")), 0.1, 1.1))

     ball        team     batsman              bowler  nonStriker byes legbyes noballs
1 1st.0.1 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
2 1st.0.2 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
3 1st.0.3 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
4 1st.0.4 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
5 1st.0.5 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
6 1st.0.6 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
7 1st.1.1 New Zealand DJ Mitchell          Imad Wasim  MJ Guptill    0       0       0

Data:

df <- structure(list(ball = c("1st.0.1", "1st.0.2", "1st.0.3", "1st.0.4", 
"1st.0.5", "1st.0.6", "1st.1.1", "1st.1.2", "1st.1.3", "1st.1.4", 
"1st.1.5", "1st.1.6", "1st.2.1", "1st.2.2"), team = c("New Zealand", 
"New Zealand", "New Zealand", "New Zealand", "New Zealand", "New Zealand", 
"New Zealand", "New Zealand", "New Zealand", "New Zealand", "New Zealand", 
"New Zealand", "New Zealand", "New Zealand"), batsman = c("MJ Guptill", 
"MJ Guptill", "MJ Guptill", "MJ Guptill", "MJ Guptill", "MJ Guptill", 
"DJ Mitchell", "DJ Mitchell", "MJ Guptill", "MJ Guptill", "DJ Mitchell", 
"MJ Guptill", "DJ Mitchell", "DJ Mitchell"), bowler = c("Shaheen Shah Afridi", 
"Shaheen Shah Afridi", "Shaheen Shah Afridi", "Shaheen Shah Afridi", 
"Shaheen Shah Afridi", "Shaheen Shah Afridi", "Imad Wasim", "Imad Wasim", 
"Imad Wasim", "Imad Wasim", "Imad Wasim", "Imad Wasim", "Shaheen Shah Afridi", 
"Shaheen Shah Afrid"), nonStriker = c("DJ Mitchell", "DJ Mitchell", 
"DJ Mitchell", "DJ Mitchell", "DJ Mitchell", "DJ Mitchell", "MJ Guptill", 
"MJ Guptill", "DJ Mitchell", "DJ Mitchell", "MJ Guptill", "DJ Mitchell", 
"MJ Guptill", "MJ Guptill"), byes = c(0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), legbyes = c(0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), noballs = c(0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-14L))
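
Since the question gives the bounds as labels (1st.7.1 and 1st.13.1) rather than as numbers, the same idea can be wrapped in a small function that parses the bounds the same way as the column. This is only a sketch: ball_to_num and filter_between_balls are hypothetical helper names, and the five-row data frame is a shortened stand-in for the data above.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: turn a label such as "1st.7.1" into the number 7.1
ball_to_num <- function(x) as.numeric(str_remove(x, "^1st\\."))

# Hypothetical helper: keep rows strictly between two ball labels
filter_between_balls <- function(data, from, to) {
  data %>%
    filter(ball_to_num(ball) > ball_to_num(from),
           ball_to_num(ball) < ball_to_num(to))
}

# Shortened stand-in for the data above
df <- data.frame(
  ball = c("1st.0.1", "1st.0.6", "1st.1.1", "1st.1.6", "1st.2.2"),
  stringsAsFactors = FALSE
)

res <- filter_between_balls(df, "1st.0.1", "1st.1.6")
```

The function assumes the data has a column named ball and no columns named from or to, which would otherwise shadow the arguments inside filter().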
