
How to filter all rows between 2 patterns using dplyr in R

I would like to filter all rows between two patterns that follow a numerical order. For example, how could I filter all rows > 1st.7.1.* & < 1st.13.1.*?

Here is what the data frame looks like:

[image: screenshot of the data frame]

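We can use parse_number to get the numeric part and then filter. Because parse_number reads only the first number it finds in a string, the leading 1st. is stripped first:

library(dplyr)
library(stringr)
df1 %>%
    filter(between(readr::parse_number(str_remove(ball, "^1st\\.")), 7.1, 13.1))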

Another option is to extract the trailing number as a substring and filter on it:

library(stringr)
df1 %>% 
   filter(between(as.numeric(str_extract(ball, "\\d+(\\.\\d+)?$")), 7.1, 13.1))

Output:

# A tibble: 61 × 2
   ball    team       
   <chr>   <chr>      
 1 1st.7.1 New Zealand
 2 1st.7.2 New Zealand
 3 1st.7.3 New Zealand
 4 1st.7.4 New Zealand
 5 1st.7.5 New Zealand
 6 1st.7.6 New Zealand
 7 1st.7.7 New Zealand
 8 1st.7.8 New Zealand
 9 1st.7.9 New Zealand
10 1st.8   New Zealand
# … with 51 more rows

Data:

df1 <- tibble(ball = str_c('1st.', seq(0.1, 13.5, by = 0.1)), team = 'New Zealand')
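
Note that between() is inclusive at both ends, whereas the question asks for rows strictly greater than 1st.7.1 and strictly less than 1st.13.1. The difference can be sketched as follows; the data frame `balls` and its rows are made up for illustration and are not the asker's data:

```r
library(dplyr)
library(stringr)

# Hypothetical sample in the same label format (assumption, not the original data)
balls <- tibble(ball = c("1st.7.1", "1st.7.2", "1st.10.4", "1st.13.1", "1st.13.2"))

# Numeric part that follows the "1st." prefix
balls <- balls %>%
  mutate(num = as.numeric(str_extract(ball, "\\d+(\\.\\d+)?$")))

inclusive <- balls %>% filter(between(num, 7.1, 13.1))  # keeps both endpoints
strict    <- balls %>% filter(num > 7.1, num < 13.1)    # drops both endpoints
```

Here `inclusive` keeps 1st.7.1 and 1st.13.1, while `strict` leaves them out. Which one you want depends on whether the boundary rows belong in the result.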

You can extract the numerical part and subset on it:

library(dplyr)
library(stringr)
df %>%
  mutate(num = as.numeric(str_extract(ball, "(?<=st\\.).*"))) %>%
  filter(num > 7.1 & num < 13.1) %>%
  select(-num)
      ball
1  1st.7.9
2 1st.12.7

Data:

df <- data.frame(
  ball = c("1st.7.1","1st.7.9", "1st.12.7", "1st.13.1")
)
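
Reading the label as a decimal number assumes the part after the last dot is a single digit; a label such as 1st.7.10 would parse as 7.1. The question does not show such a row, so this is only a precaution. If it can occur in your data, one option is to compare the two components as integers. A minimal sketch, where the column names `over` and `ball_no` and the sample rows are assumptions:

```r
library(dplyr)
library(stringr)

# Hypothetical sample that includes a two-digit ball number
df <- data.frame(
  ball = c("1st.7.1", "1st.7.9", "1st.7.10", "1st.12.7", "1st.13.1"),
  stringsAsFactors = FALSE
)

# Capture the two trailing integer components: <over>.<ball number>
parts <- str_match(df$ball, "(\\d+)\\.(\\d+)$")

res <- df %>%
  mutate(over = as.integer(parts[, 2]),
         ball_no = as.integer(parts[, 3])) %>%
  # strictly after 7.1 and strictly before 13.1, compared component by component
  filter(over > 7 | (over == 7 & ball_no > 1),
         over < 13 | (over == 13 & ball_no < 1)) %>%
  select(ball)
```

This keeps 1st.7.9, 1st.7.10 and 1st.12.7. Labels without a second component (for example 1st.8) would yield NA here and be dropped by filter(), so the sketch only fits data where every label has both parts.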

We could remove the constant 1st. and use the numbers. Here I changed the range to show the effect on the provided data.

library(dplyr)
library(stringr)
df %>% 
  filter(between(as.numeric(str_remove(ball, "^1st\\.")), 0.1, 1.1))
     ball        team     batsman              bowler  nonStriker byes legbyes noballs
1 1st.0.1 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
2 1st.0.2 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
3 1st.0.3 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
4 1st.0.4 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
5 1st.0.5 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
6 1st.0.6 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
7 1st.1.1 New Zealand DJ Mitchell          Imad Wasim  MJ Guptill    0       0       0

df <- structure(list(ball = c("1st.0.1", "1st.0.2", "1st.0.3", "1st.0.4", 
"1st.0.5", "1st.0.6", "1st.1.1", "1st.1.2", "1st.1.3", "1st.1.4", 
"1st.1.5", "1st.1.6", "1st.2.1", "1st.2.2"), team = c("New Zealand", 
"New Zealand", "New Zealand", "New Zealand", "New Zealand", "New Zealand", 
"New Zealand", "New Zealand", "New Zealand", "New Zealand", "New Zealand", 
"New Zealand", "New Zealand", "New Zealand"), batsman = c("MJ Guptill", 
"MJ Guptill", "MJ Guptill", "MJ Guptill", "MJ Guptill", "MJ Guptill", 
"DJ Mitchell", "DJ Mitchell", "MJ Guptill", "MJ Guptill", "DJ Mitchell", 
"MJ Guptill", "DJ Mitchell", "DJ Mitchell"), bowler = c("Shaheen Shah Afridi", 
"Shaheen Shah Afridi", "Shaheen Shah Afridi", "Shaheen Shah Afridi", 
"Shaheen Shah Afridi", "Shaheen Shah Afridi", "Imad Wasim", "Imad Wasim", 
"Imad Wasim", "Imad Wasim", "Imad Wasim", "Imad Wasim", "Shaheen Shah Afridi", 
"Shaheen Shah Afridi"), nonStriker = c("DJ Mitchell", "DJ Mitchell", 
"DJ Mitchell", "DJ Mitchell", "DJ Mitchell", "DJ Mitchell", "MJ Guptill", 
"MJ Guptill", "DJ Mitchell", "DJ Mitchell", "MJ Guptill", "DJ Mitchell", 
"MJ Guptill", "MJ Guptill"), byes = c(0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), legbyes = c(0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), noballs = c(0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-14L))
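
If you would rather not load extra packages, the same prefix-stripping idea can be sketched in base R. The sample rows below are made up for illustration; only the label format follows the question:

```r
# Hypothetical sample with the same label format (assumption)
df <- data.frame(
  ball = c("1st.0.1", "1st.0.6", "1st.1.1", "1st.1.2", "1st.2.1"),
  stringsAsFactors = FALSE
)

# Drop the literal "1st." prefix; the dot is escaped so it matches only "."
num <- as.numeric(sub("^1st\\.", "", df$ball))

# Inclusive range, mirroring between(num, 0.1, 1.1)
res <- df[num >= 0.1 & num <= 1.1, , drop = FALSE]
```

`drop = FALSE` keeps the result a data frame even though this sample has a single column.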

Note: technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please credit this site's URL or the original source.