简体   繁体   中英

How to filter all rows between 2 patterns using dplyr in R

I would like to filter all rows between 2 patterns which follow a numerical order. For eg how could I filter all rows > 1st.7.1.* & < 1st.13.1.*

Here is how the dataframe looks like

在此处输入图片说明

We may use parse_number to get the numeric part and then do the filter

library(dplyr)
df1 %>%
    filter(between(readr::parse_number(ball), 7.1, 13.1))

Or another option is to extract the substring and filter

library(stringr)
df1 %>% 
   filter(between(as.numeric(str_extract(ball, "\\d+(\\.\\d+)?$")), 7.1, 13.1))

-output

# A tibble: 61 × 2
   ball    team       
   <chr>   <chr>      
 1 1st.7.1 New Zealand
 2 1st.7.2 New Zealand
 3 1st.7.3 New Zealand
 4 1st.7.4 New Zealand
 5 1st.7.5 New Zealand
 6 1st.7.6 New Zealand
 7 1st.7.7 New Zealand
 8 1st.7.8 New Zealand
 9 1st.7.9 New Zealand
10 1st.8   New Zealand
# … with 51 more rows

data

df1 <- tibble(ball = str_c('1st.', seq(0.1, 13.5, by = 0.1)), team = 'New Zealand')

You can extract the numerical part and subset on this:

library(stringr)
df %>%
  mutate(num = as.numeric(str_extract(ball, "(?<=st\\.).*"))) %>%
  filter(num > 7.1 & num < 13.1) %>%
  select(-num)
     ball
1 1st.10.9
2 1st.12.7

Data:

df <- data.frame(
  ball = c("1st.7.1","1st.7.9", "1st.12.7", "1st.13.1")
)

We could remove the constant 1st. and use the numbers. Here I changed the range to show the effect on the the provided data.

library(dplyr)
library(stringr)
df %>% 
  filter(between(as.numeric(stringr::str_remove(ball, "1st.")), 0.1, 1.1))
     ball        team     batsman              bowler  nonStriker byes legbyes noballs
1 1st.0.1 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
2 1st.0.2 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
3 1st.0.3 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
4 1st.0.4 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
5 1st.0.5 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
6 1st.0.6 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
7 1st.1.1 New Zealand DJ Mitchell          Imad Wasim  MJ Guptill    0       0       0

structure(list(ball = c("1st.0.1", "1st.0.2", "1st.0.3", "1st.0.4", 
"1st.0.5", "1st.0.6", "1st.1.1", "1st.1.2", "1st.1.3", "1st.1.4", 
"1st.1.5", "1st.1.6", "1st.2.1", "1st.2.2"), team = c("New Zealand", 
"New Zealand", "New Zealand", "New Zealand", "New Zealand", "New Zealand", 
"New Zealand", "New Zealand", "New Zealand", "New Zealand", "New Zealand", 
"New Zealand", "New Zealand", "New Zealand"), batsman = c("MJ Guptill", 
"MJ Guptill", "MJ Guptill", "MJ Guptill", "MJ Guptill", "MJ Guptill", 
"DJ Mitchell", "DJ Mitchell", "MJ Guptill", "MJ Guptill", "DJ Mitchell", 
"MJ Guptill", "DJ Mitchell", "DJ Mitchell"), bowler = c("Shaheen Shah Afridi", 
"Shaheen Shah Afridi", "Shaheen Shah Afridi", "Shaheen Shah Afridi", 
"Shaheen Shah Afridi", "Shaheen Shah Afridi", "Imad Wasim", "Imad Wasim", 
"Imad Wasim", "Imad Wasim", "Imad Wasim", "Imad Wasim", "Shaheen Shah Afridi", 
"Shaheen Shah Afrid"), nonStriker = c("DJ Mitchell", "DJ Mitchell", 
"DJ Mitchell", "DJ Mitchell", "DJ Mitchell", "DJ Mitchell", "MJ Guptill", 
"MJ Guptill", "DJ Mitchell", "DJ Mitchell", "MJ Guptill", "DJ Mitchell", 
"MJ Guptill", "MJ Guptill"), byes = c(0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), legbyes = c(0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), noballs = c(0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-14L))

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM