
How to grepl search for the max and min timings in a string?

I have a dataset with a column containing the opening and closing times of various stores. The timings are strings in the format Opening time - Closing time, e.g.: 17:00 - 21:00 | 11:30 - 14:30 | 11:30 - 14:30

I want to extract the minimum opening time within the above string, i.e. 11:30, and the maximum closing time, i.e. 21:00. How do I do that using R?

dput output:

 structure(list(head.timings_remapping.Opening.And.Closing.Time..40. = c("15:30 - 21:30", 
"12:00 - 00:00", "11:00 - 15:00 | 16:30 - 20:45", "12:00 - 22:30", 
"17:00 - 21:30", "17:00 - 21:30", "16:30 - 00:00", "16:00 - 21:15", 
"16:30 - 20:30", "17:00 - 20:00", "16:00 - 23:30", "16:30 - 21:30", 
"17:00 - 22:00", "17:00 - 22:00", "17:00 - 21:30", "17:00 - 21:30", 
"16:00 - 00:00", "16:30 - 23:59", "11:30 - 22:30", "11:30 - 23:59", 
"17:00 - 20:30", "07:30 - 12:50", "16:15 - 23:00", "09:00 - 21:00", 
"10:00 - 21:00", "11:00 - 22:00", "07:00 - 12:00 | 07:00 - 13:30 | 12:00 - 13:30", 
"07:00 - 13:00 | 10:00 - 15:00", "10:00 - 02:00", "00:00 - 23:59", 
"00:00 - 23:59", "11:00 - 20:00", "11:00 - 20:00", NA, "12:00 - 03:30 | 11:00 - 00:00", 
"05:30 - 15:00", "07:00 - 16:00", "08:30 - 13:30", "17:00 - 21:00 | 11:30 - 14:30 | 11:30 - 14:30", 
"12:00 - 01:00")), class = "data.frame", row.names = c(NA, -40L
))

The final output will have two columns, "Opening time" and "Closing time".

Does this work? Note that it keeps only the first interval of each row and drops anything after the first '|':

library(dplyr)
library(tidyr)
df %>% 
   separate(col = head.timings_remapping.Opening.And.Closing.Time..40., into = c('Open_Close','A'), sep = '\\|') %>% 
   separate(col = Open_Close, into = c('Opening Time','Closing Time'), sep = ' - ') %>% 
   mutate(`Opening Time` = trimws(`Opening Time`), `Closing Time` = trimws(`Closing Time`)) %>% select(-A)
   Opening Time Closing Time
1         15:30        21:30
2         12:00        00:00
3         11:00        15:00
4         12:00        22:30
5         17:00        21:30
6         17:00        21:30
7         16:30        00:00
8         16:00        21:15
9         16:30        20:30
10        17:00        20:00
11        16:00        23:30
12        16:30        21:30
13        17:00        22:00
14        17:00        22:00
15        17:00        21:30
16        17:00        21:30
17        16:00        00:00
18        16:30        23:59
19        11:30        22:30
20        11:30        23:59
21        17:00        20:30
22        07:30        12:50
23        16:15        23:00
24        09:00        21:00
25        10:00        21:00
26        11:00        22:00
27        07:00        12:00
28        07:00        13:00
29        10:00        02:00
30        00:00        23:59
31        00:00        23:59
32        11:00        20:00
33        11:00        20:00
34         <NA>         <NA>
35        12:00        03:30
36        05:30        15:00
37        07:00        16:00
38        08:30        13:30
39        17:00        21:00
40        12:00        01:00
 

Using the dplyr and tidyr libraries, you can do:

library(dplyr)
library(tidyr)

#Rename the long column name to something smaller
names(df)[1] <- 'Time'

df %>%
  #Create a row index
  mutate(row = row_number()) %>%
  #Split the data in different rows on '|'
  separate_rows(Time, sep = '\\s*\\|\\s*') %>%
  #split the data on '-'
  separate(Time, c("Opening_Time", "Closing_time"), sep = '\\s*-\\s*') %>%
  #Change the time to POSIXct format
  mutate(across(c(Opening_Time, Closing_time), ~ as.POSIXct(.x, format = '%H:%M'))) %>%
  #For each row
  group_by(row) %>%
  #Get minimum opening time and maximum closing time 
  #and change into required format
  summarise(Opening_Time = format(min(Opening_Time), "%H:%M"), 
            Closing_time = format(max(Closing_time), "%H:%M")) %>%
  #Drop row column
  select(-row)

This returns:

#  Opening_Time Closing_time
#   <chr>        <chr>       
# 1 15:30        21:30       
# 2 12:00        00:00       
# 3 11:00        20:45       
# 4 12:00        22:30       
# 5 17:00        21:30       
# 6 17:00        21:30       
# 7 16:30        00:00       
# 8 16:00        21:15       
# 9 16:30        20:30       
#10 17:00        20:00       
# … with 30 more rows
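
For comparison, here is a minimal base R sketch of the same idea that needs no packages. It is not from the answers above: the helper name time_range and the small timings vector are made up for illustration. It assumes every time is a zero-padded "HH:MM" string, so plain string min()/max() already gives chronological order.

```r
# Hypothetical helper (not part of the original answers): returns the earliest
# opening time and the latest closing time found in one timing string.
time_range <- function(x) {
  if (is.na(x)) {
    return(c(Opening_Time = NA_character_, Closing_Time = NA_character_))
  }
  # Intervals are separated by "|"; each one looks like "HH:MM - HH:MM"
  intervals <- trimws(strsplit(x, "|", fixed = TRUE)[[1]])
  parts <- strsplit(intervals, "\\s*-\\s*")
  opening <- vapply(parts, `[`, character(1), 1)
  closing <- vapply(parts, `[`, character(1), 2)
  # Assumption: zero-padded "HH:MM" strings sort chronologically,
  # so comparing them as strings is enough
  c(Opening_Time = min(opening), Closing_Time = max(closing))
}

# Illustrative input taken from values in the question's data
timings <- c("17:00 - 21:00 | 11:30 - 14:30 | 11:30 - 14:30",
             "11:00 - 15:00 | 16:30 - 20:45",
             NA)

# One row per input string, two columns
res <- t(vapply(timings, time_range,
                c(Opening_Time = "", Closing_Time = ""),
                USE.NAMES = FALSE))
result <- as.data.frame(res)
print(result)
```

Like the POSIXct version, this treats a closing time such as 00:00 or 02:00 as the earliest time of the day, so stores that close after midnight would need extra handling.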


 