How to grepl search for the max and min timings in a string?
I have a dataset with a column containing the opening and closing times of various stores. The timings are strings in the format Opening time - Closing time, with multiple intervals separated by |, e.g.:
17:00 - 21:00 | 11:30 - 14:30 | 11:30 - 14:30
I want to extract the minimum opening time within the above string, i.e. 11:30, and the maximum closing time, i.e. 21:00. How do I do that using R?
DPUT:
structure(list(head.timings_remapping.Opening.And.Closing.Time..40. = c("15:30 - 21:30",
"12:00 - 00:00", "11:00 - 15:00 | 16:30 - 20:45", "12:00 - 22:30",
"17:00 - 21:30", "17:00 - 21:30", "16:30 - 00:00", "16:00 - 21:15",
"16:30 - 20:30", "17:00 - 20:00", "16:00 - 23:30", "16:30 - 21:30",
"17:00 - 22:00", "17:00 - 22:00", "17:00 - 21:30", "17:00 - 21:30",
"16:00 - 00:00", "16:30 - 23:59", "11:30 - 22:30", "11:30 - 23:59",
"17:00 - 20:30", "07:30 - 12:50", "16:15 - 23:00", "09:00 - 21:00",
"10:00 - 21:00", "11:00 - 22:00", "07:00 - 12:00 | 07:00 - 13:30 | 12:00 - 13:30",
"07:00 - 13:00 | 10:00 - 15:00", "10:00 - 02:00", "00:00 - 23:59",
"00:00 - 23:59", "11:00 - 20:00", "11:00 - 20:00", NA, "12:00 - 03:30 | 11:00 - 00:00",
"05:30 - 15:00", "07:00 - 16:00", "08:30 - 13:30", "17:00 - 21:00 | 11:30 - 14:30 | 11:30 - 14:30",
"12:00 - 01:00")), class = "data.frame", row.names = c(NA, -40L
))
The final output should have two columns, "Opening time" and "Closing time".
Does this work:
library(dplyr)
library(tidyr)
df %>%
  separate(col = head.timings_remapping.Opening.And.Closing.Time..40., into = c('Open_Close','A'), sep = '\\|') %>%
  separate(col = Open_Close, into = c('Opening Time','Closing Time'), sep = ' - ') %>%
  mutate(`Opening Time` = trimws(`Opening Time`), `Closing Time` = trimws(`Closing Time`)) %>%
  select(-A)
Opening Time Closing Time
1 15:30 21:30
2 12:00 00:00
3 11:00 15:00
4 12:00 22:30
5 17:00 21:30
6 17:00 21:30
7 16:30 00:00
8 16:00 21:15
9 16:30 20:30
10 17:00 20:00
11 16:00 23:30
12 16:30 21:30
13 17:00 22:00
14 17:00 22:00
15 17:00 21:30
16 17:00 21:30
17 16:00 00:00
18 16:30 23:59
19 11:30 22:30
20 11:30 23:59
21 17:00 20:30
22 07:30 12:50
23 16:15 23:00
24 09:00 21:00
25 10:00 21:00
26 11:00 22:00
27 07:00 12:00
28 07:00 13:00
29 10:00 02:00
30 00:00 23:59
31 00:00 23:59
32 11:00 20:00
33 11:00 20:00
34 <NA> <NA>
35 12:00 03:30
36 05:30 15:00
37 07:00 16:00
38 08:30 13:30
39 17:00 21:00
40 12:00 01:00
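Note that the approach above keeps only the first interval of each row, so row 39 comes out as 17:00 - 21:00 rather than the 11:30 - 21:00 the question asks for. As a minimal base R sketch for a single string (assuming every time is zero-padded HH:MM, so that plain string comparison matches chronological order), all times can be pulled out with a regular expression and split into opening and closing positions:

```r
# Example string from the question (zero-padded HH:MM is assumed)
x <- "17:00 - 21:00 | 11:30 - 14:30 | 11:30 - 14:30"

# Pull out every HH:MM token in order of appearance
times <- regmatches(x, gregexpr("[0-9]{2}:[0-9]{2}", x))[[1]]

# Odd positions are opening times, even positions are closing times
opening <- times[c(TRUE, FALSE)]
closing <- times[c(FALSE, TRUE)]

# Zero-padded HH:MM strings sort in chronological order
result <- c(opening = min(opening), closing = max(closing))
result
```

For this string the sketch gives 11:30 as the opening time and 21:00 as the closing time. It does not handle NA entries or closing times after midnight.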
Using the dplyr and tidyr libraries you can do:
library(dplyr)
library(tidyr)
# Rename the long column name to something shorter
names(df)[1] <- 'Time'
df %>%
  # Create a row index
  mutate(row = row_number()) %>%
  # Split the data into different rows on '|'
  separate_rows(Time, sep = '\\s*\\|\\s*') %>%
  # Split the data on '-'
  separate(Time, c("Opening_Time", "Closing_time"), sep = '\\s*-\\s*') %>%
  # Change the times to POSIXct format
  mutate(across(c(Opening_Time, Closing_time), ~ as.POSIXct(.x, format = '%H:%M'))) %>%
  # For each original row
  group_by(row) %>%
  # Get the minimum opening time and maximum closing time
  # and change them into the required format
  summarise(Opening_Time = format(min(Opening_Time), "%H:%M"),
            Closing_time = format(max(Closing_time), "%H:%M")) %>%
  # Drop the row column
  select(-row)
This returns
# Opening_Time Closing_time
# <chr> <chr>
# 1 15:30 21:30
# 2 12:00 00:00
# 3 11:00 20:45
# 4 12:00 22:30
# 5 17:00 21:30
# 6 17:00 21:30
# 7 16:30 00:00
# 8 16:00 21:15
# 9 16:30 20:30
#10 17:00 20:00
# … with 30 more rows
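One thing to keep in mind is that the maximum above is taken on clock time alone, so a closing time of 00:00 counts as the earliest possible value rather than as midnight at the end of the day. Whether that matters depends on the data. The following base R sketch is one possible variant, not part of the answer above: it assumes that a closing time at or before its own opening time falls on the next day. The helper names `to_minutes` and `open_close_range` are made up for illustration, and the third test string is invented to show the difference.

```r
# Hypothetical helper: convert "HH:MM" strings to minutes since midnight
to_minutes <- function(hhmm) {
  parts <- strsplit(hhmm, ":", fixed = TRUE)
  vapply(parts, function(p) as.integer(p[1]) * 60L + as.integer(p[2]), integer(1))
}

# Hypothetical helper: earliest opening and latest closing for one cell
open_close_range <- function(cell) {
  if (is.na(cell)) {
    return(c(Opening_Time = NA_character_, Closing_time = NA_character_))
  }
  intervals <- strsplit(cell, "\\s*\\|\\s*")[[1]]
  pieces <- strsplit(intervals, "\\s*-\\s*")
  open_min  <- to_minutes(vapply(pieces, `[`, character(1), 1))
  close_min <- to_minutes(vapply(pieces, `[`, character(1), 2))
  # Assumption: closing at or before opening means the store closes the next day
  next_day <- close_min <= open_min
  close_min[next_day] <- close_min[next_day] + 24L * 60L
  fmt <- function(m) sprintf("%02d:%02d", m %/% 60L, m %% 60L)
  # Wrap the latest closing time back onto a 24-hour clock for display
  c(Opening_Time = fmt(min(open_min)),
    Closing_time = fmt(max(close_min) %% (24L * 60L)))
}

x <- c("11:00 - 15:00 | 16:30 - 20:45",   # from the dataset
       "12:00 - 03:30 | 11:00 - 00:00",   # from the dataset
       "11:00 - 15:00 | 18:00 - 00:00",   # invented example
       NA)
res <- do.call(rbind, lapply(x, open_close_range))
res
```

Under this assumption the invented third string yields a closing time of 00:00, whereas a plain clock-time maximum would give 15:00. The two rows taken from the dataset give the same result either way (20:45 and 03:30).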