
Incorrect replacement of strings in R

I need to replace awkward strings in R, specifically times stored in an unusual format. The data looks like this:

      Date |    Time | AmbientTemp
2000-01-01 | 11:00 a |          25
2000-01-01 | 11:30 a |        25.5 
2000-01-01 | 11:00 p |          20
2000-01-01 | 11:30 p |        19.5

The a and p suffixes mean AM and PM, respectively.

I could not get lubridate or base R to convert these times to a correct format. So I turned to the cumbersome str_replace_all function (from the stringr package) to convert all the times in a large data frame (more than 130,000 records).

Example calls:

uploadDat$Time = str_replace_all(uploadDat$Time,"11:00 a","11:00")
uploadDat$Time = str_replace_all(uploadDat$Time,"11:00 p","23:00")

I converted the Time column to character with as.character() before applying the stringr functions.

The result is perfect except for the 11 o'clock times (like those above), which are converted as follows:

      Date |   Time | AmbientTemp
2000-01-01 | 101:00 |          25
2000-01-01 | 101:30 |        25.5 
2000-01-01 | 113:00 |          20
2000-01-01 | 113:30 |        19.5

Why are these specific times converted incorrectly?
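A likely cause, assuming the full script also contains replacements for the 1 o'clock times (for example "1:00 a" to "01:00") that run before the 11 o'clock ones: str_replace_all treats its pattern as a regular expression and matches it anywhere in the string, so "1:00 a" also matches the tail of "11:00 a" and leaves the leading "1" in place, which would give "101:00". The sketch below reproduces this with base R's gsub, which matches substrings in the same way. The sample vector and the specific "1:00 a" replacement are assumptions, since the full script is not shown.

```r
# Hypothetical sample times in the question's format.
times <- c("1:00 a", "11:00 a", "1:30 p", "11:30 p")

# Assumed earlier replacement from the full script: "1:00 a" -> "01:00".
# Without anchors, the pattern also matches inside "11:00 a".
unanchored <- gsub("1:00 a", "01:00", times)
print(unanchored)

# Anchoring with ^ and $ restricts the match to the whole string,
# so "11:00 a" is left alone for its own replacement.
anchored <- gsub("^1:00 a$", "01:00", times)
print(anchored)
```

If the replacement approach is kept, anchoring every pattern in this way (or running the 10, 11 and 12 o'clock replacements first) should avoid the overlap.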

We can paste "m" onto the end of each time so that the suffix reads "am" or "pm", convert the result to POSIXct, and format it as a 24-hour time:

format(as.POSIXct(paste0(df$Time, "m"), format = "%I:%M %p"), "%T")
#[1] "11:00:00" "11:30:00" "23:00:00" "23:30:00"
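One caveat with this approach: %p is matched against the locale's AM/PM strings, and as.POSIXct without a tz argument works in the local time zone, so results can vary between machines. As a hedged alternative, here is a minimal base R sketch that does the 12-hour to 24-hour arithmetic directly. The helper name to_24h and its input pattern ("H:MM a" or "HH:MM p") are assumptions for illustration, not part of the original answer.

```r
# Hypothetical helper: convert "11:30 p"-style strings to 24-hour "HH:MM".
to_24h <- function(x) {
  # Capture hour, minute, and the a/p marker; anything else yields no match.
  matches <- regmatches(x, regexec("^([0-9]{1,2}):([0-9]{2}) ([ap])$", x))
  vapply(matches, function(parts) {
    # A successful match has 4 elements: full match plus 3 capture groups.
    if (length(parts) != 4) return(NA_character_)
    hour <- as.integer(parts[2]) %% 12L      # 12 o'clock maps to 0 here
    if (parts[4] == "p") hour <- hour + 12L  # PM adds 12 hours
    sprintf("%02d:%s", hour, parts[3])
  }, character(1), USE.NAMES = FALSE)
}

result <- to_24h(c("11:00 a", "11:30 a", "11:00 p", "11:30 p"))
print(result)
```

On the question's data this would be applied as uploadDat$Time = to_24h(as.character(uploadDat$Time)); entries that do not fit the assumed pattern come back as NA, which makes them easy to spot.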

