
R: Remove “.” and 0 in year column in a data frame

I have a quite basic question, but I am not sure how to handle it in a smart way.

I have a column with quarterly year dates of the form yyyy.0q (y for year and q for quarter). Example: 1990.01, 1990.02, 1990.03,...

I want to change the format to: 19901, 19902, 19903,... In other words, I want to remove the "." and the "0" that follows it. The column is numeric.

Is there a fast and convenient way to solve this problem?

library(tidyverse)

dat <- data.frame(x = c("1990.01", "1990.02", "1990.03"))

dat %>%
  mutate(x2 = str_replace(x, "\\.0", ""))

which gives:

        x    x2
1 1990.01 19901
2 1990.02 19902
3 1990.03 19903
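
A note on why the pattern is written as "\\.0" rather than ".0": in a regular expression an unescaped dot matches any character. The following is a minimal base R sketch of the difference, assuming the same three example strings; the variable names are only for illustration.

```r
x <- c("1990.01", "1990.02", "1990.03")

# Unescaped: "." matches ANY character, so the first match of
# "any character followed by 0" is the "90" inside the year.
unescaped <- sub(".0", "", x)
unescaped
#[1] "19.01" "19.02" "19.03"

# Escaped: "\\." matches a literal dot, so only ".0" is removed.
escaped <- sub("\\.0", "", x)
escaped
#[1] "19901" "19902" "19903"

# Alternative: fixed = TRUE treats the whole pattern as a literal string.
literal <- sub(".0", "", x, fixed = TRUE)
literal
#[1] "19901" "19902" "19903"
```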

We can use str_remove:

library(dplyr)
library(stringr)
dat %>%
     mutate(x2 = str_remove(x, "\\.0"))

data

dat <- data.frame(x = c("1990.01", "1990.02", "1990.03"))
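
The question says the column is numeric, while the examples here use character data. Below is a hedged sketch of two ways this might be handled for a numeric column, assuming the values always have the form yyyy.0q with q between 1 and 4; the vector x_num and the other variable names are made up for illustration.

```r
# Hypothetical numeric input of the form yyyy.0q
x_num <- c(1990.01, 1990.02, 1990.03, 1991.04)

# Option 1: format with exactly two decimals, drop the literal ".0",
# then convert back to a number.
via_string <- as.integer(sub(".0", "", sprintf("%.2f", x_num), fixed = TRUE))

# Option 2: pure arithmetic. Split into year and quarter, then recombine.
# round() guards against floating point error in the fractional part.
year <- floor(x_num)
quarter <- round((x_num - year) * 100)
via_arithmetic <- as.integer(year * 10 + quarter)

via_string
#[1] 19901 19902 19903 19914
via_arithmetic
#[1] 19901 19902 19903 19914
```

Option 1 keeps the regex-free string approach from the answers above; option 2 avoids the round trip through character entirely.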

There are already two answers, by deschen and akrun, and one in a comment (mine). Here is a speed comparison run with the microbenchmark package.

First, my solution.

x <- scan(what = character(), text = "1990.01 1990.02 1990.03")
sub("\\..", "", x)
#[1] "19901" "19902" "19903"

Now the test. I will create a bigger vector.

library(microbenchmark)
library(stringr)  # needed for str_replace() and str_remove() below

x2 <- x
for(i in 1:log2(1e4/nchar(x))[1]) x2 <- c(x2, x2)

mb <- microbenchmark(
  base_Rui = sub("\\.0", "", x2),
  stringr_deschen = str_replace(x2, "\\.0", ""),
  stringr_akrun = str_remove(x2, "\\.0")
)
print(mb, order = "median")
#Unit: milliseconds
#            expr      min       lq     mean   median       uq       max neval cld
#        base_Rui 2.060452 2.274474 2.531059 2.310165 2.410621  6.503303   100  a 
# stringr_deschen 2.092459 4.181407 4.598719 4.265935 4.390778 11.885202   100   b
#   stringr_akrun 3.754172 4.194410 4.624510 4.283582 4.499489  9.093461   100   b

With small vectors the relative difference is even larger; try it on x above.
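
As a sketch of that suggestion, the same comparison can be run on the three-element vector. This assumes the microbenchmark and stringr packages are installed; timings depend on the machine, so no output is shown. The sanity check and the names res_base, res_replace, res_remove and mb_small are additions for illustration.

```r
library(microbenchmark)
library(stringr)

x <- c("1990.01", "1990.02", "1990.03")

# Sanity check: the three approaches should agree before timing them.
res_base <- sub("\\.0", "", x)
res_replace <- str_replace(x, "\\.0", "")
res_remove <- str_remove(x, "\\.0")
stopifnot(identical(res_base, res_replace), identical(res_base, res_remove))

# Same three expressions as above, timed on the small vector.
mb_small <- microbenchmark(
  base_Rui = sub("\\.0", "", x),
  stringr_deschen = str_replace(x, "\\.0", ""),
  stringr_akrun = str_remove(x, "\\.0")
)
print(mb_small, order = "median")
```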
