R: Transformation of column values within a data frame

I have a few questions about transforming the values of one column (C1) into several derived columns, as follows:

C1 = c("280804-6759","180604-8084") 
C2 = c("280804","180604")
C3 = c("28.08.04","18.06.04")
C4 = c("28-08-04","18-06-04")
C5 = c(0,1)
df = data.frame(C1, C2, C3, C4,C5)

           C1     C2       C3       C4 C5
1 280804-6759 280804 28.08.04 28-08-04  0
2 180604-8084 180604 18.06.04 18-06-04  1
  1. C1 to C2: remove the hyphen and the digits that follow it.
  2. C2 to C3: insert "." between each pair of digits.
  3. C2 to C4: insert "-" between each pair of digits (allowing conversion to a date and use of the timeDate package).
  4. C5:
    • if the last digit in C1 is odd (i.e. 1, 3, 5, 7, 9), return "0"
    • if the last digit in C1 is even, including 0 (i.e. 0, 2, 4, 6, 8), return "1"

Hope you can help. Thanks in advance.

Sincerely, ykl

In base R, you can use within(). Below, I assume "df" contains only the first column (C1) of your sample data.

df <- data.frame(C1)

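within(df, {
  # Drop the hyphen and everything after it
  C2 <- gsub("-.*$", "", C1)
  # Capture three two-character groups and rejoin them with "."
  C3 <- gsub("(..)(..)(..)", "\\1.\\2.\\3", C2)
  # Replace each literal "." with "-"
  C4 <- gsub("\\.", "-", C3)
  # 1 if C1 ends in an even digit, otherwise 0
  C5 <- as.numeric(grepl("[02468]$", C1))
})[paste0("C", 1:5)]  # within() adds new columns in reverse order, so reorder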
#            C1     C2       C3       C4 C5
# 1 280804-6759 280804 28.08.04 28-08-04  0
# 2 180604-8084 180604 18.06.04 18-06-04  1
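
To see what each pattern does, the same steps can be tried on a single value. This is only a small sketch: the names x, c2, c3, c4 and c5 are made up for illustration, and fixed = TRUE is used here as an alternative to escaping the dot.

```r
x <- "280804-6759"

# Step 1: remove the hyphen and everything that follows it
c2 <- gsub("-.*$", "", x)

# Step 2: capture three pairs of characters and rejoin them with "."
c3 <- gsub("(..)(..)(..)", "\\1.\\2.\\3", c2)

# Step 3: swap the literal dots for hyphens
# (fixed = TRUE treats "." as a plain character, not a regex wildcard)
c4 <- gsub(".", "-", c3, fixed = TRUE)

# Step 4: 1 if the final character is an even digit, otherwise 0
c5 <- as.numeric(grepl("[02468]$", x))

print(c(c2, c3, c4, c5))
```

Note that the pattern in step 2 assumes the value left after step 1 is exactly six characters long. Shorter values would not match and would be returned unchanged.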

Same approach, but with "dplyr":

library(dplyr)
df %>%
  mutate(C2 = gsub("-.*$", "", C1),
         C3 = gsub("(..)(..)(..)", "\\1.\\2.\\3", C2),
         C4 = gsub("\\.", "-", C3),
         C5 = as.numeric(grepl("[02468]$", C1)))
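
Since the question mentions converting C4 to a date, here is a minimal sketch of that step in base R. It assumes the digits are in day-month-year order with a two-digit year, which the question does not state explicitly, and the name "dates" is only illustrative.

```r
C4 <- c("28-08-04", "18-06-04")

# Assumption: values are day-month-year with a two-digit year
dates <- as.Date(C4, format = "%d-%m-%y")

# Show the parsed dates in ISO format to check the interpretation
print(format(dates, "%Y-%m-%d"))
```

With "%y", as.Date() reads two-digit years 00 to 68 as 2000 to 2068 and 69 to 99 as 1969 to 1999, so it is worth checking that this matches what the data actually represent.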

