
How do you combine two columns into a new column in a dataframe made of two or more different csv files?

I have several CSV files, all named with dates. In each file I want to create a new column that contains the values of two other columns pasted together. Then I want to combine the files into one big data frame and keep only two of the columns. Here's an example:

Say I have two dataframes:

  a b c        a b c
x 1 2 3      x 3 2 1
y 2 3 1      y 2 1 3

Then I want to create a new column d in each of them (the values of a and c pasted together):

  a b c  d        a b c  d
x 1 2 3 13      x 3 2 1 31
y 2 3 1 21      y 2 1 3 23

Then I want to combine them like this:

  a b c  d
x 1 2 3 13
y 2 3 1 21
x 3 2 1 31
y 2 1 3 23

Then keep two of the columns a and d and delete the other two columns b and c:

  a  d
x 1 13
y 2 21
x 3 31
y 2 23

Here is my current code (it doesn't work when I try to combine the two columns or when I try to keep only two of the columns):

    f <- list.files(pattern="201\\d{5}\\.csv")        # reading in all the files
    mydata <- sapply(f, read.csv, simplify=FALSE)     # assigning them to a dataframe
    do.call(rbind,mydata)                             # combining all of those dataframes into one
    mydata$Data <- paste(mydata$LAST_UPDATE_DT,mydata$px_last)   # combining two of the columns into a new column named "Data"
    c('X','Data') %in% names(mydata)               # keeping two of the columns while deleting the rest

The object mydata is a list of data frames, so mydata$LAST_UPDATE_DT does not refer to a column. You can modify each data frame in the list with lapply:

lapply(mydata, function(x) "[<-"(x, "c", value = paste0(x$a, x$b)))

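# Reproducible example: two small in-memory "files" read with read.table
file1 <- "a b
x 2 3"
file2 <- "a b
x 3 1"
mydata <- lapply(c(file1, file2), function(x) read.table(text = x, header = TRUE))
# Add column c (a and b pasted together) to every data frame in the list
lapply(mydata, function(x) "[<-"(x, "c", value = paste0(x$a, x$b)))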

# [[1]]
#   a b  c
# x 2 3 23
# 
# [[2]]
#   a b  c
# x 3 1 31

You can use rbind(data1, data2)[, c(1, 4)] for that: a and d are the first and fourth columns, and selecting them by name with c("a", "d") is safer. I assume you can already create column d in each data frame.

data1 <- structure(list(a = 1:2, b = 2:3, c = c(3L, 1L), d = c(13L, 21L)),
    .Names = c("a", "b", "c", "d"), row.names = c("x", "y"), class = "data.frame")

> data1
  a b c  d
x 1 2 3 13
y 2 3 1 21

data2 <- structure(list(a = c(3L, 2L), b = c(2L, 1L), c = c(1L, 3L), d = c(31L, 23L)),
    .Names = c("a", "b", "c", "d"), row.names = c("x", "y"), class = "data.frame")

> data2
  a b c  d
x 3 2 1 31
y 2 1 3 23

data3 <- rbind(data1, data2)

> data3
   a b c  d
x  1 2 3 13
y  2 3 1 21
x1 3 2 1 31
y1 2 1 3 23

finaldata <- data3[, c("a", "d")]

> finaldata
   a  d
x  1 13
y  2 21
x1 3 31
y1 2 23
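
For the original case of several CSV files, the steps might be put together roughly as follows. This is a sketch under stated assumptions: the temporary directory, the file names 20140101.csv and 20150101.csv, and all values are invented so that the example runs on its own; only the column names X, LAST_UPDATE_DT and px_last come from the question. The pattern is written with [0-9] and anchors, which is assumed to match the same files as the question's pattern.

```r
# Hypothetical setup: write two small CSV files to a temporary directory.
tmp <- tempfile("csvdemo")
dir.create(tmp)
write.csv(data.frame(X = 1:2,
                     LAST_UPDATE_DT = c("2014-01-01", "2014-01-02"),
                     px_last = c(10.5, 11)),
          file.path(tmp, "20140101.csv"), row.names = FALSE)
write.csv(data.frame(X = 3:4,
                     LAST_UPDATE_DT = c("2015-01-01", "2015-01-02"),
                     px_last = c(12, 12.5)),
          file.path(tmp, "20150101.csv"), row.names = FALSE)

# Read every matching file into a list of data frames.
f <- list.files(tmp, pattern = "^201[0-9]{5}\\.csv$", full.names = TRUE)
mydata <- lapply(f, read.csv, stringsAsFactors = FALSE)

# Add the combined column to each data frame, then stack them.
mydata <- lapply(mydata, function(x) {
  x$Data <- paste(x$LAST_UPDATE_DT, x$px_last)
  x
})
combined <- do.call(rbind, mydata)

# Keep only the two columns of interest.
final <- combined[, c("X", "Data")]
print(final)

# Remove the temporary files created for this demo.
unlink(tmp, recursive = TRUE)
```

The line c('X','Data') %in% names(mydata) in the question only returns TRUE or FALSE for each name; indexing with combined[, c("X", "Data")] is what actually keeps those columns.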


 