I have several CSV files, all named with dates. In each file I want to create a new column that concatenates the values of two other columns. Then I want to combine the files into one big data frame and keep only two of its columns. Here's an example:
Say I have two dataframes:
a b c a b c
x 1 2 3 x 3 2 1
y 2 3 1 y 2 1 3
Then I want to create a new column d in each of them:
a b c d a b c d
x 1 2 3 13 x 3 2 1 31
y 2 3 1 21 y 2 1 3 23
Then I want to combine them like this:
a b c d
x 1 2 3 13
y 2 3 1 21
x 3 2 1 31
y 2 1 3 23
Then keep columns a and d and drop columns b and c:
a d
x 1 13
y 2 21
x 3 31
y 2 23
Here is my current code (it doesn't work when I try to combine two of the columns or when I try to keep only two of the columns):
f <- list.files(pattern="201\\d{5}\\.csv") # reading in all the files
mydata <- sapply(f, read.csv, simplify=FALSE) # assigning them to a dataframe
do.call(rbind,mydata) # combining all of those dataframes into one
mydata$Data <- paste(mydata$LAST_UPDATE_DT,mydata$px_last) # combining two of the columns into a new column named "Data"
c('X','Data') %in% names(mydata) # keeping two of the columns while deleting the rest
The object mydata is a list of data frames. You can change the data frames in the list with lapply:
lapply(mydata, function(x) "[<-"(x, "c", value = paste0(x$a, x$b)))
file1 <- "a b
x 2 3"
file2 <- "a b
x 3 1"
mydata <- lapply(c(file1, file2), function(x) read.table(text = x, header =TRUE))
lapply(mydata, function(x) "[<-"(x, "c", value = paste0(x$a, x$b)))
# [[1]]
# a b c
# x 2 3 23
#
# [[2]]
# a b c
# x 3 1 31
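Note that lapply returns a new list rather than changing mydata in place, so the result has to be assigned, and the same goes for the result of do.call(rbind, ...). Here is a minimal sketch of the whole sequence on the example from the question, assuming d is meant to be columns a and c pasted together (that is what the expected output suggests). The names combined and result are only illustrative.

```r
# Two small tables standing in for the CSV files from the question
file1 <- "a b c
x 1 2 3
y 2 3 1"
file2 <- "a b c
x 3 2 1
y 2 1 3"
mydata <- lapply(c(file1, file2), function(x) read.table(text = x, header = TRUE))

# Add column d to every data frame and keep the modified list
mydata <- lapply(mydata, function(x) {
  x$d <- paste0(x$a, x$c)  # assumption: d is a and c pasted together
  x
})

# Stack the data frames; the result must be assigned to be kept
combined <- do.call(rbind, mydata)

# Keep only columns a and d
result <- combined[, c("a", "d")]
```

Since paste0 returns character strings, column d ends up as character; wrap it in as.integer() if numbers are needed.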
You can use rbind(data1, data2)[, c("a", "d")] for that. I assume you can already create column d in each data frame, which is a basic operation.
data1<-structure(list(a = 1:2, b = 2:3, c = c(3L, 1L), d = c(13L, 21L
)), .Names = c("a", "b", "c", "d"), row.names = c("x", "y"), class = "data.frame")
> data1
a b c d
x 1 2 3 13
y 2 3 1 21
data2<-structure(list(a = c(3L, 2L), b = c(2L, 1L), c = c(1L, 3L), d = c(31L,
23L)), .Names = c("a", "b", "c", "d"), row.names = c("x", "y"
), class = "data.frame")
> data2
a b c d
x 3 2 1 31
y 2 1 3 23
data3<-rbind(data1,data2)
> data3
a b c d
x 1 2 3 13
y 2 3 1 21
x1 3 2 1 31
y1 2 1 3 23
finaldata<-data3[,c("a","d")]
> finaldata
a d
x 1 13
y 2 21
x1 3 31
y1 2 23
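Applied to the files from the question, the same steps might look like the sketch below. It assumes every file has columns named X, LAST_UPDATE_DT and px_last, as the question's code suggests. The function name combine_files is hypothetical and not part of any package.

```r
# Hypothetical helper: read each file, add the pasted column,
# stack everything, and keep only the two wanted columns.
combine_files <- function(files) {
  mydata <- lapply(files, read.csv)

  # Build the new column inside each data frame, not on the list itself
  mydata <- lapply(mydata, function(x) {
    x$Data <- paste(x$LAST_UPDATE_DT, x$px_last)
    x
  })

  # Combine into one data frame and select columns by name
  combined <- do.call(rbind, mydata)
  combined[, c("X", "Data")]
}
```

It could then be called with the file list from the question, for example result <- combine_files(list.files(pattern = "201\\d{5}\\.csv")). Selecting with c("X", "Data") does the actual subsetting; the %in% expression in the question only returns TRUE/FALSE values and does not drop any columns.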