I have a list of datasets, each containing one month of data. The data span many years, so I have 12 datasets for each year. The data were originally a bunch of Excel files. I converted them all to .csv and imported them following this advice, namely:
datalist <- list()
files <- list.files(pattern="\\.csv$")
for(file in files) {
  stem <- gsub("\\.csv$","",file)
  datalist[[stem]] <- read.csv(file)
}
So I end up with a list named datalist containing all my datasets.
Now, my problem is that only the file names contain the month and year in which each part of the data was collected, so I would like to grab the month and year from each dataset name and put them in two new columns of that data frame: "Year" and "Month".
All the file names, which I kept as data frame names, follow this structure: [ month ]_[ year ]_[ ...some other text ], for example "August_2012_foo_bar". So I figured I'd use a regular expression to grab first the month, then the year. My code stub is:
for(dataset in names(datalist)) {
  name <- dataset
  # strapply() comes from the gsubfn package
  month <- strapply(name,"^([^_]*).*$")
  ...?
}
The regular expression "^([^_]*).*$" grabs whatever comes before the first underscore, namely the month. I get stuck when I need to assign the grabbed month to a new column of the dataset. I have tried both assign and cbind, without luck.
In the end I would like to vertically merge all these datasets into one.
Thanks for any help!
You can just reference a new column and assign; R will create the column for you.
Try adding:
datalist[[stem]]$Month <- month
...
That will create a new column named "Month" and assign the month variable to it. Note that R will courteously repeat (recycle) the value you're assigning as many times as necessary to match the number of rows in the data.frame.
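To see the recycling in isolation, here is a minimal sketch. The data frame df and its value column are made up for illustration and stand in for one month's dataset:

```r
# Hypothetical toy data frame standing in for one month's dataset
df <- data.frame(value = c(10, 20, 30))

# Assigning a single value creates the column and recycles it across all rows
df$Month <- "August"
df$Year <- 2012L

print(df)
```

Each of the three rows ends up with the same Month and Year, which is exactly what you want when the whole file belongs to one month.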
So the whole loop would look like:
for(file in files) {
  stem <- gsub("\\.csv$","",file)
  datalist[[stem]] <- read.csv(file)
  #parse out the month and year here
  ...
  #assign to new columns
  datalist[[stem]]$Month <- month
  datalist[[stem]]$Year <- year
}
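One possible way to do the parsing step and the final vertical merge is sketched below. It is not the only approach: it swaps the regular expression for strsplit(), and it assumes every name really follows the [ month ]_[ year ]_[ ... ] pattern and that all data frames share the same columns. The small in-memory datalist and its value column are made up so the example runs without any .csv files.

```r
# Hypothetical stand-in for the list built by the loop above;
# names follow the [month]_[year]_[other text] pattern from the question
datalist <- list(
  August_2012_foo_bar    = data.frame(value = c(1, 2)),
  September_2012_foo_bar = data.frame(value = c(3, 4, 5))
)

for (stem in names(datalist)) {
  # Split the name on underscores: first piece is the month, second the year
  parts <- strsplit(stem, "_", fixed = TRUE)[[1]]
  datalist[[stem]]$Month <- parts[1]
  datalist[[stem]]$Year  <- as.integer(parts[2])
}

# Stack all data frames vertically; this assumes they share the same columns
combined <- do.call(rbind, datalist)
rownames(combined) <- NULL  # drop the row names rbind derives from the list names

print(combined)
```

If the columns differ between files, rbind() will stop with an error, so it is worth checking the column names of each data frame before combining.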