I have multiple .csv files in my directory that have no header row. Reading them without a header gives this error:
Error in match.names(clabs, names(xi)) : names do not match previous names.
For that reason, I want to add column names to those CSV files and combine them all into a single data frame, but I can't work out how to add the column names while reading the files. The file names look like test_abc.csv, test_pqr.csv, test_xyz.csv, etc. Here is what I tried:
temp = list.files(pattern="*.csv")
read_csv_filename <- function(filename){
  ret <- read.csv(filename,header = F)
  ret$city <- gsub(".*[_]([^.]+)[.].*", "\\1", filename)
  ret
}
df_all <- do.call(rbind,lapply(temp,read_csv_filename))
How do I add a header to every file while reading it?
These are the names I want to add while reading:
colnames = c("Age","Gender","height","weight")
Any suggestions?
Using tidyverse packages, you can do this nicely with the purrr::map_dfr function, which iterates over a list, applies a function that returns a data frame to each element, and then row-binds all those data frames together.
library(readr)
library(purrr)
library(dplyr) # only used in example set up
# Setting up some example csv files to work with
mtcars_slim <- select(mtcars, 1:3)
write_csv(slice(mtcars_slim, 1:4), "mtcars_1.csv", col_names = FALSE)
write_csv(slice(mtcars_slim, 5:10), "mtcars_2.csv", col_names = FALSE)
write_csv(slice(mtcars_slim, 11:21), "mtcars_3.csv", col_names = FALSE)
# get file paths, read them all, and row-bind them all
dir(pattern = "mtcars_\\d+\\.csv") %>%
  map_dfr(read_csv, col_names = c("mpg", "cyl", "disp"))
#> Parsed with column specification:
#> cols(
#> mpg = col_double(),
#> cyl = col_integer(),
#> disp = col_integer()
#> )
#> # A tibble: 21 x 3
#>      mpg   cyl  disp
#>    <dbl> <int> <dbl>
#>  1  21.0     6 160.0
#>  2  21.0     6 160.0
#>  3  22.8     4 108.0
#>  4  21.4     6 258.0
#>  5  18.7     8 360.0
#>  6  18.1     6 225.0
#>  7  14.3     8 360.0
#>  8  24.4     4 146.7
#>  9  22.8     4 140.8
#> 10  19.2     6 167.6
#> # ... with 11 more rows
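Applied to the files from the question, the same idea might look roughly like the sketch below. The sample rows, the temporary directory, and the explicit col_types are my own assumptions for illustration, not something from the question. Naming the vector of paths lets map_dfr's .id argument carry the city into the result, and fixing the column types avoids readr guessing a Gender column that contains only "F" as logical.

```r
library(readr)
library(purrr)
library(dplyr)  # map_dfr relies on dplyr::bind_rows under the hood

# Hypothetical sample files that mimic the question's naming scheme
tmp <- tempfile("csv_demo_")
dir.create(tmp)
writeLines(c("34,M,180,75", "29,F,165,60"), file.path(tmp, "test_abc.csv"))
writeLines("41,F,170,68", file.path(tmp, "test_pqr.csv"))

# Collect the paths and name each one by the part between "test_" and ".csv"
files <- list.files(tmp, pattern = "^test_.*\\.csv$", full.names = TRUE)
names(files) <- sub("^test_(.*)\\.csv$", "\\1", basename(files))

# Read every file with the same header and row-bind the results;
# .id = "city" stores the name of each path in a new first column
df_all <- map_dfr(
  files,
  read_csv,
  col_names = c("Age", "Gender", "height", "weight"),
  col_types = "icdd",  # integer, character, double, double (assumed types)
  .id = "city"
)

print(df_all)
```

If you are on readr 2.0 or later, read_csv can also take the whole vector of paths in one call, with an id argument to record which file each row came from, so the explicit iteration becomes optional.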
You can set the column names inside the reading function itself, like this:
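# pattern is a regular expression, so match a literal ".csv" at the end of the name
temp <- list.files(pattern = "\\.csv$")
read_csv_filename <- function(filename) {
  ret <- read.csv(filename, header = FALSE)
  # take the part between the underscore and the extension as the city
  ret$city <- gsub(".*[_]([^.]+)[.].*", "\\1", filename)
  colnames(ret) <- c("Age", "Gender", "height", "weight", "city")
  ret
}
df_all <- do.call(rbind, lapply(temp, read_csv_filename))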
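A slightly shorter variant is to hand the names to read.csv directly through its col.names argument, so the header is attached at read time. The sketch below is only an illustration: the sample files and the colClasses vector are assumptions of mine. I added colClasses because, as far as I know, a file whose Gender column holds only "F" would otherwise be converted to logical.

```r
col_names <- c("Age", "Gender", "height", "weight")

# Hypothetical sample files that mimic the question's naming scheme
tmp <- tempfile("csv_demo_")
dir.create(tmp)
writeLines(c("34,M,180,75", "29,F,165,60"), file.path(tmp, "test_abc.csv"))
writeLines("41,F,170,68", file.path(tmp, "test_xyz.csv"))

read_csv_filename <- function(filename) {
  # col.names supplies the header; colClasses pins the types (assumed here)
  ret <- read.csv(
    filename,
    header = FALSE,
    col.names = col_names,
    colClasses = c("integer", "character", "numeric", "numeric")
  )
  # basename() drops the directory so the pattern only sees "test_<city>.csv"
  ret$city <- sub("^test_(.*)\\.csv$", "\\1", basename(filename))
  ret
}

files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
df_all <- do.call(rbind, lapply(files, read_csv_filename))

print(df_all)
```

Because every data frame now comes back with identical column names, rbind has nothing to complain about when it stacks them.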