
Import multiple CSV files, add same column headers and then cbind R

I'm relatively new to R and have spent the last three hours trying to find a working answer here, but I cannot find a combination that works.

I have a folder containing 841 CSV files, none of which have column names. The format is the same for every file (although some files might have blank columns, simply because no data is available for that column in that file).

I want to read in all 841 CSV files, add the column names and then cbind them into a single data frame.

Bringing in a single file and adding the column names is easy enough:

library(data.table)  # provides fread() and setnames()

col.names <- c("ID", "NAMES_URI", "NAME1", "NAME1_LANG", "NAME2", "NAME2_LANG", "TYPE", "LOCAL_TYPE",
               "GEOMETRY_X", "GEOMETRY_Y", "MOST_DETAIL_VIEW_RES", "LEAST_DETAIL_VIEW_RES", "MBR_XMIN",
               "MBR_YMIN", "MBR_XMAX", "MBR_YMAX", "POSTCODE_DISTRICT", "POSTCODE_DISTRICT_URI",
               "POPULATED_PLACE", "POPULATED_PLACE_URI", "POPULATED_PLACE_TYPE", "DISTRICT_BOROUGH",
               "DISTRICT_BOROUGH_URI", "DISTRICT_BOROUGH_TYPE", "COUNTY_UNITARY", "COUNTY_UNITARY_URI",
               "COUNTY_UNITARY_TYPE", "REGION", "REGION_URI", "COUNTRY", "COUNTRY_URI", "RELATED_SPATIAL_OBJECT",
               "SAME_AS_DBPEDIA", "SAME_AS_GEONAMES")

# Read one headerless file, then assign the column names
Single_File <- fread(file = "C:/Users/djr/Desktop/PostCodes/Data/HP40.csv", header = FALSE)

setnames(Single_File, col.names)

My issue comes when I try to read the files in as a list and bind them. I've tried examples using lapply or map_dfr, but they always raise errors about vector sizes differing, about not being able to fill, or about column specifications not matching.

The code I am currently trying is:

  dir(pattern = ".csv") %>%
    map_dfr(read_csv, col_names = c("ID", "NAMES_URI", "NAME1", "NAME1_LANG", "NAME2", "NAME2_LANG", "TYPE", "LOCAL_TYPE",
                                    "GEOMETRY_X", "GEOMETRY_Y", "MOST_DETAIL_VIEW_RES", "LEAST_DETAIL_VIEW_RES", "MBR_XMIN",
                                    "MBR_YMIN", "MBR_XMAX", "MBR_YMAX", "POSTCODE_DISTRICT", "POSTCODE_DISTRICT_URI",
                                    "POPULATED_PLACE", "POPULATED_PLACE_URI", "POPULATED_PLACE_TYPE", "DISTRICT_BOROUGH",
                                    "DISTRICT_BOROUGH_URI", "DISTRICT_BOROUGH_TYPE", "COUNTY_UNITARY", "COUNTY_UNITARY_URI",
                                    "COUNTY_UNITARY_TYPE", "REGION", "REGION_URI", "COUNTRY", "COUNTRY_URI", "RELATED_SPATIAL_OBJECT",
                                    "SAME_AS_DBPEDIA", "SAME_AS_GEONAMES"))

But this just prints a lot of output to the console that is meaningless to me; it seems to be a summary of each file.

Does anyone have simple code to bring in the CSVs, add the column names to each and then cbind them all together?

I am not 100% sure exactly what you need, but my best guess would be something like this:

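library(data.table)

y_path   <- 'C:/your_path/your_folder'
# '\\.csv$' matches only names ending in ".csv"; in a bare '.csv' the dot is a regex wildcard
all_csv  <- list.files(path = y_path, pattern = '\\.csv$', full.names = TRUE)
# header = FALSE because the files have no column names; add other fread() arguments as needed
open_csv <- lapply(all_csv, \(x) fread(x, header = FALSE))

# Stack the files row-wise into a single data.table
one_df <- data.table::rbindlist(open_csv)
# or: do.call(rbind, open_csv)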
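To show how the column names from the question could be applied while reading, here is a minimal, self-contained sketch. It assumes the goal is to stack the files row-wise (rbind rather than cbind), since every file shares the same columns. The temporary folder, the three demo files, the shortened three-column `col_names` vector and the `source_file` column are all hypothetical stand-ins for the real 841 files and 34 column names.

```r
library(data.table)

# Hypothetical demo data: three small headerless CSV files in a fresh temporary folder
demo_dir <- tempfile("csv_demo_")
dir.create(demo_dir)
writeLines(c("1,Alpha,10.5", "2,Beta,20.1"),    file.path(demo_dir, "a.csv"))
writeLines(c("3,,30.2", "4,,40.9"),             file.path(demo_dir, "b.csv"))  # NAME1 left blank
writeLines(c("5,Gamma,50.3", "6,Delta,60.7"),   file.path(demo_dir, "c.csv"))

# Shortened, hypothetical stand-in for the full vector of column names
col_names <- c("ID", "NAME1", "GEOMETRY_X")

# Collect the file paths and name each element by its file name
csv_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
names(csv_files) <- basename(csv_files)

# Read every file without a header and assign the column names at read time
tables <- lapply(csv_files, function(f) fread(f, header = FALSE, col.names = col_names))

# Stack all tables row-wise, matching columns by name;
# idcol adds a column recording which file each row came from
combined <- rbindlist(tables, use.names = TRUE, idcol = "source_file")

print(combined)
```

For the real data, `demo_dir` would be replaced by the folder holding the CSV files and `col_names` by the full vector from the question. Keeping the `source_file` column is optional, but it can make it easier to trace a problematic row back to its file.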
