Specify which columns are the same when using map_dfr

Question

I have two folders, each with hundreds of CSVs, and I want to merge them all into one data frame. I have used the following:

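library(tidyverse)

# Read every CSV as character columns and row-bind the results
tbl <-
  list.files(path = c("./reports_0", "./reports_1"),
             pattern = "\\.csv$",  # pattern is a regular expression, not a glob
             full.names = TRUE) %>%
  map_dfr(~read_csv(., col_types = cols(.default = "c")))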

I have now realized that some of those CSVs name the column Firmware Version and others name it Firmware version (uppercase versus lowercase "v").

I would like to specify that these are the same column and can be combined into one called Firmware Version.

The by = argument does not work, and I could not find a solution.

I hope someone can help, thanks!

EDIT
My workaround is:

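tbl <- tbl %>% 
  # Paste both spellings into one column, skipping NAs
  unite("Firmware Version", `Firmware Version`, `Firmware version`, na.rm = TRUE) %>% 
  # Rows where both were NA end up as "", so turn those back into NA
  mutate(`Firmware Version` = replace(`Firmware Version`, `Firmware Version` == "", NA_character_))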

However, I still wonder whether there is a nicer, more straightforward way.

Answer 1

You could use janitor::make_clean_names() to convert the column names to a common format (snake_case by default; other styles such as camelCase are available through its case argument), and then row-bind the tables.

For example:

library(data.table)
library(janitor)

# Full paths of all CSV files in both folders
ftr <- list.files(path = c("./reports_0", "./reports_1"), 
   pattern = ".*\\.csv$", 
   full.names = TRUE)

DT <- rbindlist(
  lapply(ftr, function(x) {
    tempDT <- fread(x)
    # Normalize the header so both spellings map to the same name
    setnames(tempDT, names(tempDT), janitor::make_clean_names(names(tempDT)))
    return(tempDT)
  }), use.names = TRUE, fill = TRUE)

Proof of concept

Convert names to snake_case:

> janitor::make_clean_names("Firmware Version")
[1] "firmware_version"
> janitor::make_clean_names("Firmware version")
[1] "firmware_version"
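
The same idea can probably be kept inside the original map_dfr pipeline. The following is only a sketch: it writes two throwaway CSVs to a temporary directory (the file names, the id column and the values are made up for illustration) and uses janitor::clean_names(), the data-frame counterpart of make_clean_names(), on each file before binding.

```r
library(readr)
library(purrr)
library(dplyr)
library(janitor)

# Illustrative input: two CSVs whose headers differ only in case
dir <- tempfile("reports_")
dir.create(dir)
write_csv(tibble(`Firmware Version` = "1.0", id = "a"), file.path(dir, "a.csv"))
write_csv(tibble(`Firmware version` = "2.0", id = "b"), file.path(dir, "b.csv"))

tbl <- list.files(path = dir, pattern = "\\.csv$", full.names = TRUE) %>%
  map_dfr(~ read_csv(.x, col_types = cols(.default = "c")) %>%
            clean_names())  # normalize headers before the rows are bound

names(tbl)
```

Note that every column is renamed this way, so the merged column ends up as firmware_version rather than Firmware Version; it can be renamed afterwards if the original spelling matters.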
