How to add the filename as a column while reading and appending multiple CSV files in R?
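I am trying to read 1000 csv files, append them, and save the result as a single rds file.
Problem: I want to add the file name as a column to each csv so that I know which data came from which csv file (all files have the same columns), but I have not been able to do so.
Example csv files used: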
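# Setting up some example csv files to work with
library(dplyr)
library(readr)
# write_csv() does not create missing directories, so make sure the folder exists
dir.create("Input data/sub folder", recursive = TRUE, showWarnings = FALSE)
mtcars_slim <- select(mtcars, 1:3)
write_csv(slice(mtcars_slim, 1:4), "Input data/sub folder/AA_1.csv")
write_csv(slice(mtcars_slim, 5:10), "Input data/sub folder/AAA_2.csv")
write_csv(slice(mtcars_slim, 11:1), "Input data/sub folder/BBB_3.csv")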
The code I have tried is below:
# this code worked but it doesn't have filename within the dataset
list.files(path = "Input data/sub folder/",
           pattern="*.csv",
           full.names = T) %>%
  map_df(~read_csv(.)) %>%
  saveRDS("output_compiled_data.rds")
So I tried modifying the code above to include the file name as a column for each csv file, as in the code block below, but it does not work.
file_names <- list.files(path = "Input data/sub folder/",
                         pattern="*.csv",
                         full.names = T) %>%
  map_df(file_names, ~read_csv(.) %>%
           mutate(symbol = file_names)) %>%
  saveRDS("output_compiled_data.rds")
data_tbl <- read_rds("output_compiled_data.rds")
data_tbl
One option is to use a named vector of file names. You can then add a column with the file names via the .id argument of map_df:
library(dplyr)
library(purrr)
library(readr)
mtcars_slim <- select(mtcars, 1:3)
write_csv(slice(mtcars_slim, 1:4), "AA_1.csv")
write_csv(slice(mtcars_slim, 5:10), "AAA_2.csv")
write_csv(slice(mtcars_slim, 11:1), "BBB_3.csv")
fn <- list.files(
  path = ".",
  pattern = "\\.csv",
  full.names = TRUE
)
names(fn) <- basename(fn)
map_df(fn, ~ read_csv(., show_col_types = FALSE), .id = "file")
#> # A tibble: 21 × 4
#>    file        mpg   cyl  disp
#>    <chr>     <dbl> <dbl> <dbl>
#>  1 AA_1.csv   21       6  160
#>  2 AA_1.csv   21       6  160
#>  3 AA_1.csv   22.8     4  108
#>  4 AA_1.csv   21.4     6  258
#>  5 AAA_2.csv  18.7     8  360
#>  6 AAA_2.csv  18.1     6  225
#>  7 AAA_2.csv  14.3     8  360
#>  8 AAA_2.csv  24.4     4  147.
#>  9 AAA_2.csv  22.8     4  141.
#> 10 AAA_2.csv  19.2     6  168.
#> # … with 11 more rows
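To connect this back to the original goal of saving everything as one rds file, the same named-vector approach can be piped straight into saveRDS. The following is only a sketch under a few assumptions: the scratch folder csv_dir and the variable rds_path are hypothetical names introduced here for illustration, only two of the example files are written, and readr 2.0 or later is assumed for show_col_types.

```r
library(dplyr)
library(purrr)
library(readr)

# Hypothetical scratch folder so the sketch does not touch real data
csv_dir <- file.path(tempdir(), "csv_demo")
dir.create(csv_dir, showWarnings = FALSE)

# Two small example files, mirroring the setup in the question
mtcars_slim <- select(mtcars, 1:3)
write_csv(slice(mtcars_slim, 1:4), file.path(csv_dir, "AA_1.csv"))
write_csv(slice(mtcars_slim, 5:10), file.path(csv_dir, "AAA_2.csv"))

# Name each path by its base name; map_df() copies these names into the .id column
fn <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
names(fn) <- basename(fn)

# Hypothetical output location for the combined data
rds_path <- file.path(csv_dir, "output_compiled_data.rds")

fn %>%
  map_df(~ read_csv(.x, show_col_types = FALSE), .id = "file") %>%
  saveRDS(rds_path)

# Read the result back and count the rows contributed by each file
data_tbl <- read_rds(rds_path)
count(data_tbl, file)
```

If your readr version is 2.0 or later, read_csv(fn, id = "file") should also work on a vector of paths without purrr; it records the path as given, so you would wrap the column in basename() to get the bare file name. In purrr 1.0 and later, map_df is superseded, and map(fn, read_csv) followed by list_rbind(names_to = "file") is the suggested replacement.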