I want to subset a dataframe using a loop and save as csv files named as the group value

I'm new to R and can't find an answer to this exact question. I have a data frame (df), read in from a CSV file, that looks like this:

A,B,C,D,E,F,G,H,I,J
rabbit,brisbane,NA,8,3,2,2,6,2,10
cat,perth,NA,1,8,10,-3,3,5,7
NA,brisbane,bicycle,9,-2,7,-3,7,5,2
rat,brisbane,NA,5,-10,6,1,12,9,9
budgie,melbourne,NA,5,6,3,2,6,7,8
NA,melbourne,bicycle,11,9,0,-1,3,0,7
dog,adelaide,car,0,-4,10,3,7,4,1
rabbit,Canberra,car,5,7,10,-3,5,11,8
dog,brisbane,car,10,-10,6,3,8,0,4
rabbit,brisbane,boat,0,-3,5,2,9,3,3
rabbit,sydney,walk,7,-6,3,-1,4,10,12
cat,perth,NA,6,-4,3,0,3,NA,4
rat,Darwin,car,6,-3,10,-3,6,8,3
cat,perth,boat,7,11,1,NA,2,2,10
rabbit,sydney,NA,1,5,5,-3,2,10,4
rat,NA,walk,3,0,1,1,10,5,3
dog,brisbane,car,10,4,4,1,3,0,4
rabbit,adelaide,fly,7,-2,12,0,3,12,4
budgie,adelaide,fly,11,-9,8,3,6,2,2
rabbit,melbourne,bicycle,10,-10,1,NA,8,11,3
cat,adelaide,fly,3,10,3,-1,10,3,3
rat,sydney,fly,2,0,3,-1,0,7,7
NA,melbourne,walk,8,-1,12,-2,0,8,7
rat,sydney,walk,10,-1,8,1,7,5,3
dog,brisbane,car,10,7,7,1,10,7,11
dog,perth,bicycle,3,5,11,-3,2,0,7
dog,sydney,bicycle,11,4,1,0,12,7,0
dog,adelaide,walk,6,0,3,-2,0,12,12
rabbit,perth,boat,5,3,1,-2,1,NA,6
rabbit,NA,boat,4,9,2,3,12,3,1

I want to subset it on column A, and save the subsets as csv files named by the value in column A.

I used the code below, which correctly produces the subsets I want as elements of the list 'df_split' (typing df_split$rabbit shows this), but the CSV files it writes are named with numbers (1.csv, 2.csv, ...).

# Divide a big file into parts for each value of a variable.

# Sorting file
df <- df[ order(df$A), ]  

# Splitting file and creating character names
df_split <- split(df, df$A)
new_names <- as.character(unique(df$A))

# Writing csv files for dataframes in df_split
for (i in 1:length(df_split)) {
  assign(new_names[i], df_split[[i]])
  filename = paste(i, ".csv")
  write.csv(df_split[[i]], filename, row.names = FALSE)  
}

Is there any way I can get my csv files correctly named?

Assuming new_names and the list of data frames in df_split are in the same order, you can build the filename from the name at each index:

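for (i in seq_along(df_split)) {
    # Optional: also create a variable named after the group in the workspace
    assign(new_names[i], df_split[[i]])
    # Use the group name, not the index; paste0() adds no space before ".csv"
    filename <- paste0(new_names[i], ".csv")    # change here
    write.csv(df_split[[i]], filename, row.names = FALSE)
}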

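Currently, your loop names each CSV file after the numeric loop index i, rather than the group name stored at that index. Also note that paste() separates its arguments with a space by default, so use paste0() (or sep = "") to get rabbit.csv rather than "rabbit .csv".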
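
If you would rather not rely on new_names lining up with df_split, a possible variant is to take the names straight from the list. split() names each element after its group value and drops rows where the grouping column is NA, whereas unique(df$A) keeps NA, so the two can differ in length. The sketch below is only an illustration: the small data frame is a made-up stand-in for the question's data, and out_dir is an assumed output location.

```r
# Small stand-in for the question's data (hypothetical subset of rows).
df <- data.frame(
  A = c("rabbit", "cat", NA, "rat", "dog", "rabbit"),
  B = c("brisbane", "perth", "brisbane", "brisbane", "adelaide", "sydney"),
  D = c(8, 1, 9, 5, 0, 7),
  stringsAsFactors = FALSE
)

# split() names each list element after its group and drops rows where A is NA.
df_split <- split(df, df$A)

# out_dir is an assumption; replace it with any writable directory.
out_dir <- tempdir()

for (name in names(df_split)) {
  # paste0() concatenates without a separator, giving e.g. "rabbit.csv".
  filename <- file.path(out_dir, paste0(name, ".csv"))
  write.csv(df_split[[name]], filename, row.names = FALSE)
}
```

Looping over names(df_split) means the filename and the data always come from the same list element, so no separate sort or new_names vector is needed.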
