I want to subset a dataframe using a loop and save as csv files named as the group value

Question

I am new to R and can't find an answer to this exact question. I have a dataframe (df), read in from a csv file, that looks like this:

A,B,C,D,E,F,G,H,I,J
rabbit,brisbane,NA,8,3,2,2,6,2,10
cat,perth,NA,1,8,10,-3,3,5,7
NA,brisbane,bicycle,9,-2,7,-3,7,5,2
rat,brisbane,NA,5,-10,6,1,12,9,9
budgie,melbourne,NA,5,6,3,2,6,7,8
NA,melbourne,bicycle,11,9,0,-1,3,0,7
dog,adelaide,car,0,-4,10,3,7,4,1
rabbit,Canberra,car,5,7,10,-3,5,11,8
dog,brisbane,car,10,-10,6,3,8,0,4
rabbit,brisbane,boat,0,-3,5,2,9,3,3
rabbit,sydney,walk,7,-6,3,-1,4,10,12
cat,perth,NA,6,-4,3,0,3,NA,4
rat,Darwin,car,6,-3,10,-3,6,8,3
cat,perth,boat,7,11,1,NA,2,2,10
rabbit,sydney,NA,1,5,5,-3,2,10,4
rat,NA,walk,3,0,1,1,10,5,3
dog,brisbane,car,10,4,4,1,3,0,4
rabbit,adelaide,fly,7,-2,12,0,3,12,4
budgie,adelaide,fly,11,-9,8,3,6,2,2
rabbit,melbourne,bicycle,10,-10,1,NA,8,11,3
cat,adelaide,fly,3,10,3,-1,10,3,3
rat,sydney,fly,2,0,3,-1,0,7,7
NA,melbourne,walk,8,-1,12,-2,0,8,7
rat,sydney,walk,10,-1,8,1,7,5,3
dog,brisbane,car,10,7,7,1,10,7,11
dog,perth,bicycle,3,5,11,-3,2,0,7
dog,sydney,bicycle,11,4,1,0,12,7,0
dog,adelaide,walk,6,0,3,-2,0,12,12
rabbit,perth,boat,5,3,1,-2,1,NA,6
rabbit,NA,boat,4,9,2,3,12,3,1

I want to subset it on column A, and save the subsets as csv files named by the value in column A.

I used this bit of code, which correctly produced the subsets I want as members of the list 'df_split' (typing df_split$rabbit shows this), but the csv files produced were named as numbers (1.csv, 2.csv, ...) rather than by the value in column A.

# Divide a big file into parts for each value of a variable.

# Sorting file
df <- df[ order(df$A), ]  

# Splitting file and creating character names
df_split <- split(df, df$A)
new_names <- as.character(unique(df$A))

# Writing csv files for dataframes in df_split
for (i in 1:length(df_split)) {
  assign(new_names[i], df_split[[i]])
  filename = paste(i, ".csv")
  write.csv(df_split[[i]], filename, row.names = FALSE)  
}

Is there any way I can get my csv files correctly named?

Answer 1

Assuming the new_names vector and the list of data frames are in the same order, you can build the filename from the group name instead of the loop index:

for (i in seq_along(df_split)) {
    assign(new_names[i], df_split[[i]])
    # paste0() joins without a separator; paste() would give "rabbit .csv" with a space
    filename <- paste0(new_names[i], ".csv")    # change here
    write.csv(df_split[[i]], filename, row.names = FALSE)
}

Currently, you are using the numeric loop index to name the CSV file, rather than the name stored at that index.
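If you would rather not depend on new_names lining up with df_split, one option is to take the names straight from the list, since split() labels each element with its group value. The following is only a minimal sketch: the small data frame is a stand-in for the one in the question, and out_dir is a hypothetical output folder (a temporary directory here). Note that split() drops rows where the grouping column is NA, so those rows are not written to any file.

```r
# Small stand-in for the question's data frame (only columns A, B and D are kept).
df <- data.frame(
  A = c("rabbit", "cat", NA, "rat", "rabbit"),
  B = c("brisbane", "perth", "brisbane", "brisbane", "sydney"),
  D = c(8, 1, 9, 5, 7),
  stringsAsFactors = FALSE
)

# split() names each list element after its group value; rows with NA in A are dropped.
df_split <- split(df, df$A)

# out_dir is a hypothetical output folder; a temporary directory is used for this sketch.
out_dir <- tempdir()

# Loop over the group names rather than numeric positions.
for (group in names(df_split)) {
  # paste0() joins without a separator, giving "rabbit.csv" rather than "rabbit .csv".
  filename <- file.path(out_dir, paste0(group, ".csv"))
  write.csv(df_split[[group]], filename, row.names = FALSE)
}
```

Because the filename and the data both come from the same list element, the two cannot get out of step, and the assign() call is no longer needed unless you also want each subset as a separate object in your workspace.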
