
Split R Dataframe into many new dataframes by columns

I want to split my data frame into multiple data frames based on columns. There are ~37 columns: the first 2 are identifiers (Lab and sample number), and the remaining columns hold elemental/analyte information. I want a new data frame for each element/analyte. The first 2 columns would be the same in each new data frame, and the element/analyte name would be used as the name of the new data frame.

The first 10 rows of the data frame are provided below. There is a lot of missing data, which I can deal with separately.

structure(list(Lab = c("I", "I", "K", "K", "M", "M", "O", "O", 
"P", "P"), Sample.Num = c("1", "2", "1", "2", "1", "2", "1", 
"2", "1", "2"), FeWet = c("", "", "", "", "", "", "62.16 ", "62.19 ", 
"", ""), FeXRF = c("62.54 ", "62.53 ", "62.91 ", "63.20 ", "62.42 ", 
"62.32 ", "", "", "", ""), FeCalc = c("62.53 ", "62.52 ", "62.42 ", 
"62.44 ", "62.40 ", "62.31 ", "62.40 ", "62.34 ", "", ""), Fe.II. = c("", 
"", "", "", "", "", "", "", "", ""), SiO2 = c("3.631 ", "3.627 ", 
"3.731 ", "3.739 ", "3.656 ", "3.669 ", "3.618 ", "3.643 ", "", 
""), CaO = c("0.056 ", "0.053 ", "0.043 ", "0.043 ", "0.053 ", 
"0.053 ", "0.047 ", "0.047 ", "", ""), Mn = c("0.4160 ", "0.4140 ", 
"0.4098 ", "0.4105 ", "0.4005 ", "0.4022 ", "0.421 ", "0.417 ", 
"", ""), Al2O3 = c("2.127 ", "2.134 ", "2.135 ", "2.118 ", "2.133 ", 
"2.140 ", "2.083 ", "2.116 ", "", ""), TiO2 = c("0.103 ", "0.102 ", 
"0.106 ", "0.106 ", "0.101 ", "0.101 ", "0.102 ", "0.103 ", "", 
""), MgO = c("0.074 ", "0.081 ", "0.087 ", "0.083 ", "0.077 ", 
"0.081 ", "0.082 ", "0.088 ", "", ""), P = c("0.0640 ", "0.0630 ", 
"0.0592 ", "0.0592 ", "0.0643 ", "0.0642 ", "0.063 ", "0.063 ", 
"", ""), S = c("0.022 ", "0.023 ", "0.021 ", "0.018 ", "0.020 ", 
"0.021 ", "0.020 ", "0.021 ", "", ""), K2O = c("0.0410 ", "0.0410 ", 
"0.0409 ", "0.0410 ", "0.0427 ", "0.0436 ", "0.044 ", "0.043 ", 
"", ""), Sn = c("", "", "0.0007 ", "0.0018 ", "0.0008 ", "0.0027 ", 
"", "", "", ""), V = c("", "", "0.0207 ", "0.0212 ", "0.0200 ", 
"0.0201 ", "", "", "", ""), Cr = c("", "", "0.0045 ", "0.0050 ", 
"0.0034 ", "0.0048 ", "0.0033 ", "0.0020 ", "", ""), Co = c("", 
"", "-0.0160 ", "-0.0167 ", "0.0027 ", "0.0030 ", "", "", "", 
""), Ni = c("", "", "0.0003 ", "0.0004 ", "0.0022 ", "0.0029 ", 
"", "", "", ""), Cu = c("", "", "0.0013 ", "0.0011 ", "0.0008 ", 
"0.0008 ", "", "", "", ""), Zn = c("", "", "0.0040 ", "0.0038 ", 
"0.0029 ", "0.0029 ", "", "", "", ""), As = c("", "", "0.0015 ", 
"0.0016 ", "0.0024 ", "0.0027 ", "", "", "", ""), Pb = c("", 
"", "0.0026 ", "0.0027 ", "0.0029 ", "0.0046 ", "", "", "", ""
), Ba = c("", "", "0.0132 ", "0.0161 ", "0.0119 ", "0.0143 ", 
"", "", "", ""), LOI.371 = c("", "", "", "", "", "", "", "", 
"", ""), LOI.425 = c("", "", "", "", "2.95 ", "3.05 ", "", "", 
"", ""), LOI.650 = c("", "", "", "", "3.50 ", "3.60 ", "", "", 
"", ""), LOI.1000 = c("3.78 ", "3.79 ", "", "", "", "", "3.91 ", 
"3.93 ", "", ""), LOI.1000.7764. = c("", "", "", "", "", "", 
"", "", "", ""), LOI.1000.2596. = c("", "", "3.78 ", "3.77 ", 
"3.76 ", "3.85 ", "", "", "", ""), Na2O = c("", "", "0.0121 ", 
"0.0115 ", "0.0418 ", "0.0419 ", "0.1160 ", "0.1126 ", "", ""
), Cl = c("", "", "0.0095 ", "0.0081 ", "0.0154 ", "0.0151 ", 
"", "", "", ""), Sr = c("", "", "", "", "0.0621 ", "0.0628 ", 
"", "", "", ""), Zr = c("", "", "", "", "(0.000)", "(0.000)", 
"", "", "", ""), Ctot = c("", "", "", "", "", "", "", "", "", 
""), Stot = c("", "", "", "", "", "", "", "", "", "")), row.names = c(NA, 
10L), class = "data.frame")

I extracted the column names that I want for the new data frames, thinking I could use them in a for loop over the data frame, but I wasn't sure how to assign each new data frame.

newdfList <- colnames(df[,-c(1:2)])

I got lost as to what to do next, so I tried the following:

for (i in 3:NCOL(df)){
  i <- data.frame(df[,c(1:2,(i))])
}

I thought of subsetting and splitting options, but I do not know how to iterate over each desired column to get a new data frame.

Thanks

Using lapply, you can create a list of data frames.

result <- lapply(3:ncol(df), function(x) df[, c(1:2, x)])
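For comparison, the same list can be built with an explicit for loop, which is closer to the attempt in the question. That attempt overwrote the loop variable i on each pass instead of storing the result anywhere. The sketch below uses a small toy data frame with made-up values (only four of the question's columns) and stores each piece in a named list; the names analytes and result are just illustrative.

```r
# Toy stand-in for the question's df (values are illustrative only)
df <- data.frame(
  Lab        = c("I", "I", "K"),
  Sample.Num = c("1", "2", "1"),
  FeXRF      = c("62.54 ", "62.53 ", "62.91 "),
  SiO2       = c("3.631 ", "3.627 ", "3.731 "),
  stringsAsFactors = FALSE
)

# Names of the analyte columns (everything except the two identifiers)
analytes <- names(df)[-(1:2)]

# Pre-allocate a named list, then fill one element per analyte
result <- vector("list", length(analytes))
names(result) <- analytes

for (nm in analytes) {
  # Keep the two identifier columns plus the current analyte column
  result[[nm]] <- df[, c("Lab", "Sample.Num", nm)]
}

# Show the top-level structure of the list
str(result, max.level = 1)
```

Looping over column names rather than positions means each list element is stored under its analyte name directly.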

If you need separate data frames, assign names to the list and use list2env.

names(result) <- names(df)[3:ncol(df)]
list2env(result, .GlobalEnv)
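Writing ~35 objects into .GlobalEnv can clutter the workspace, so one possible variation is to pass list2env a fresh environment instead. The sketch below assumes a toy data frame with made-up values; analyte_env is a hypothetical name.

```r
# Toy stand-in for the question's df (values are illustrative only)
df <- data.frame(
  Lab        = c("I", "I", "K"),
  Sample.Num = c("1", "2", "1"),
  FeXRF      = c("62.54 ", "62.53 ", "62.91 "),
  SiO2       = c("3.631 ", "3.627 ", "3.731 "),
  stringsAsFactors = FALSE
)

# Build the named list as in the answer
result <- lapply(3:ncol(df), function(x) df[, c(1:2, x)])
names(result) <- names(df)[3:ncol(df)]

# Copy each list element into a new environment rather than .GlobalEnv;
# list2env() returns the environment it filled
analyte_env <- list2env(result, envir = new.env())

# List the objects created, then access one of them by name
ls(analyte_env)
analyte_env$SiO2
```

In many cases keeping the data frames in the list is easier still, since the whole set can then be processed with another lapply call.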

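We can use split.default, which splits a data frame by columns when given one grouping value per column; cbind then adds the two identifier columns back to each piece.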

result <- lapply(split.default(df[-(1:2)], 3:ncol(df)), function(x) cbind(df[1:2], x) )
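One thing to be aware of with this approach is that the list elements are named after the grouping values ("3", "4", ...) rather than the analytes. A minimal sketch of renaming them afterwards, using a toy data frame with made-up values (pieces is a hypothetical intermediate name):

```r
# Toy stand-in for the question's df (values are illustrative only)
df <- data.frame(
  Lab        = c("I", "I", "K"),
  Sample.Num = c("1", "2", "1"),
  FeXRF      = c("62.54 ", "62.53 ", "62.91 "),
  SiO2       = c("3.631 ", "3.627 ", "3.731 "),
  stringsAsFactors = FALSE
)

# Split the analyte columns into one-column data frames;
# the list is named after the grouping values, here "3" and "4"
pieces <- split.default(df[-(1:2)], 3:ncol(df))
names(pieces)

# Add the identifier columns back to each piece
result <- lapply(pieces, function(x) cbind(df[1:2], x))

# Integer grouping values keep the original column order,
# so the analyte names can be assigned positionally
names(result) <- names(df)[-(1:2)]
names(result)
```

Using the column names themselves as the grouping values would name the pieces directly, but split orders its groups by sorted factor level, so the list would come back in alphabetical rather than column order.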
