
Calculate and write Mean and SD for columns in R

I'm currently working with data in a .csv file. I already have a working script to plot my data, but I'm stumped by what seems to be the simplest task. I'm trying to write a script that takes my data (arranged in columns), calculates the mean of each column, and writes the result to a new document (./testAVG).

I also want to take the same data, calculate the SD of each column, and append those values to the end of the original document (preferably repeated on every row of data I have).

Here's the script I have so far:

#Number of lines with data 
Nlines = 5
#Number of lines to skip
Nskip = 0

chem <- read.table("./test.csv", skip=Nskip, sep=",", col.names = c("Sample", "SiO2", "Al2O3", "FeO", "MgO", "CaO", "Na2O", "K2O", "Total", "eSiO2", "eAl2O3", "eFeO", "eMgO", "eCaO", "eNa2O", "eK2O"), fill=TRUE, header = TRUE, nrow=Nlines)

sd1 <- sd(chem$SiO2)
sd2 <- sd(chem$Al2O3)
sd3 <- sd(chem$FeO)
sd4 <- sd(chem$MgO)
sd5 <- sd(chem$CaO)
sd6 <- sd(chem$Na2O)
sd7 <- sd(chem$K2O)

avg1 <- colMeans(chem$SiO2, na.rm = FALSE, dims=1)
avg2 <- colMeans(chem$Al2O3, na.rm = FALSE, dims=1)
avg3 <- colMeans(chem$FeO, na.rm = FALSE, dims=1)
avg4 <- colMeans(chem$MgO, na.rm = FALSE, dims=1)
avg5 <- colMeans(chem$CaO, na.rm = FALSE, dims=1)
avg6 <- colMeans(chem$Na2O, na.rm = FALSE, dims=1)
avg7 <- colMeans(chem$K2O, na.rm = FALSE, dims=1)

write <- write.table(sd1,sd2,sd3,sd4,sd5,sd6,sd7, file="./test.csv", append=TRUE, sep=",", dec=".", col.names = c("eSiO2", "eAl2O3", "eFeO", "eMgO", "eCaO", "eNa2O", "eK2O"))

write <- write.table(avg1, avg2, avg3, avg4, avg5, avg6, avg7, file="./testAVG.csv", append=FALSE, sep=",", dec=".", col.names = c("Sample", "SiO2", "Al2O3", "FeO", "MgO", "CaO", "Na2O", "K2O", "Total"))

This is the data I'm working with:

Sample, SiO2, Al2O3, FeO, MgO, CaO, Na2O, K2O, Total,eSiO2,eAl2O3,eFeO,eMgO,eCaO,eNa2O,eK2O
01,65.01,14.77,0.34,1.31,17.27,1.14,0.2,100,,,,,,,
02,72.6,16.27,0.53,0.06,1.27,5.55,3.71,100,,,,,,,
03,64.95,14.65,0.18,1.29,17.48,1.21,0.23,100,,,,,,,
04,64.95,14.65,0.18,1.29,17.48,1.21,0.23,100,,,,,,,

I get this error:

Error in colMeans(chem$SiO2, na.rm = FALSE, dims = 1) : 
  'x' must be an array of at least two dimensions

Any advice? Thanks

The comments already hint at how to do it, but you seem to be fairly new to R, so let me show explicitly how you could do it better, using the mtcars dataset:

df <- mtcars

df_sd <- apply(df, 2, sd) # this is how to use apply. See ?apply
df_avg <- colMeans(df)    # this is how to use colMeans. See ?colMeans

write.table(df_sd, file="test.csv")     # no assignment necessary.
write.table(df_avg, file="testAVG.csv") # writing the file is a desired side effect...
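Applied to the data in the question, the same idea might look like the sketch below. It is only an illustration under a few assumptions: the sample values are typed in from the question rather than read from test.csv, the helper names (oxides, avg_row, avg_file, sd_file) are made up here, and the output goes to a temporary directory so that no existing file is overwritten. The SD of each oxide is repeated on every row in the matching e-column, which is one reading of what the question asks for.

```r
# Sample data copied from the question (the empty e* columns are left out).
chem <- data.frame(
  Sample = c("01", "02", "03", "04"),
  SiO2   = c(65.01, 72.60, 64.95, 64.95),
  Al2O3  = c(14.77, 16.27, 14.65, 14.65),
  FeO    = c(0.34, 0.53, 0.18, 0.18),
  MgO    = c(1.31, 0.06, 1.29, 1.29),
  CaO    = c(17.27, 1.27, 17.48, 17.48),
  Na2O   = c(1.14, 5.55, 1.21, 1.21),
  K2O    = c(0.20, 3.71, 0.23, 0.23)
)

# Columns to summarise (hypothetical helper vector).
oxides <- c("SiO2", "Al2O3", "FeO", "MgO", "CaO", "Na2O", "K2O")

chem_avg <- colMeans(chem[, oxides])    # named vector of column means
chem_sd  <- sapply(chem[, oxides], sd)  # named vector of column SDs

# Write the means as a single row with one column per oxide.
avg_row  <- as.data.frame(t(chem_avg))
avg_file <- file.path(tempdir(), "testAVG.csv")
write.csv(avg_row, avg_file, row.names = FALSE)

# Repeat each SD on every row in a new e<oxide> column, then write the table.
for (col in oxides) {
  chem[[paste0("e", col)]] <- chem_sd[[col]]
}
sd_file <- file.path(tempdir(), "test_with_sd.csv")
write.csv(chem, sd_file, row.names = FALSE)
```

Selecting the columns by name keeps the non-numeric Sample column out of the calculation, and sapply over the data frame columns avoids the matrix conversion that apply performs.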

Moreover, please consider the following line:

avg1 <- colMeans(chem$SiO2, na.rm = FALSE, dims=1)

The nice thing about colMeans is that it computes the column-wise means of many columns at once. Here you are passing it a single vector, chem$SiO2, which has no dimensions, and that is what triggers the error. If the mean of one column is really what you want, just write:

avg1 <- mean(chem$SiO2)
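A small sketch of the difference, using a one-column data frame built from the SiO2 values in the question (the object names x, res, m1 and m2 are only illustrative):

```r
x <- data.frame(SiO2 = c(65.01, 72.60, 64.95, 64.95))

# x$SiO2 is a plain vector, so colMeans() rejects it; capture the message.
res <- tryCatch(colMeans(x$SiO2), error = function(e) conditionMessage(e))

# mean() is the right tool for a single vector.
m1 <- mean(x$SiO2)

# colMeans() works again if the column stays a one-column data frame.
m2 <- colMeans(x[, "SiO2", drop = FALSE])
```

Both m1 and m2 hold the same value; m2 additionally carries the column name.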
