
How to choose a range of columns in R

I have some data and just want to calculate mean, sd, var and so on. My problem is not the functions but the columns: I just can't figure out how to select them.

The first column contains the names of the animals, and columns 2 to 11 contain my numeric data. The column names are X1 to X10. I have lots of NA in my data.

I can easily calculate it for a single column, but when I combine several columns I always get

Argument is not numeric or logical: returning NA

For example, for the mean of one column I tried this, and it worked:

mean(WLD1$X1, na.rm=TRUE)

For columns 2 to 11 I tried:

mean(WLD1[,c(2:11)], na.rm=TRUE)

also tried:

lapply(WLD1[,2:11], mean, na.rm=TRUE)

I also tried it with X1:X10.
I guess it's pretty simple, but I'm just stuck on it. Really thankful for any help.

You may want to use the apply function. It takes a function and applies it to each row or each column of a data frame or matrix. The MARGIN= parameter sets the direction: MARGIN=1 feeds one row at a time into the function, and MARGIN=2 feeds one column at a time. The FUN= parameter is the function to apply. Since you want the mean, sd and var of columns 2 to 11, you need three statements, each with MARGIN=2 and a different FUN=. Below is the code.

# na.rm=TRUE is passed on to FUN so that NA values are skipped
Mean_of_2_to_11_Column <- apply(WLD1[,2:11], MARGIN=2, FUN=mean, na.rm=TRUE)
SD_of_2_to_11_Column <- apply(WLD1[,2:11], MARGIN=2, FUN=sd, na.rm=TRUE)
Var_of_2_to_11_Column <- apply(WLD1[,2:11], MARGIN=2, FUN=var, na.rm=TRUE)
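
As a minimal sketch of the same calls on a small made-up data frame (the animal names and values are invented for illustration, and only three of the ten X columns are shown), passing na.rm = TRUE through apply skips the missing values. sapply is shown as an alternative that works on the columns directly, without first converting the data to a matrix.

```r
# Hypothetical stand-in for WLD1: one name column plus numeric columns with NAs
WLD1 <- data.frame(
  animal = c("cat", "dog", "fox", "owl"),
  X1 = c(1, 2, NA, 3),
  X2 = c(NA, 4, 6, 8),
  X3 = c(10, NA, NA, 20)
)

num_cols <- WLD1[, 2:4]   # this would be 2:11 in the real data

# mean() on a whole data frame warns and returns NA, so go column by column
col_means <- apply(num_cols, MARGIN = 2, FUN = mean, na.rm = TRUE)
col_sds   <- apply(num_cols, MARGIN = 2, FUN = sd,   na.rm = TRUE)
col_vars  <- sapply(num_cols, var, na.rm = TRUE)

print(col_means)
```

If a column in the selected range is not numeric (for example, numbers read in as text), apply converts the whole selection to a character matrix and the same warning comes back, so it may be worth checking the column types with str(WLD1) first.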

Let me know if anything I said here is not clear to you. All the best.

You may use the purrr package.

library(purrr)
mydatabase %>% map_if(is.numeric, function(x) mean(x, na.rm = TRUE))

This will calculate the mean of every numeric column of your data frame while ignoring NA values. The result is a list, and non-numeric columns (such as the animal names) are returned unchanged.
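
For comparison, here is a base-R sketch of the same idea that needs no extra package. The data frame mydatabase below is made up for illustration. Filter(is.numeric, ...) keeps only the numeric columns, so the name column is dropped rather than passed through.

```r
# Hypothetical data: a character column plus two numeric columns with NAs
mydatabase <- data.frame(
  animal = c("cat", "dog", "fox"),
  X1 = c(1, NA, 3),
  X2 = c(2, 4, NA),
  stringsAsFactors = FALSE
)

# Keep only the numeric columns, then take the mean of each, ignoring NAs
numeric_part <- Filter(is.numeric, mydatabase)
means <- vapply(numeric_part, mean, numeric(1), na.rm = TRUE)

# First class of every column; anything other than "numeric" or "integer"
# here would be skipped by the filter above
classes <- vapply(mydatabase, function(x) class(x)[1], character(1))

print(means)
print(classes)
```

vapply is used instead of sapply so that the result is guaranteed to be a named numeric vector with one value per column.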
