
How to divide a data frame into new data frames (data1, data2, data3, and so on) so I can analyse each of them (e.g. with a t-test)

I have just started learning R for data analysis. Here is my problem.

[Image]

I want to analyse the difference in body weight (BW) between males and females in different species. (For example, that in Sorex gracilliums male and female body weight differ significantly. This is just an example; I don't know the answer. :)) At first I thought I could divide the data by species into several groups. (This can be done in Excel, but I have too many files, so I think R is the better choice.) Then I could use some simple code to test for a sex difference. But I don't know how to divide the data or how to make the new data frames. I tried group_split. It did split the data, but only into a list of many tibbles, as the image shows:

[Image]

What should I do? Or is there a better way to test the difference?

English is not my first language, so there may be many grammar mistakes, but I would really appreciate your help!

Assuming your data is in a data.frame called df, with columns NO, SPECIES, SEX, BW:

set.seed(100)
df = data.frame(NO = 1:100,
                SPECIES = sample(LETTERS[1:4], 100, replace = TRUE),
                SEX = sample(c("M", "F"), 100, replace = TRUE),
                BW = rnorm(100, 80, 2)
)

We then give species D a sex effect by adding 5 to the body weight of its males:

df$BW[df$SPECIES=="D" & df$SEX=="M"] = df$BW[df$SPECIES=="D" & df$SEX=="M"] + 5

To run the test on a single species, say species A, subset the data first:

dat = subset(df,SPECIES=="A")
t.test(BW ~ SEX,data=dat)

The t.test output for species A includes the test statistic, the p-value and the confidence interval. To do this systematically for every species, we can use dplyr and broom:

library(dplyr)
library(broom)

df %>% group_by(SPECIES) %>% do(tidy(t.test(BW ~ SEX,data=.)))

# A tibble: 4 x 11
# Groups:   SPECIES [4]
  SPECIES estimate estimate1 estimate2 statistic p.value parameter conf.low
  <fct>      <dbl>     <dbl>     <dbl>     <dbl>   <dbl>     <dbl>    <dbl>
1 A          0.883      80.4      79.6     0.936 3.65e-1      14.2   -1.14 
2 B          0.259      80.2      79.9     0.377 7.12e-1      14.1   -1.21 
3 C          0.170      80.1      79.9     0.359 7.23e-1      25.3   -0.807
4 D         -5.55       79.7      85.2    -7.71  1.29e-7      21.4   -7.05 

If you don't want to install any packages, this will give you all the test results:

by(df, df$SPECIES, function(x)t.test(BW ~ SEX,data=x))

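To combine them into one data.frame:

func = function(x) {
  # Welch t-test of body weight by sex for the rows of one species
  Nu = t.test(BW ~ SEX, data = x)
  # Keep the two group means and the p-value as a one-row data.frame
  data.frame(estimate_1 = Nu$estimate[1], estimate_2 = Nu$estimate[2], p = Nu$p.value)
}
# Apply func to each species and stack the rows
do.call(rbind, by(df, df$SPECIES, func))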
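The helper above pulls individual fields out of the object that t.test returns. As a minimal sketch of what those fields are, the block below rebuilds the example data so it runs on its own and inspects the result for species D. The names `is_d_male` and `res` are illustrative and do not come from the answer.

```r
# Rebuild the example data so this block is self-contained
set.seed(100)
df <- data.frame(NO = 1:100,
                 SPECIES = sample(LETTERS[1:4], 100, replace = TRUE),
                 SEX = sample(c("M", "F"), 100, replace = TRUE),
                 BW = rnorm(100, 80, 2))

# Add the artificial sex effect to the males of species D
is_d_male <- df$SPECIES == "D" & df$SEX == "M"
df$BW[is_d_male] <- df$BW[is_d_male] + 5

# t.test returns a list-like object of class "htest"
res <- t.test(BW ~ SEX, data = subset(df, SPECIES == "D"))

names(res)     # all components stored in the result
res$estimate   # mean BW in group F and in group M
res$p.value    # p-value of the two-sample test
res$conf.int   # confidence interval for the difference in means
```

Any of these components can be added as a column in the data.frame that func returns.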

Here is an example of building several data frames from one. The example data set iris is a table of measurements for 3 species.

First, store all the species names from your data frame in a vector, nspe. Then create a list, spe, of the same length.

The for loop goes through each element of this list and fills it with a data frame containing only that species.

At the end of this script, I compute the mean petal width of the setosa species. If the data had a two-level grouping variable for this species, I could run a two-sample t.test as well. I ran a one-sample test here, but it is not really useful.

data("iris")
summary(iris)

nspe <- as.vector(unique(iris$Species))

# Empty list with one named slot per species
spe <- vector("list", length(nspe))
names(spe) <- nspe

# Fill each slot with the rows belonging to that species
for (i in nspe) {
  spe[[i]] <- iris[iris$Species == i, ]
}

mean(spe$setosa$Petal.Width)
# [1] 0.246
t.test(spe$setosa$Petal.Width)

Below is an example showing how you can run a t.test on one species. Note that species names containing spaces are awkward to use with $, so I think it's easier to give each species a short ID than to keep the full names.

In future questions, consider providing a small example dataset rather than pictures. It makes it easier to help you.

# NOT RUN
t.test(
  spe$Sorex_gracilliums$BW[which(spe$Sorex_gracilliums$SEX == 'm')],
  spe$Sorex_gracilliums$BW[which(spe$Sorex_gracilliums$SEX == 'f')]
)
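On the point about spaces in species names: if you do keep the full names, list elements can still be reached with `[[ ]]` or with backticks after `$`. The sketch below uses a hypothetical toy data frame. The name `toy`, the label "Species two" and all BW values are made up purely for illustration and are not from the question's data.

```r
# Hypothetical toy data; the BW values are invented for illustration only
toy <- data.frame(
  SPECIES = rep(c("Sorex gracilliums", "Species two"), each = 6),
  SEX = rep(c("m", "f"), times = 6),
  BW = c(4.1, 3.8, 4.3, 3.9, 4.0, 3.7, 5.2, 5.0, 5.5, 4.9, 5.3, 5.1)
)

# Named list with one data frame per species
spe <- split(toy, toy$SPECIES)

# [[ ]] accepts names that contain spaces
one <- spe[["Sorex gracilliums"]]
# Backticks do the same with $: spe$`Sorex gracilliums`

# The formula interface avoids subsetting BW by hand for each sex
t.test(BW ~ SEX, data = one)
```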
