Random row selection in R

I have this data frame:

id <- c(1,1,1,2,2,3)
name <- c("A","A","A","B","B","C")
value <- c(7:12)
df<- data.frame(id=id, name=name, value=value)
df

This function selects n random rows from it:

randomRows = function(df,n){
  return(df[sample(nrow(df),n),])
}

For example:

randomRows(df,1)

But I want to randomly select one row per 'name' (or per 'id', which amounts to the same grouping) and collect those rows into a new table, so in this case three rows. This has to work on a data frame with 2000+ rows. How can I do this?

I think you can do this with the plyr package:

library("plyr")
ddply(df,.(name),randomRows,1)

which gives you, for example:

  id name value
1  1    A     8
2  2    B    11
3  3    C    12

Is this what you are looking for?
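
plyr has since been retired in favor of dplyr, so a similar grouped sample can be sketched with dplyr instead. This assumes dplyr 1.0.0 or later is installed (that is where slice_sample() was introduced); the set.seed() call is only there to make the draw repeatable and is not part of the original answer.

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(id    = c(1, 1, 1, 2, 2, 3),
                 name  = c("A", "A", "A", "B", "B", "C"),
                 value = 7:12)

set.seed(1)  # assumption: fixed seed, only for reproducibility

# Group by name, draw one random row from each group, then drop the grouping
df.sampled <- df %>%
  group_by(name) %>%
  slice_sample(n = 1) %>%
  ungroup()

df.sampled
```

The result is a tibble with one row per name; wrap it in as.data.frame() if a plain data frame is needed.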

Here's one way of doing it in base R.

> df.split <- split(df, df$name)
> df.sample <- lapply(df.split, randomRows, 1)
> df.final <- do.call("rbind", df.sample)
> df.final
  id name value
A  1    A     7
B  2    B    11
C  3    C    12
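
A variation on the same split-and-sample idea, as a sketch: instead of splitting the whole data frame, split only the row indices, pick one index per group, and subset once. This avoids the repeated rbind and may scale a little better on larger data frames. The set.seed() call and the name pick are assumptions added for illustration.

```r
# Same example data as in the question
df <- data.frame(id    = c(1, 1, 1, 2, 2, 3),
                 name  = c("A", "A", "A", "B", "B", "C"),
                 value = 7:12)

set.seed(42)  # assumption: fixed seed, only for reproducibility

# Split the row indices by name and pick one index from each group.
# idx[sample.int(length(idx), 1)] is used rather than sample(idx, 1),
# because sample(x, 1) with a single number x draws from 1:x.
pick <- vapply(split(seq_len(nrow(df)), df$name),
               function(idx) idx[sample.int(length(idx), 1)],
               integer(1))

# Subset the data frame once with the chosen row indices
df.final <- df[pick, ]
rownames(df.final) <- NULL
df.final
```

Each run without the seed returns a different row for groups that have more than one row; group C always returns its only row.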
