Selecting rows of a data frame in R
Consider the following data set.
id var1 var2
1 A 33
2 B 23
3 A 45
4 A 55
5 B 22
6 A 33
7 B 90
8 A 78
9 B 12
10 A 11
I want to take random samples of rows of size 5 and 4 from categories A and B, respectively. Can anyone help me, please? Thanks!
You can use sample:
sample_1 <- sample(df[df$var1 == "A", ]$var2, 5)
sample_2 <- sample(df[df$var1 == "B", ]$var2, 4)
Use replace=TRUE for sampling with replacement.
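Note that the two calls above return values of var2 rather than whole rows. If whole rows are wanted, one possible variation is to sample row positions within each category and index the data frame with them. This is only a sketch: the names rows_a, rows_b and sampled and the seed value are made up for illustration, and the data frame is rebuilt here so the snippet runs on its own.

```r
# Example data, copied from the question
df <- data.frame(
  id   = 1:10,
  var1 = c("A", "B", "A", "A", "B", "A", "B", "A", "B", "A"),
  var2 = c(33L, 23L, 45L, 55L, 22L, 33L, 90L, 78L, 12L, 11L)
)

set.seed(1)  # arbitrary seed (an assumption), only so the draw is repeatable

# Sample row positions within each category, then index the data frame
rows_a <- df[sample(which(df$var1 == "A"), 5), ]
rows_b <- df[sample(which(df$var1 == "B"), 4), ]

# Stack both samples into a single data frame
sampled <- rbind(rows_a, rows_b)
```

One caveat: if a category had only a single row, sample() would treat that one position x as the range 1:x, so that case would need separate handling.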
Data
df <- read.table(text="id var1 var2
1 A 33
2 B 23
3 A 45
4 A 55
5 B 22
6 A 33
7 B 90
8 A 78
9 B 12
10 A 11", header=TRUE)
An option is to split the dataset and use sample in Map:
# split() gives one data frame per category (A, then B); Map pairs each with its sample size
do.call(rbind, Map(function(dat, y)
  dat[sample(seq_len(nrow(dat)), size = y), ],
  split(df, df$var1), c(5, 4)))
df <- structure(list(id = 1:10, var1 = c("A", "B", "A", "A", "B", "A",
"B", "A", "B", "A"), var2 = c(33L, 23L, 45L, 55L, 22L, 33L, 90L,
78L, 12L, 11L)), class = "data.frame", row.names = c(NA, -10L
))
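The Map call above pairs the sizes c(5, 4) with the groups by position, which works because split returns the groups in the order A, B. A possible variation, sketched here under the assumption that a named lookup is preferred, ties each size to its category by name. The names sizes, groups, picked and result are hypothetical and not part of the original answer.

```r
# Example data, same values as in the question
df <- data.frame(
  id   = 1:10,
  var1 = c("A", "B", "A", "A", "B", "A", "B", "A", "B", "A"),
  var2 = c(33L, 23L, 45L, 55L, 22L, 33L, 90L, 78L, 12L, 11L)
)

# Desired sample size per category, keyed by category name
sizes <- c(A = 5, B = 4)

# split() returns a list of data frames named after the values of var1
groups <- split(df, df$var1)

# Look up each group's size by name instead of relying on positional order
picked <- lapply(names(groups), function(g) {
  dat <- groups[[g]]
  dat[sample(seq_len(nrow(dat)), size = sizes[[g]]), ]
})

# Bind the per-category samples back into one data frame
result <- do.call(rbind, picked)
```

With this layout, adding another category only requires another named entry in sizes.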