Conditions in R-Studio

Here is an example of the data for my problem:

Product_id  product_type    views   inventory
1   producttype1    Y   Y
2   producttype2    N   N
3   producttype3    Y   Y
4   producttype4    N   N
5   producttype5    Y   Y
6   producttype6    N   N
7   producttype7    Y   Y
8   producttype1    N   N
9   producttype2    Y   Y
10  producttype3    N   N
11  producttype4    Y   Y
12  producttype5    N   N
13  producttype6    Y   Y
14  producttype7    N   N
15  producttype7    Y   Y

I have a population of 10 million rows, from which I am trying to extract a 10% sample, grouped by product_type and views. If the resulting sample has fewer than 500k rows, I can keep it as it is, but if it has more than 500k rows, I have to reduce it to 500k. This is the code I wrote to group the data and extract the 10% sample:

MPSSAMPLE %>% 
  group_by(product_type, views) %>%
  sample_frac(.10) -> sampledData

Can anyone help me with the conditions?

You can use min() to take the smaller of 500k and 10% of each group's size. Note that this limit is applied to each (product_type, views) group separately, not to the combined sample.

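library(dplyr)

n <- 500000  # upper limit on the number of rows drawn from each group

MPSSAMPLE %>% 
  group_by(product_type, views) %>%
  # n() is the current group's size: take 10% of it, but at most n rows
  sample_n(min(n() * 0.1, n)) -> sampledData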
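
If the 500k limit is meant for the whole sample rather than for each group, one possible approach is to lower the sampling fraction so that the total cannot exceed the limit. The following is only a minimal sketch under some assumptions: it needs dplyr 1.0.0 or later for slice_sample(), the names cap and frac are made up for illustration, and a small generated data frame with a cap of 50 stands in for the real MPSSAMPLE and 500000.

```r
library(dplyr)

# Toy stand-in for MPSSAMPLE (hypothetical data, 1000 rows)
set.seed(42)
MPSSAMPLE <- data.frame(
  Product_id   = 1:1000,
  product_type = sample(paste0("producttype", 1:7), 1000, replace = TRUE),
  views        = sample(c("Y", "N"), 1000, replace = TRUE)
)

cap <- 50  # stands in for 500000 on the real data

# Use 10% if that stays within the cap, otherwise the fraction that hits the cap
frac <- min(0.10, cap / nrow(MPSSAMPLE))

sampledData <- MPSSAMPLE %>%
  group_by(product_type, views) %>%
  slice_sample(prop = frac) %>%  # same fraction from every group, rounded down
  ungroup()

nrow(sampledData) <= cap
```

Because slice_sample() rounds each group's size down, the combined sample can end up slightly below the cap, and every group keeps the same share it had in the population.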
