
What is the equivalent of survey::svymean(~interaction()) using the srvyr package?

I need some help analyzing survey data.

Here is my code. First, the data prep:

library(survey)
library(srvyr)
data(api)

dclus2 <- apiclus1 %>%
  as_survey_design(dnum, weights = pw, fpc = fpc)

These two snippets give me the same result.

One uses the survey package:

#Code
survey::svymean(~awards, dclus2)

#Results
             mean    SE
awardsNo  0.28962 0.033
awardsYes 0.71038 0.033

The other uses the srvyr package:

#Code
dclus2 %>%
  group_by(awards) %>%
  summarise(m = survey_mean())

#Results
awards    m            m_se
No     0.2896175    0.0330183       
Yes    0.7103825    0.0330183

I would like to get the survey mean of the variable "awards" (levels No and Yes) broken down by the variable "stype".

In the survey package, interaction() is used for this, e.g. svymean(~interaction(awards, stype), dclus2). How do I get the same result using the srvyr package?

Thank you for your help

How do I get the result below using the srvyr package?

#Code
svymean(~interaction(awards,stype), dclus2)

#Results
                                    mean     SE
interaction(awards, stype)No.E  0.180328 0.0250
interaction(awards, stype)Yes.E 0.606557 0.0428
interaction(awards, stype)No.H  0.043716 0.0179
interaction(awards, stype)Yes.H 0.032787 0.0168
interaction(awards, stype)No.M  0.065574 0.0230
interaction(awards, stype)Yes.M 0.071038 0.0203

You can simply imitate what survey does: create a new variable by concatenating the values of each of the component variables. That's all that the interaction() function is doing for svymean().

library(survey)
library(srvyr)

data(api)

# Set up design object
dclus2 <- apiclus1 %>%
  as_survey_design(dnum, weights = pw, fpc = fpc)

# Create 'interaction' variable
dclus2 %>%
  mutate(awards_stype = paste(awards, stype, sep = " - ")) %>%
  group_by(awards_stype) %>%
  summarize(
    prop = survey_mean()
  )
#> # A tibble: 6 x 3
#>   awards_stype   prop prop_se
#>   <chr>         <dbl>   <dbl>
#> 1 No - E       0.180   0.0250
#> 2 No - H       0.0437  0.0179
#> 3 No - M       0.0656  0.0230
#> 4 Yes - E      0.607   0.0428
#> 5 Yes - H      0.0328  0.0168
#> 6 Yes - M      0.0710  0.0203
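As a variation on the paste() approach above, base R's interaction() can also be called inside mutate(). This is only a sketch; the column name awards_stype and the object name joint are illustrative. With the default settings, the factor levels come out as "No.E", "Yes.E", and so on, which should line up with the row order of the svymean() output in the question.

```r
library(survey)
library(srvyr)

data(api)

# Same design object as above
dclus2 <- apiclus1 %>%
  as_survey_design(dnum, weights = pw, fpc = fpc)

# Build the combined factor with base R's interaction() instead of paste().
# 'awards_stype' and 'joint' are illustrative names, not part of either package.
joint <- dclus2 %>%
  mutate(awards_stype = interaction(awards, stype)) %>%
  group_by(awards_stype) %>%
  summarize(prop = survey_mean())

joint
```

Because the grouping variable is a factor here rather than a character string, the rows are ordered by factor level instead of alphabetically.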

To get the various component variables split back into separate columns, you can use the separate() function from the tidyr package.

# Separate the columns afterwards
dclus2 %>%
  mutate(awards_stype = paste(awards, stype, sep = " - ")) %>%
  group_by(awards_stype) %>%
  summarize(
    prop = survey_mean()
  ) %>%
  tidyr::separate(col = "awards_stype",
                  into = c("awards", "stype"),
                  sep = " - ")
#> # A tibble: 6 x 4
#>   awards stype   prop prop_se
#>   <chr>  <chr>  <dbl>   <dbl>
#> 1 No     E     0.180   0.0250
#> 2 No     H     0.0437  0.0179
#> 3 No     M     0.0656  0.0230
#> 4 Yes    E     0.607   0.0428
#> 5 Yes    H     0.0328  0.0168
#> 6 Yes    M     0.0710  0.0203
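It may help to see why a single combined variable is needed at all. As far as I understand srvyr's behavior, grouping by both variables directly makes survey_mean() estimate the proportion of the last grouping variable within the others, so the result is conditional rather than joint. The sketch below (the object name conditional is illustrative) checks this by summing the proportions within each awards level.

```r
library(survey)
library(srvyr)

data(api)

dclus2 <- apiclus1 %>%
  as_survey_design(dnum, weights = pw, fpc = fpc)

# Grouping by both variables: proportion of each stype *within* an awards level,
# not the proportion of the awards-by-stype combination in the whole population
conditional <- dclus2 %>%
  group_by(awards, stype) %>%
  summarize(prop = survey_mean())

# Sum of the proportions within each awards level
tapply(conditional$prop, conditional$awards, sum)
```

If the sums within each awards level are 1, the estimates are conditional on awards, which is not what svymean(~interaction(awards, stype), dclus2) returns.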

Created on 2021-03-30 by the reprex package (v1.0.0)
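Another option that may be worth trying, assuming a reasonably recent srvyr (I believe interact() was added in version 1.0.0, so treat the version requirement as an assumption): wrapping the grouping variables in interact() asks for proportions over the combination of the variables while keeping them as separate columns in the result. The object name joint is illustrative.

```r
library(survey)
library(srvyr)

data(api)

dclus2 <- apiclus1 %>%
  as_survey_design(dnum, weights = pw, fpc = fpc)

# Assumes srvyr provides interact() (believed to be available from 1.0.0).
# The proportions are taken over the awards-by-stype combinations jointly.
joint <- dclus2 %>%
  group_by(interact(awards, stype)) %>%
  summarize(prop = survey_mean())

joint
```

If this works in your version, it avoids both the paste() step and the later tidyr::separate() call.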
