Frequency tables with weighted data in R
I need to compute the frequency of individuals by age and marital status, so normally I would use:
table(age, marital_status)
However, each individual carries a different weight after sampling. How can I incorporate these weights into my frequency table?
You can use the function svytable from the package survey, or wtd.table from the package rgrs.
EDIT: rgrs is now called questionr:
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))
library(questionr)
wtd.table(x = df$var, weights = df$wt)
# A B
# 40 60
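The svytable route mentioned above has no example in this answer, so here is a minimal sketch of how it might look on the same toy data. It assumes a simple design with no clustering or strata (`ids = ~1`), which may not match a real survey; the object names `des` and `tab` are only illustrative.

```r
library(survey)

# Same toy data as above
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))

# Assumption: no clusters or strata, so ids = ~1; weights = ~wt names the weight column
des <- svydesign(ids = ~1, weights = ~wt, data = df)

# Weighted one-way table: each cell is the sum of the weights falling in it
tab <- svytable(~var, design = des)
tab
```

For the two-way table in the question, the same call should extend to a formula such as `~age + marital_status`, provided those columns exist in the data passed to svydesign.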
This can also be done with dplyr:
library(dplyr)
count(x = df, var, wt = wt)
# # A tibble: 2 x 2
# var n
# <fctr> <dbl>
# 1 A 40
# 2 B 60
Just for completeness, using base R:
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))
aggregate(x = list("wt" = df$wt), by = list("var" = df$var), FUN = sum)
var wt
1 A 40
2 B 60
Or with the less cumbersome formula notation:
aggregate(wt ~ var, data = df, FUN = sum)
var wt
1 A 40
2 B 60
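Staying in base R, xtabs may be closer to what the question asks for, because it returns a cross-tabulation rather than a long data frame. The sketch below uses made-up data; the data frame `people`, its values, and the age bands are hypothetical and only stand in for the asker's age and marital_status variables.

```r
# Hypothetical data: age group, marital status, and a sampling weight per person
people <- data.frame(
  age            = c("18-34", "18-34", "35-64", "35-64", "65+"),
  marital_status = c("single", "married", "married", "single", "married"),
  wt             = c(1.5, 2.0, 0.5, 1.0, 3.0)
)

# Left-hand side: the weights to sum; right-hand side: the classifying variables
tab <- xtabs(wt ~ age + marital_status, data = people)
tab

# Optionally append weighted row and column totals
addmargins(tab)
```

Combinations that never occur in the data appear as 0 in the table, which is usually what you want for a frequency table.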
Another solution, from the package expss:
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))
library(expss)
fre(df$var, weight = df$wt)
| df$var | Count | Valid percent | Percent | Responses, % | Cumulative responses, % |
| ------ | ----- | ------------- | ------- | ------------ | ----------------------- |
| A | 40 | 40 | 40 | 40 | 40 |
| B | 60 | 60 | 60 | 60 | 100 |
| #Total | 100 | 100 | 100 | 100 | |
| <NA> | 0 | | 0 | | |
Using data.table you could do:
# using the same data as Victorp
library(data.table)
setDT(df)[, .(n = sum(wt)), by = var]
var n
1: A 40
2: B 60
You can also use tablefreq from the package freqweights:
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))
library(freqweights)
tablefreq(df, "var", "wt")
A tibble: 2 x 2
var freq
<fct> <dbl>
1 A 40
2 B 60
Using the package weights and its function wpct:
require(weights)
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))
wpct(df$var, df$wt)
A B
0.4 0.6
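Note that wpct returns weighted proportions rather than counts. If you would rather not add a package for that, a rough base R equivalent is to sum the weights per group and divide by the total; the names `counts` and `props` below are just illustrative.

```r
df <- data.frame(var = c("A", "A", "B", "B"), wt = c(30, 10, 20, 40))

# Sum of weights within each level of var (the weighted counts)
counts <- tapply(df$wt, df$var, sum)

# Divide by the overall total to get weighted proportions
props <- counts / sum(counts)
props
```

Keeping `counts` around gives you both the weighted frequencies and the proportions from a single pass over the data.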