Reformat categorical data in R
I have a categorical dataset that I am trying to summarize, and the questions behind it differ in nature. The data below represent a questionnaire that had standard close-ended questions, but also questions where one could choose multiple answers from a list. "village" and "income" represent close-ended questions. "responsible.1", etc. represent a list where the respondent said either yes or no to each item.
VILLAGE  INCOME          responsible.1  responsible.2  responsible.3  responsible.4  responsible.5
j        both            DLNR           NA             DEQ            NA             Public
k        regular.income  DLNR           NA             NA             NA             NA
k        regular.income  DLNR           CRM            DEQ            Mayor          NA
l        both            DLNR           NA             NA             Mayor          NA
j        both            DLNR           CRM            NA             Mayor          NA
m        regular.income  DLNR           NA             NA             NA             Public
What I want is a 3-way table output with "village" and the suite of "responsible" variables wrapped up into an ftable. This way, I could use the table with numerous R packages for graphs and analyses.
                         RESPONSIBLE
VILLAGE  INCOME          responsible.1  responsible.2  responsible.3  responsible.4  responsible.5
j        both            2              1              1              1              1
k        regular.income  2              1              1              1              0
l        both            1              0              0              1              0
m        regular.income  1              0              0              0              1
as.data.frame(table(village, responsible.1)) would get me the first column, but I can't figure out how to get the entire thing wrapped up in a nice ftable.
> aggregate(dat[-(1:2)], dat[1:2], function(x) sum(!is.na(x)) )
VILLAGE INCOME responsible.1 responsible.2 responsible.3 responsible.4 responsible.5
1 j both 2 1 1 1 1
2 l both 1 0 0 1 0
3 k regular.income 2 1 1 1 0
4 m regular.income 1 0 0 0 1
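For anyone who wants to run the call above, here is a minimal sketch that rebuilds the posted data. The object name dat comes from the answer; building it with data.frame() and keeping the columns as character vectors are my assumptions, since the question does not show how the data were read in. The anonymous function counts the non-missing entries in each "responsible" column within every VILLAGE/INCOME group.

```r
# Rebuild the example data from the question (values copied from the post).
# Assumption: columns are plain character vectors, with NA marking "no".
dat <- data.frame(
  VILLAGE = c("j", "k", "k", "l", "j", "m"),
  INCOME  = c("both", "regular.income", "regular.income",
              "both", "both", "regular.income"),
  responsible.1 = c("DLNR", "DLNR", "DLNR", "DLNR", "DLNR", "DLNR"),
  responsible.2 = c(NA, NA, "CRM", NA, "CRM", NA),
  responsible.3 = c("DEQ", NA, "DEQ", NA, NA, NA),
  responsible.4 = c(NA, NA, "Mayor", "Mayor", "Mayor", NA),
  responsible.5 = c("Public", NA, NA, NA, NA, "Public"),
  stringsAsFactors = FALSE
)

# Count the non-NA answers per column within each VILLAGE/INCOME group.
counts <- aggregate(dat[-(1:2)], dat[1:2], function(x) sum(!is.na(x)))
print(counts)
```

The result is an ordinary data frame with one row per VILLAGE/INCOME combination that actually occurs in the data.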
I'm guessing you actually had another grouping vector, perhaps the first "responsible" column?
I don't really understand the sorting rules, but reversing the order of the grouping columns may be closer to what you posted:
> aggregate(dat[-(1:2)], dat[2:1], function(x) sum(!is.na(x)) )
INCOME VILLAGE responsible.1 responsible.2 responsible.3 responsible.4 responsible.5
1 both j 2 1 1 1 1
2 regular.income k 2 1 1 1 0
3 both l 1 0 0 1 0
4 regular.income m 1 0 0 0 1
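Since the question asked for an ftable rather than a data frame, here is one possible sketch of how to get there with base R only. This is not part of the original answer: the long-format helper long, the 0/1 column answered, and the dimension name RESPONSIBLE are names I made up for illustration, and the data are rebuilt under the assumption that they are plain character columns.

```r
# Rebuild the example data from the question (values copied from the post).
dat <- data.frame(
  VILLAGE = c("j", "k", "k", "l", "j", "m"),
  INCOME  = c("both", "regular.income", "regular.income",
              "both", "both", "regular.income"),
  responsible.1 = c("DLNR", "DLNR", "DLNR", "DLNR", "DLNR", "DLNR"),
  responsible.2 = c(NA, NA, "CRM", NA, "CRM", NA),
  responsible.3 = c("DEQ", NA, "DEQ", NA, NA, NA),
  responsible.4 = c(NA, NA, "Mayor", "Mayor", "Mayor", NA),
  responsible.5 = c("Public", NA, NA, NA, NA, "Public"),
  stringsAsFactors = FALSE
)

# Names of the multiple-answer columns.
resp_cols <- grep("^responsible\\.", names(dat), value = TRUE)

# Stack the answer columns into long format: one row per respondent and
# question, with a 0/1 flag saying whether that option was chosen.
# unlist() concatenates the columns one after another, which matches
# rep(..., each = nrow(dat)) for the question labels.
long <- data.frame(
  VILLAGE     = rep(dat$VILLAGE, times = length(resp_cols)),
  INCOME      = rep(dat$INCOME, times = length(resp_cols)),
  RESPONSIBLE = rep(resp_cols, each = nrow(dat)),
  answered    = as.integer(!is.na(unlist(dat[resp_cols], use.names = FALSE)))
)

# Sum the flags into a 3-way contingency table, then flatten it.
tab <- xtabs(answered ~ VILLAGE + INCOME + RESPONSIBLE, data = long)
ft  <- ftable(tab, row.vars = c("VILLAGE", "INCOME"))
print(ft)
```

Unlike the aggregate() output, the flattened table also lists VILLAGE/INCOME combinations that never occur in the data (for example j with regular.income), filled with zeros.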