Manipulating data.frames
I have a sample survey sheet, something like demographic data. One of the columns is country (a factor) and another is annual income. Now I need to calculate the average income for each country and store it in a new data.frame with country and the corresponding mean. It should be simple, but I am lost. The data is something like the one shown below:
Country Income($) Education ... ... ...
1. USA 90000 Phd
2. UK 94000 Undergrad
3. USA 94000 Highschool
4. UK 87000 Phd
5. Russia 77000 Undergrad
6. Norway 60000 Masters
7. Korea 90000 Phd
8. USA 110000 Masters
.
.
I need a final result like:
USA UK Russia ...
98000 90000 75000
Thank you.
Data example:
dat <- read.table(text="Country Income Education
USA 90000 Phd
UK 94000 Undergrad
USA 94000 Highschool
UK 87000 Phd
Russia 77000 Undergrad
Norway 60000 Masters
Korea 90000 Phd
USA 110000 Masters",header=TRUE)
You can do this with plyr. If your data is called dat:
library(plyr)
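# Mean income per country; summarise keeps the column name Countrymean
newdf <- ddply(dat, .(Country), summarise, Countrymean = mean(Income))
# Equivalent with an explicit function returning a data.frame:
# newdf <- ddply(dat, .(Country), function(x) data.frame(Income = mean(x$Income)))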
Or with aggregate, which needs no extra package:
newdf <- aggregate(Income ~ Country, data = dat, FUN = mean)
For the output you show at the end, maybe tapply?
tapply(dat$Income, dat$Country, mean)
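Since the question asks for a new data.frame, the named vector that tapply returns can be converted into one. This is only a minimal sketch: the names means and MeanIncome are arbitrary choices for illustration, not part of the original answer.

```r
# Sample data from the question
dat <- read.table(text = "Country Income Education
USA 90000 Phd
UK 94000 Undergrad
USA 94000 Highschool
UK 87000 Phd
Russia 77000 Undergrad
Norway 60000 Masters
Korea 90000 Phd
USA 110000 Masters", header = TRUE)

# Mean income per country, returned as a named vector
means <- tapply(dat$Income, dat$Country, mean)

# Turn the named vector into a two-column data.frame
# (the column names are assumptions chosen for this example)
newdf <- data.frame(Country = names(means), MeanIncome = as.vector(means))
```

If the wide, one-row layout is not needed, aggregate already returns a data.frame of this shape directly.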