
Manipulating data.frames

I have a sample survey sheet, something like demographic data. One of the columns is country (a factor) and another is annual income. I need to calculate the average income for each country and store the result in a new data.frame with the country and its corresponding mean. It should be simple, but I am lost. The data looks something like this:

Country  Income($) Education ... ... ...
1. USA    90000      Phd
2. UK     94000      Undergrad
3. USA    94000      Highschool
4. UK     87000      Phd
5. Russia 77000      Undergrad
6. Norway 60000      Masters
7. Korea  90000      Phd
8. USA    110000     Masters
.
.

I need a final result like:

USA   UK    Russia ...
98000 90000 75000

Thank you.

Example data:

dat <- read.table(text="Country  Income Education 
 USA    90000      Phd
 UK     94000      Undergrad
 USA    94000      Highschool
 UK     87000      Phd
 Russia 77000      Undergrad
 Norway 60000      Masters
 Korea  90000      Phd
 USA    110000     Masters",header=TRUE)

You can do this with plyr.

If your data is called dat:

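library(plyr)
# Mean income per country; summarise names the result column Countrymean
newdf <- ddply(dat, .(Country), summarise, Countrymean = mean(Income))

# Equivalent, building the one-row data frame for each country by hand:
# newdf <- ddply(dat, .(Country), function(x) data.frame(Income = mean(x$Income)))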

Or with aggregate from base R:

 newdf <- aggregate(Income ~ Country, data = dat, FUN = mean)

For the output you show at the end, maybe tapply?

tapply(dat$Income, dat$Country, mean)
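
Since the question asks for the result in a data.frame, here is a rough sketch of how the tapply result could be turned into one, either with one row per country or with one column per country as in the desired output. The names means, long and wide are only illustrative and are not part of the original answer; the data is the example data from above.

```r
# Recreate the example data from above
dat <- read.table(text = "Country Income Education
USA 90000 Phd
UK 94000 Undergrad
USA 94000 Highschool
UK 87000 Phd
Russia 77000 Undergrad
Norway 60000 Masters
Korea 90000 Phd
USA 110000 Masters", header = TRUE)

# Named result with one mean per country (countries in alphabetical order)
means <- tapply(dat$Income, dat$Country, mean)

# Long layout: one row per country (illustrative name)
long <- data.frame(Country = names(means), Income = as.vector(means))

# Wide layout: a single row with one column per country (illustrative name)
wide <- as.data.frame(matrix(means, nrow = 1,
                             dimnames = list(NULL, names(means))))

print(long)
print(wide)
```

The long layout holds the same information as the aggregate result; the wide layout is closer to the output shown in the question.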
