
Standardize name in row and calculate the geometric mean based on similar row in R

I have a data table in which I want to standardize the names in "Sex" and calculate the geometric mean of Score for each Group (x, y and z in the table).

I would appreciate your help. Below is the data.table.

library(data.table)
dt <- data.table(Group = c("x","x","x","y","z","z"), Sex = c("Man","Female","Feminine","Male","M","F"), Score = c(0,0.4,0.1,0.5,3,2.1))

Thank you.

Is this what you want?

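# Geometric mean: the n-th root of the product of the n values
geomean <- function(v) prod(v)^(1/length(v))
# Apply it to Score within each Group
res <- tapply(dt$Score, dt$Group, geomean)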

which gives

> res
      x       y       z 
0.00000 0.50000 2.50998 
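As a side note, the product inside geomean can overflow or underflow for long vectors. A minimal sketch of a log-based variant follows, assuming all scores are non-negative; the name geomean_log is hypothetical and not part of the original answer. A zero score still yields 0, because log(0) is -Inf and exp(-Inf) is 0.

```r
# Hypothetical alternative to geomean(): average the logs, then exponentiate.
# Assumes every value is >= 0; negative values would produce NaN.
geomean_log <- function(v) exp(mean(log(v)))

# Same sample values as in the question, kept as plain vectors
score <- c(0, 0.4, 0.1, 0.5, 3, 2.1)
group <- c("x", "x", "x", "y", "z", "z")

# One geometric mean per group, as with tapply() above
res_log <- tapply(score, group, geomean_log)
print(res_log)
```

For the sample data this should agree with the tapply result above up to floating-point rounding.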

or use ave to create a new column

dt <- within(dt,gm <- ave(Score,Group,FUN = geomean))
> dt
   Group      Sex Score      gm
1:     x      Man   0.0 0.00000
2:     x   Female   0.4 0.00000
3:     x Feminine   0.1 0.00000
4:     y     Male   0.5 0.50000
5:     z        M   3.0 2.50998
6:     z        F   2.1 2.50998
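Since dt is a data.table, the same column can also be added with data.table's own grouping syntax instead of within and ave. This is only a sketch of that alternative, rebuilding the sample table so that it runs on its own:

```r
library(data.table)

# Same helper as in the answer
geomean <- function(v) prod(v)^(1 / length(v))

# Sample data from the question
dt <- data.table(
  Group = c("x", "x", "x", "y", "z", "z"),
  Sex   = c("Man", "Female", "Feminine", "Male", "M", "F"),
  Score = c(0, 0.4, 0.1, 0.5, 3, 2.1)
)

# := adds the column by reference; by = Group evaluates geomean() once per
# group and recycles the result to every row of that group
dt[, gm := geomean(Score), by = Group]
print(dt)
```

Because := modifies dt in place, no reassignment (dt <- ...) is needed.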

EDIT:

If you want to group the data by both Group and Sex, try the following, which standardizes Sex to its upper-cased first letter before grouping

dt <- within(transform(dt,Sex = toupper(substr(Sex,1,1))),
             gm <- ave(Score,Group,Sex,FUN = geomean))

thus

> dt
   Group Sex Score  gm
1:     x   M   0.0 0.0
2:     x   F   0.4 0.2
3:     x   F   0.1 0.2
4:     y   M   0.5 0.5
5:     z   M   3.0 3.0
6:     z   F   2.1 2.1
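The first-letter rule maps any value starting with "m" or "M" to M, whatever the word is. If the set of spellings is known, an explicit lookup table is one possible alternative. The sketch below assumes the six spellings in the sample data are the only ones; sex_map is a hypothetical name, not something from the original answer.

```r
library(data.table)

geomean <- function(v) prod(v)^(1 / length(v))

# Sample data from the question
dt <- data.table(
  Group = c("x", "x", "x", "y", "z", "z"),
  Sex   = c("Man", "Female", "Feminine", "Male", "M", "F"),
  Score = c(0, 0.4, 0.1, 0.5, 3, 2.1)
)

# Hypothetical lookup table: each spelling seen in the sample data, mapped to
# one standard code. Extend it for other spellings in real data.
sex_map <- c(Man = "M", Male = "M", M = "M",
             Female = "F", Feminine = "F", F = "F")

# Replace each spelling by its standard code
dt[, Sex := unname(sex_map[Sex])]

# Spellings missing from sex_map become NA, so stop early rather than
# silently grouping them together
stopifnot(!anyNA(dt$Sex))

# Geometric mean per Group and standardized Sex
dt[, gm := geomean(Score), by = .(Group, Sex)]
print(dt)
```

For the sample data this should give the same Sex and gm columns as the output above.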
