
Get the average of the values of one column for the values in another

I wasn't sure how to phrase this question. I am trying to find the average tone when an initiative is mentioned, and also when a topic and a goal (or achievement) are mentioned along with it. My data frame (df) contains many mentions of 70 initiatives, one mention per row: it has 500+ rows of data but only 70 distinct initiatives.

My data looks like this:

> tabmean
    Initiative Topic Goals Achievements Tone
1           52    44     2            2    2
2          294    42     2            2    2
3          103    31     2            2    2
4           52    41     2            2    2
5           87    26     2            1    1
6           52    87     2            2    2
7          136    81     2            2    2
8           19     7     2            2    1
9           19     4     2            2    2
10           0    63     2            2    2
11           0    25     2            2    2
12          19    51     2            2    2
13          52    51     2            2    2
14         108    94     2            2    1
15          52    89     2            2    2
16         110    37     2            2    2
17         247    25     2            2    2
18          66    95     2            2    2
19          24    49     2            2    2
20          24   110     2            2    2 

I want to find the mean Tone when an Initiative is mentioned, as well as the mean Tone when an Initiative, a Topic and a Goal are mentioned at the same time. Tone is coded as positive (1), neutral (2), negative (3), and both positive and negative (4). Goals and Achievements are coded yes (1) and no (2).

I have used this code:

GoalMeanTone <- tabmean %>%
  group_by(Initiative,Topic,Goals,Tone) %>%
  summarize(averagetone = mean(Tone))

With this output:

GoalMeanTone 
# A tibble: 454 x 5
# Groups:   Initiative, Topic, Goals [424]
   Initiative Topic Goals Tone  averagetone
   <chr>      <chr> <chr> <chr>       <dbl>
 1 0          104   2     0              NA
 2 0          105   2     0              NA
 3 0          22    2     0              NA
 4 0          25    2     0              NA
 5 0          29    2     0              NA
 6 0          30    2     1              NA
 7 0          31    1     1              NA
 8 0          42    1     0              NA
 9 0          44    2     0              NA
10 0          44    NA    0              NA
# ... with 444 more rows

Note that the Initiative value 0 means "other initiative".

I've also tried this code:

library(plyr)
GoalMeanTone2 <- ddply( tabmean, .(Initiative), function(x) mean(tabmean$Tone) )

with this output:

> GoalMeanTone2
   Initiative V1
1           0 NA
2           1 NA
3         101 NA
4         102 NA
5         103 NA
6         104 NA
7         105 NA
8         107 NA
9         108 NA
10        110 NA

Note that in both cases I do not get an average for Tone but NAs instead.

I have removed the NAs from the "Tone" column of the df, and have also tried removing all the other missing values in the df (only about 30 values were deleted). I have also recoded the values for Tone:

tabmean<-Meantable %>% mutate(Tone=recode(Tone, 
                                            `1`="1",
                                            `2`="0",
                                            `3`="-1",
                                            `4`="2"))

I still cannot manage to get the average tone for an initiative. Maybe the solution is more obvious than I think, but I am stuck and have no idea how to proceed.

I'd be very grateful for better code to get this. Thanks!

I'm not completely sure what you mean by 'the average tone when an initiative is mentioned', but if you want the average tone for, say, Initiative == 1, you could try the following:

tabmean %>% filter(Initiative == 1) %>% summarise(avg_tone = mean(Tone, na.rm = TRUE))

Note that (1) you have to add na.rm = TRUE to the mean() call if the column you are summarising has missing values, otherwise the result is NA, and (2) the columns must be numeric. You can check that with str(tabmean) and, for example, convert Tone with tabmean <- tabmean %>% mutate(Tone = as.numeric(Tone)).
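To get one average per initiative rather than for a single filtered value, the same two points can be combined with group_by. The following is only a sketch under assumptions: the sample rows are made up, and Tone is stored as text to mimic the <chr> columns in the question's tibble output, which is a likely reason mean() returned NA there. Tone is also left out of the grouping, since grouping by the column being averaged gives one group per Tone value.

```r
library(dplyr)

# Made-up sample rows using the question's column names (not the real data).
# Tone is character here, as in the <chr> columns of the question's output.
tabmean <- data.frame(
  Initiative = c(52, 52, 19, 19, 87),
  Topic      = c(44, 41, 7, 4, 26),
  Goals      = c(2, 2, 2, 2, 2),
  Tone       = c("2", "2", "1", NA, "1"),
  stringsAsFactors = FALSE
)

# mean() of a character vector is NA (with a warning), so convert first.
tabmean <- tabmean %>% mutate(Tone = as.numeric(Tone))

# Average tone per initiative; n counts the mentions behind each average.
# dplyr:: is spelled out so that a later library(plyr) cannot mask summarise().
tone_by_initiative <- tabmean %>%
  group_by(Initiative) %>%
  dplyr::summarise(averagetone = mean(Tone, na.rm = TRUE), n = n())

# Average tone per Initiative/Topic/Goals combination.
tone_by_combo <- tabmean %>%
  group_by(Initiative, Topic, Goals) %>%
  dplyr::summarise(averagetone = mean(Tone, na.rm = TRUE), .groups = "drop")

print(tone_by_initiative)
print(tone_by_combo)
```

In the ddply attempt, the function averages tabmean$Tone (the whole column) for every group. Using x$Tone, with na.rm = TRUE and a numeric Tone, should give a per-initiative value instead.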
