
How to add count total column to dataframe in R

I have aggregated a dataframe by the number of times a single word appears in a dataset of meetings on a particular date, so it looks like this:

date <- c("2012-05-06", "2013-07-09", "2007-01-03")
word_count <- c("17", "2", "390")
df1 <- data.frame(date, word_count)

I also have a separate dataframe with the total word count for each of those dates, plus a series of other dates. It looks like this:

date <- c("2012-05-06", "2013-07-09", "2007-01-03", "2004-11-03", "1994-12-03")
word_total <- c("17000", "20", "39037", "39558", "58607")
df2 <- data.frame(date, word_total)

I now want to add another column to df1 that holds the totals from df2 for the dates that appear in df1, excluding any dates that are not in df1. I also want to transform the dataframe so that there is another column dividing word_count by word_total.

So the output would look like this:

date <- c("2012-05-06", "2013-07-09", "2007-01-03")
word_count <- c("17", "2", "390")
word_total <- c("17000", "20", "39037")
word_percentage <- c("0.001", "0.1", "0.00999")
df2 <- data.frame(date, word_count, word_total, word_percentage)

I know how to use transform to get word_percentage once word_total is in the dataframe, but I have no idea how to bring in the relevant values of word_total. I have tried using merge and intersect to no avail. Any ideas?

Thank you in advance for your help!

If the columns are numeric, just do a merge and then create the new column by division:

transform(merge(df1, df2, by = c('date')),
        word_percentage = round(word_count/word_total, 3))
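
As a self-contained sketch of the merge approach, the example below rebuilds the question's data with numeric columns (an assumption; the question stores them as strings) and assumes df2 was meant to hold word_total. By default merge keeps only rows whose key appears in both data frames, so the two extra dates in df2 drop out. The names merged and result are illustrative only.

```r
# Sample data from the question, with the counts stored as numbers
df1 <- data.frame(
  date = c("2012-05-06", "2013-07-09", "2007-01-03"),
  word_count = c(17, 2, 390)
)
df2 <- data.frame(
  date = c("2012-05-06", "2013-07-09", "2007-01-03", "2004-11-03", "1994-12-03"),
  word_total = c(17000, 20, 39037, 39558, 58607)
)

# Inner join on date: only the three dates present in df1 survive
merged <- merge(df1, df2, by = "date")

# Add the ratio column, rounded to three decimals
result <- transform(merged, word_percentage = round(word_count / word_total, 3))
print(result)
```

Note that merge sorts the result by the key column by default, so the rows come back in date order rather than in the original order of df1.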

Or use match:

df1$word_percentage <- df1$word_count/df2$word_total[match(df1$date, df2$date)]
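
To see why this lines the totals up correctly: match returns, for each date in df1, the row position of that date in df2 (or NA when the date is absent). The sketch below uses the question's data with numeric columns (an assumption) and also stores word_total in df1, as the question asked. The name idx is illustrative.

```r
# Sample data from the question, with the counts stored as numbers
df1 <- data.frame(
  date = c("2012-05-06", "2013-07-09", "2007-01-03"),
  word_count = c(17, 2, 390)
)
df2 <- data.frame(
  date = c("2012-05-06", "2013-07-09", "2007-01-03", "2004-11-03", "1994-12-03"),
  word_total = c(17000, 20, 39037, 39558, 58607)
)

# Row positions in df2 for each date in df1
idx <- match(df1$date, df2$date)

# Pull the matching totals into df1, then compute the ratio
df1$word_total <- df2$word_total[idx]
df1$word_percentage <- df1$word_count / df1$word_total
print(df1)
```

Unlike merge, this keeps the rows of df1 in their original order and leaves df2 untouched.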

data

df1$word_count <- as.integer(as.character(df1$word_count))
df2$word_total <- as.integer(as.character(df2$word_total))
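
The as.character step matters when the columns are factors, which was the default for string columns in data.frame before R 4.0.0. Calling as.integer directly on a factor returns its internal level codes, not the numbers the labels spell out. A small sketch, with f, codes and values as illustrative names:

```r
# A factor holding numeric-looking labels, as data.frame() produced
# for strings by default before R 4.0.0
f <- factor(c("17", "2", "390"))

# Direct conversion yields the internal level codes
codes <- as.integer(f)

# Going through character first recovers the intended numbers
values <- as.integer(as.character(f))

print(codes)
print(values)
```

On character columns the extra as.character call is harmless, so the conversion above works either way.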
