
Creating a new variable that aggregates two years of observations

I would like to plot the frequency of a binary outcome over time, measured in years. I've been able to manipulate the data into a data frame that holds the frequency of each value of the binary string variable. As it currently stands, I have the frequency by year, with two rows per year (one per outcome), so that I can plot the frequency of each outcome. However, I would like to plot each outcome as a percentage of the total observations for its year.

df <- data.frame(x = c("1980", "1980", "1981", "1981", "1982", "1982"),
                 y = c("yes", "no", "yes", "no", "yes", "no"),
                 z = c("26", "18", "32", "12", "18", "16"))

Initially I tried aggregating the observations by year with the code below, but the result has only 32 rows when I need 64.

df1$Sum <- aggregate(df1$z, by=list(df1$x), FUN=sum)

Is there some way to repeat the yearly totals so that a new column contains the sum of "yes" and "no" for 1980 in both rows 1 and 2, and likewise for every other year?

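library(data.table)
dt <- data.table(your_df)  # your_df stands for the data frame from the question

# add the per-year total of z to every row of that year
dt[, z.sum := sum(z), by = x]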

This assumes your column z actually holds numbers. That is not the case in the question's example, but I assume that's a typo.
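
To run the snippet above against the question's example data as posted, z first has to be converted from strings to numbers. The following is a minimal sketch of that, not a definitive version: it assumes the data.table package is installed, and the `pct` column name is an illustrative choice that does not appear in the original post.

```r
library(data.table)

# Example data from the question; z is stored as strings there
df <- data.frame(x = c("1980", "1980", "1981", "1981", "1982", "1982"),
                 y = c("yes", "no", "yes", "no", "yes", "no"),
                 z = c("26", "18", "32", "12", "18", "16"))

dt <- as.data.table(df)

# Convert z to numeric; as.character() first also covers the case
# where z was read in as a factor (the default before R 4.0)
dt[, z := as.numeric(as.character(z))]

# Repeat the yearly total on every row of that year
dt[, z.sum := sum(z), by = x]

# Share of the yearly total, as a percentage ('pct' is a hypothetical name)
dt[, pct := 100 * z / z.sum]

dt
```

Because `:=` adds the column by reference, `dt` keeps its two rows per year, which is the shape the question asks for.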

If your goal is to "plot the percentage of the total of these observations by year", I assume you don't need to store the sums in a separate column first.

Here is one way to get each row's share of its yearly total (a proportion between 0 and 1; multiply by 100 for a percentage):

library(plyr)
df <- data.frame( x = c("1980", "1980", "1981", "1981", "1982", "1982" ),
                  y = c("yes", "no", "yes", "no", "yes", "no"),
                  z = c("26", "18", "32", "12", "18", "16"))
# z is stored as strings (or as a factor in older R), so convert it to numeric
df$z <- as.numeric(as.character(df$z))

# within each year x, divide each count by that year's total
df2 <- ddply(.data = df, .variables = .(x), mutate,
             prop = z/sum(z))
df2
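
If you would rather not load any packages, something similar should be possible in base R with `ave()`. Unlike `aggregate()`, which returns one row per group, `ave()` returns a vector of the same length as its input, so the yearly total is repeated on every row of that year. This is only a sketch on the question's example data; the column names `z.sum` and `pct` are illustrative assumptions.

```r
# Example data from the question; z is stored as strings there
df <- data.frame(x = c("1980", "1980", "1981", "1981", "1982", "1982"),
                 y = c("yes", "no", "yes", "no", "yes", "no"),
                 z = c("26", "18", "32", "12", "18", "16"))

# Convert z to numeric (as.character() first in case z is a factor)
df$z <- as.numeric(as.character(df$z))

# ave() applies sum() within each year and returns one value per row
df$z.sum <- ave(df$z, df$x, FUN = sum)

# Percentage of the yearly total ('pct' is a hypothetical column name)
df$pct <- 100 * df$z / df$z.sum

df
```

The result keeps both the "yes" and "no" rows for each year, and the two percentages within a year add up to 100.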


 