R: How to create a new column in a data frame with "countif" values from another data frame?
I have a data frame (df1) that looks like the following. It shows the years in which a company was active in a specific market.
Company Country Year
A Austria 2010
A Germany 2010
A Austria 2011
B Italy 2010
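I also have a second data frame (df2), shown below. It lists every investment a company made in a country at a given time, with the investment type coded as dummy variables.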
Company Country Year JointVenture M&A Greenfield
A Austria 2010 1 0 0
A Austria 2010 0 1 0
A Austria 2010 1 0 0
...
My question is the following: I want to add new columns to df1 that contain a "countif" for each investment type indicated in df2. For example, the new df1:
Company Country Year Count.JointVenture Count.M&A Count.Greenfield
A Austria 2010 2 1 0
A Germany 2010 ...........
A Austria 2011
B Italy 2010
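Additionally, how could I add further columns to df1 that convert these counts into dummy variables (1 if > 0, 0 if 0)?
Thanks, and sorry if this is a basic question, but I could not find a suitable solution in existing threads.
Cheers, Martin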
Use the aggregate() and ifelse() functions:
# test data
df <- data.frame(Company = rep("A", 3),
Country = rep("Austria", 3),
Year = rep(2010, 3),
JointVenture = c(1,0,1),
MnA = c(0,1,0),
Greenfield = rep(0,3))
# this is the new df
counts <- aggregate(cbind(JointVenture, MnA, Greenfield)~Country+Company+Year, data = df, FUN = sum)
# dummy
counts$dummyJointVenture <- ifelse(counts$JointVenture > 0, 1, 0)
counts$dummyMnA <- ifelse(counts$MnA > 0, 1, 0)
counts$dummyGreenfield <- ifelse(counts$Greenfield > 0, 1, 0)
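The snippet above computes the counts and dummies but does not attach them to df1. A minimal base R sketch of that remaining step follows. It assumes the M&A column is renamed MnA as in the test data, and that company-country-year combinations with no investments should get a count of 0 rather than NA. The Count. and Dummy. prefixes are illustrative names, not something the answer prescribes.

```r
# df1: company-country-year rows from the question
df1 <- data.frame(Company = c("A", "A", "A", "B"),
                  Country = c("Austria", "Germany", "Austria", "Italy"),
                  Year    = c(2010, 2010, 2011, 2010))

# df2: one row per investment (M&A renamed to MnA, an assumption)
df2 <- data.frame(Company = rep("A", 3),
                  Country = rep("Austria", 3),
                  Year    = rep(2010, 3),
                  JointVenture = c(1, 0, 1),
                  MnA          = c(0, 1, 0),
                  Greenfield   = rep(0, 3))

types <- c("JointVenture", "MnA", "Greenfield")

# Sum the dummies per company-country-year, i.e. a "countif" per type
counts <- aggregate(cbind(JointVenture, MnA, Greenfield) ~ Company + Country + Year,
                    data = df2, FUN = sum)
names(counts)[match(types, names(counts))] <- paste0("Count.", types)

# Left join: keep every row of df1, even those without investments
result <- merge(df1, counts, by = c("Company", "Country", "Year"), all.x = TRUE)

# Rows of df1 with no match in df2 come back as NA; treat them as 0
count_cols <- paste0("Count.", types)
result[count_cols][is.na(result[count_cols])] <- 0

# Dummy columns: 1 if the count is positive, 0 otherwise
for (type in types) {
  result[[paste0("Dummy.", type)]] <- as.integer(result[[paste0("Count.", type)]] > 0)
}

print(result)
```

Note that merge() sorts the result by the key columns, so the row order may differ from df1.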
I'll throw a data.table attempt into the ring:
library(data.table)  # provides fread(), setkey() and :=
df <- fread("Company Country Year
A Austria 2010
A Germany 2010
A Austria 2011
B Italy 2010")
df2 <- fread("Company Country Year JointVenture M&A Greenfield
A Austria 2010 1 0 0
A Austria 2010 0 1 0
A Austria 2010 1 0 0")
setkey(df2, Company, Country, Year)
df2[,c("JointVenture", "M&A", "Greenfield") := .(sum(JointVenture), sum(`M&A`), sum(Greenfield)), by=.(Company, Country, Year)]
merge(x=df, y=unique(df2), by=c("Company", "Country", "Year"), all.x=T, all.y=F, suffixes = c("", "Count."))
which results in
Company Country Year JointVenture M&A Greenfield
1: A Austria 2010 2 1 0
2: A Austria 2011 NA NA NA
3: A Germany 2010 NA NA NA
4: B Italy 2010 NA NA NA
Using dplyr::summarise_each and merging with Martin's data.
library(data.table)  # provides fread()
df <- fread("Company Country Year
A Austria 2010
A Germany 2010
A Austria 2011
B Italy 2010")
df2 <- fread("Company Country Year JointVenture MA Greenfield
A Austria 2010 1 0 0
A Austria 2010 0 1 0
A Austria 2010 1 0 0")
library(dplyr)
df2 %>%
group_by(Company, Country, Year) %>%
summarise_each(funs(sum), JointVenture:Greenfield) %>%
full_join(df, by = c("Company", "Country", "Year")) -> df
Edit: replaced summarise with summarise_each following input from @zacdav, and replaced merge with full_join to stay within dplyr.
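summarise_each() and funs() are deprecated in later dplyr releases. A rough equivalent using across() might look like the sketch below, which assumes dplyr 1.0.0 or newer. It also swaps full_join for left_join so that only the rows of df1 are kept, and fills the unmatched rows with 0. Both of those are choices made here for illustration, not part of the original answer.

```r
library(dplyr)

df1 <- data.frame(Company = c("A", "A", "A", "B"),
                  Country = c("Austria", "Germany", "Austria", "Italy"),
                  Year    = c(2010, 2010, 2011, 2010))

df2 <- data.frame(Company = rep("A", 3),
                  Country = rep("Austria", 3),
                  Year    = rep(2010, 3),
                  JointVenture = c(1, 0, 1),
                  MA           = c(0, 1, 0),
                  Greenfield   = c(0, 0, 0))

# Sum each investment-type column within company-country-year groups
counts <- df2 %>%
  group_by(Company, Country, Year) %>%
  summarise(across(JointVenture:Greenfield, sum), .groups = "drop")

# Attach the counts to df1 and turn unmatched (NA) counts into 0
result <- df1 %>%
  left_join(counts, by = c("Company", "Country", "Year")) %>%
  mutate(across(JointVenture:Greenfield, ~ coalesce(.x, 0)))

print(result)
```

Unlike merge(), left_join() keeps the row order of df1.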