R mutate variable to variable values from another observation, using a loop, an ifelse condition and subset (dplyr)
See my reproducible example and desired output below.
I want to create a new variable in which I combine variable values from other observations (rows), which I want to identify in a loop using subset. The condition of the subset is to be defined by the loop. In example 1, subset(df, country == i) does not work, but doing it manually (in Ex.2) with subset(df, country == 'US') works. I thought country == i and country == 'US' should be pretty much the same.
library(dplyr) # needed for %>% and mutate()
# create a df
country <- c('US', 'US', 'China', 'China')
Trump_virus <- c('Y', 'N', 'Y', 'N')
cases <- c(1000, 2000, 4, 6)
df <- data.frame(country, Trump_virus, cases)
#################################################### Ex.1
for (i in df$country) {
  print(i)
  df <- df %>%
    mutate(cases_corected = ifelse(
      Trump_virus == 'Y'
      ,subset(df, Trump_virus == 'N' & country == i)$cases*1000
      ,'killer_virus'
    ))}
##
df$cases_corected
#################################################### Ex.2
for (i in df$country) {
  print(i)
  df <- df %>%
    mutate(cases_corected = ifelse(
      Trump_virus == 'Y'
      ,subset(df, Trump_virus == 'N' & country == 'US')$cases*1000
      ,'killer_virus'
    ))}
##
df$cases_corected
################################################### Desired output
> df$cases_corected
[1] "2e+06"
[2] "killer_virus"
[3] "6000"
[4] "killer_virus"
Here is a solution with dplyr. Updated based on the change in desired output.
df <- df %>%
  mutate(country = toupper(country)) # same name for all variants of a country, e.g. China and china
# generate a dataset that has cases only for Trump_virus == "N"
df1 <- df %>%
  dplyr::filter(Trump_virus == "N") %>%
  dplyr::mutate(ID = "Y",
                cases_corected = cases * 1e3) %>%
  dplyr::select(-c(cases, Trump_virus))
# final merge: attach each country's scaled 'N' cases to its 'Y' row
df <- df %>%
  left_join(df1, by = c("country" = "country", "Trump_virus" = "ID")) %>%
  mutate(cases_corected = ifelse(is.na(cases_corected), 'killer_virus', cases_corected))
df
  country Trump_virus cases cases_corected
1      US           Y  1000          2e+06
2      US           N  2000   killer_virus
3   CHINA           Y     4           6000
4   CHINA           N     6   killer_virus
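As for why the loop in Ex.1 probably does not give the desired result: each pass recomputes cases_corected for every row, so only the value from the last i (here 'China') survives. A grouped mutate avoids the loop altogether. The following is only a minimal sketch, not the approach used above; it assumes every country has exactly one row with Trump_virus == 'N', and the name result is just an illustrative choice.

```r
library(dplyr)

# Same toy data as in the question
df <- data.frame(
  country     = c("US", "US", "China", "China"),
  Trump_virus = c("Y", "N", "Y", "N"),
  cases       = c(1000, 2000, 4, 6),
  stringsAsFactors = FALSE
)

# Within each country, look up the cases of the 'N' row and scale them.
# Assumption: exactly one 'N' row per country, so the lookup has length 1
# and is recycled across the rows of the group.
result <- df %>%
  group_by(country) %>%
  mutate(cases_corected = ifelse(
    Trump_virus == "Y",
    as.character(cases[Trump_virus == "N"] * 1000),
    "killer_virus"
  )) %>%
  ungroup()

result$cases_corected
```

Because the lookup happens inside each group, no join and no helper data frame are needed. If a country could have several 'N' rows or none, the lookup would have to be reduced to a single value first (for example with sum() or first()).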