
In R, how can I replace the values in one column with the values of another column whenever the replacing column is not empty?

I'm working on automating a report in R. One thing we do is look at clients who enter a shop, estimate their age, and record that as Estimated Age. We are instructed to use the Estimated Age as Age in the report so that all parties have an age. If we are able to record someone's Actual Age, that age then becomes the Age. Most records do not have a value for Actual Age. For the records that do have an Actual Age value, I need to replace the Estimated Age value with the Actual Age value. The records without an Actual Age should remain unchanged.

I'm a newbie and have been stuck on this step for months. Asking the Stack Overflow gods for a blessing. See image if it helps.

Replacing Estimated Age with Actual Age

I've already tried several variations of different methods for replacing Estimated Age with Actual Age, to no avail:

1)

Age <- ifelse(is.null(MyReport$ActualAge), MyReport$ActualAge, MyReport$EstimatedAge)
View(MyReport)

2) Also something like this, but I tweaked it so much that this is not exactly what I ran; I messed it up:

select <- is.null(MainReportload$ActualAge) < 0.01
df[select,MyReport$EstimatedAge] <- df[select, MyReport$ActualAge]

3)

if(is.null(MyReport$ActualAge)) {
  MyReport$Age <- MyReport$EstimatedAge
} else {
  MyReport$Age <- MyReport$ActualAge
  }
MyReport$Age
View(MyReport)

8.6.19: Alternative based on brain and minimal SQL knowledge: just do a coalesce; coalesce is available in the dplyr library. Result: same issue as the attempts above. Will continue with research.
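A note for readers on why attempts 1 and 3 leave the data unchanged. This assumes the missing actual ages are stored as NA, which the question does not confirm. is.null() tests whether a whole object is NULL and returns a single TRUE or FALSE, while is.na() tests each element. The vectors below are hypothetical stand-ins for the two columns:

```r
# Hypothetical stand-ins for the two columns; NA marks a missing actual age (an assumption)
actual_age    <- c(12L, NA, NA, 15L)
estimated_age <- c(14L, 13L, 20L, 16L)

# is.null() looks at the object as a whole, so it yields one value, not one per row
null_check <- is.null(actual_age)

# is.na() is vectorised and flags each missing element
na_check <- is.na(actual_age)

# With a length-one test, ifelse() also returns a single value
attempt_1 <- ifelse(is.null(actual_age), actual_age, estimated_age)

# With an element-wise test, ifelse() chooses per row:
# the actual age if it is present, otherwise the estimate
age <- ifelse(is.na(actual_age), estimated_age, actual_age)
print(age)
```

For the same reason, the if/else in attempt 3 sees a single FALSE and always takes the else branch, copying the whole ActualAge column, NAs included.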

I'm not sure why you say that dplyr::coalesce doesn't work; here's a simplified generic example of it. If you post a reproducible version of your data we can help more. The key to coalesce is that it returns the first non-missing value, so coalesce(estimated_age, age) != coalesce(age, estimated_age).

# example data
df <- readr::read_csv("
age, estimated_age
12, 14
NA, 13
NA, NA
15, NA
")

# coalesce
df2 <- dplyr::mutate(df, new_age = dplyr::coalesce(age, estimated_age))
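
To make the point about argument order concrete, here is a small sketch on plain vectors. The values mirror the example data above, and the names prefer_actual and prefer_estimate are only illustrative:

```r
# Toy vectors mirroring the example data; NA marks a missing value
age           <- c(12, NA, NA, 15)
estimated_age <- c(14, 13, NA, NA)

# Prefer the actual age and fall back to the estimate
prefer_actual <- dplyr::coalesce(age, estimated_age)

# Prefer the estimate and fall back to the actual age
prefer_estimate <- dplyr::coalesce(estimated_age, age)

# Only the first row differs: it is the only row where both values are present
print(prefer_actual)
print(prefer_estimate)
```

Note that coalesce expects its inputs to have compatible types, so a character column generally needs converting before it can be combined with a numeric one.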

Part of the problem was that the age values were stored as a range always starting at zero, so a 24-year-old would have an age of 0-24. I couldn't use the dplyr::coalesce solution until I fixed that piece. Also, it didn't work with the mutate piece, so I took that off.

Below is what finally worked for me!

# Use actual age whenever it is present and estimated age when there isn't an actual age

# Remove hyphens from the age fields and store the columns as integers
main_df$Actual.Age <- as.integer(gsub("-", "", main_df$Actual.Age))
main_df$EstimatedAge <- as.integer(gsub("-", "", main_df$EstimatedAge))

# Use coalesce to create a new column that contains the new Age values
main_df$new_EstimatedAge <- dplyr::coalesce(main_df$Actual.Age, main_df$EstimatedAge)
# view(head(main_df$new_EstimatedAge, 30))
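
For anyone who wants to try the whole flow on toy data, here is a self-contained sketch. The data frame contents, the Age column name and the upper_bound helper are all hypothetical. It assumes ages arrive as text in the "0-<age>" form described above, with NA for missing values. Instead of deleting the hyphen, it keeps only what follows the last hyphen:

```r
# Hypothetical data in the "0-<age>" format; NA marks a missing value (an assumption)
main_df <- data.frame(
  Actual.Age   = c("0-24", NA, "0-7", NA),
  EstimatedAge = c("0-30", "0-45", "0-10", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only what follows the last hyphen, then convert to integer
upper_bound <- function(x) as.integer(sub("^.*-", "", x))

main_df$Actual.Age   <- upper_bound(main_df$Actual.Age)
main_df$EstimatedAge <- upper_bound(main_df$EstimatedAge)

# Actual age where present, otherwise the estimate; rows missing both stay NA
main_df$Age <- dplyr::coalesce(main_df$Actual.Age, main_df$EstimatedAge)
print(main_df)
```

For ranges that start at zero, both approaches give the same number, since "024" is read as 24. They differ only if a range ever started elsewhere: removing the hyphen from "10-24" would give 1024, while keeping the part after the hyphen gives 24.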
