Replacing a value in column X based on column Y in R

I've gone through several answers and tried the following, but each attempt either yields an error or an unwanted result.

Here's the data:

Network                 Campaign
Moburst_Chartboost      Test Campaign
Moburst_Chartboost      Test Campaign 
Moburst_Appnext         unknown
Moburst_Appnext         1065

I'd like to replace "Test Campaign" with "1055" whenever "Network" == "Moburst_Chartboost". I realize this should be very simple, but I tried the following:

dataset = read.csv('C:/Users/User/Downloads/example.csv')
for( i in 1:nrow(dataset)){
  if(dataset$Network == 'Moburst_Chartboost') dataset$Campaign <- '1055'
}

This yields the following warnings:

Warning messages:

1: In if (dataset$Network == "Moburst_Chartboost") dataset$Campaign <- "1055" :
  the condition has length > 1 and only the first element will be used
2: In if (dataset$Network == "Moburst_Chartboost") dataset$Campaign <- "1055" :
  the condition has length > 1 and only the first element will be used
etc.

Then I tried:

within(dataset, {
  dataset$Campaign <- ifelse(dataset$Network == 'Moburst_Chartboost', '1055', dataset$Campaign)
})

This turned all 4 values in the "Campaign" column into "1055", overwriting what was there even where the condition isn't met.

I also tried this:

dataset$Campaign[which(dataset$Network == 'Moburst_Chartboost')] <- 1055

This yields the following warning and replaced the values in the first two rows of "Campaign" with NA:

Warning message:
In `[<-.factor`(`*tmp*`, which(dataset$Network == "Moburst_Chartboost"),  :
  invalid factor level, NA generated

Scratching my head here. I'm new to R, but this shouldn't be so hard :(

Try the following:

dataset = read.csv('C:/Users/User/Downloads/example.csv', stringsAsFactors = F)
for( i in 1:nrow(dataset)){
  if(dataset$Network[i] == 'Moburst_Chartboost') dataset$Campaign[i] <- '1055'
}

It seems you forgot the index variable. Without [i] you operate on the whole column vector of the data frame, which results in the warning you mentioned. Note that I added stringsAsFactors = F to the read.csv() call to make sure the strings are read as character strings and not as factors. With factors, the assignment would produce a warning like this:

In `[<-.factor`(`*tmp*`, i, value = c(NA, 2L, 3L, 1L)) :
invalid factor level, NA generated
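To illustrate the factor issue, here is a minimal sketch. It assumes a small in-memory data frame (built with data.frame() as a stand-in for the CSV file) whose Campaign column is a factor. Converting that column with as.character() before assigning is one way to avoid the NA:

```r
# Hypothetical stand-in for the CSV data, with Campaign forced to be a factor
dataset <- data.frame(
  Network  = c("Moburst_Chartboost", "Moburst_Chartboost",
               "Moburst_Appnext", "Moburst_Appnext"),
  Campaign = factor(c("Test Campaign", "Test Campaign", "unknown", "1065")),
  stringsAsFactors = FALSE
)

# "1055" is not among the existing levels, so assigning it directly to the
# factor would give NA with an "invalid factor level" warning.
levels(dataset$Campaign)

# Convert to character first; after that any string can be assigned.
dataset$Campaign <- as.character(dataset$Campaign)
dataset$Campaign[dataset$Network == "Moburst_Chartboost"] <- "1055"
dataset$Campaign
```

Reading the file with stringsAsFactors = FALSE, as above, sidesteps the conversion step entirely.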

Alternatively, you can do the following without using a for loop:

idx <- which(dataset$Network == 'Moburst_Chartboost')
dataset$Campaign[idx] <- '1055'

Here, idx is a vector containing the positions where Network has the value 'Moburst_Chartboost'.
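As a small illustration of what idx holds, the sketch below rebuilds the four example rows in memory (an assumption, in place of reading the CSV) and inspects the positions. It also shows that the logical vector produced by the comparison can be used as the index directly, so which() is optional in this case:

```r
# In-memory copy of the example rows (assumed; the original reads a CSV)
dataset <- data.frame(
  Network  = c("Moburst_Chartboost", "Moburst_Chartboost",
               "Moburst_Appnext", "Moburst_Appnext"),
  Campaign = c("Test Campaign", "Test Campaign", "unknown", "1065"),
  stringsAsFactors = FALSE
)

# Integer positions of the matching rows
idx <- which(dataset$Network == "Moburst_Chartboost")
idx

# The comparison itself is a logical vector, one element per row,
# and can index the column without which()
is_chartboost <- dataset$Network == "Moburst_Chartboost"
dataset$Campaign[is_chartboost] <- "1055"
dataset
```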

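In your first attempt, the loop runs over the rows, but neither the condition nor the assignment is indexed by i, so every iteration tests the whole Network column and assigns to the whole Campaign column, when you only want to change individual values in the 2nd column.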

In your second attempt, you're trying to assign the value "1055" to all of the 2nd column.

The way to think about it is as an if/else: if the condition in column 1 is met, column 2 is changed; otherwise it remains the same.

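dataset <- data.frame(Network = c("Moburst_Chartboost", "Moburst_Chartboost", 
                              "Moburst_Appnext", "Moburst_Appnext"),
                  Campaign = c("Test Campaign", "Test Campaign",
                               "unknown", "1065"),
                  # keep strings as character so ifelse() returns the labels, not factor codes
                  stringsAsFactors = FALSE)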

dataset$Campaign <- ifelse(dataset$Network == "Moburst_Chartboost",
                       "1055",
                       dataset$Campaign)

head(dataset)
             Network Campaign
1 Moburst_Chartboost     1055
2 Moburst_Chartboost     1055
3    Moburst_Appnext  unknown
4    Moburst_Appnext     1065
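If only the "Test Campaign" entries on that network should change, the same ifelse() pattern can take a combined condition. The following is a sketch under assumptions: the data is a hypothetical variation of the example (row 2 has a made-up campaign name, and row 3 is a "Test Campaign" on the other network) so that the effect of each condition is visible.

```r
# Hypothetical variation of the example data (values in rows 2 and 3 are
# invented for illustration)
dataset <- data.frame(
  Network  = c("Moburst_Chartboost", "Moburst_Chartboost",
               "Moburst_Appnext", "Moburst_Appnext"),
  Campaign = c("Test Campaign", "Launch", "Test Campaign", "1065"),
  stringsAsFactors = FALSE
)

# Both conditions must hold for a row to be replaced
replace_row <- dataset$Network == "Moburst_Chartboost" &
  dataset$Campaign == "Test Campaign"

# Rows that do not match keep their current Campaign value
dataset$Campaign <- ifelse(replace_row, "1055", dataset$Campaign)
dataset
```

Only the first row is replaced here; the other Chartboost campaign and the "Test Campaign" on Moburst_Appnext are left alone.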

You may also try dataset$Campaign[dataset$Campaign=="Test Campaign"]<-1055 to avoid the use of loops and ifelse statements. Note that this matches on the Campaign value itself rather than on Network.

where dataset is:

dataset <- data.frame(Network = c("Moburst_Chartboost", "Moburst_Chartboost", 
                              "Moburst_Appnext", "Moburst_Appnext"),
                  Campaign = c("Test Campaign", "Test Campaign",
                               "unknown", 1065))

Thank you for the help! It's not elegant, but since this lingered with me when going to sleep last night, I decided to try to bludgeon it with some ugly code, and that worked too. Just as a workaround: I separated the data into two data frames, replaced all the values in one, and then bound them back together.

# subsetting only chartboost    
chartboost <- subset(dataset, dataset$Network=='Moburst_Chartboost')
# replace all values in Campaign
chartboost$Campaign <-sub("^.*", "1055",chartboost$Campaign)
#subsetting only "not chartboost"
notChartboost <-subset(dataset, dataset$Network!='Moburst_Chartboost')
# binding back to single dataframe
newSet <- rbind(chartboost, notChartboost)
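One thing to be aware of with this split-and-bind approach is that it can change the row order, because all Chartboost rows end up first. The sketch below assumes a hypothetical interleaved version of the example data to make that visible, and compares it with replacing in place:

```r
# Hypothetical data with the two networks interleaved
dataset <- data.frame(
  Network  = c("Moburst_Chartboost", "Moburst_Appnext",
               "Moburst_Chartboost", "Moburst_Appnext"),
  Campaign = c("Test Campaign", "unknown", "Test Campaign", "1065"),
  stringsAsFactors = FALSE
)

# Split, replace, and bind back, as in the workaround
chartboost <- subset(dataset, Network == "Moburst_Chartboost")
chartboost$Campaign <- sub("^.*", "1055", chartboost$Campaign)
notChartboost <- subset(dataset, Network != "Moburst_Chartboost")
newSet <- rbind(chartboost, notChartboost)

# The Chartboost rows now come first
newSet$Network

# Replacing in place leaves every row where it was
inPlace <- dataset
inPlace$Campaign[inPlace$Network == "Moburst_Chartboost"] <- "1055"
inPlace$Network
```

If the original order matters, the in-place assignment avoids having to re-sort afterwards.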

Ugly as a duckling, but it worked :)
