Nested ifelse statement within a for loop

Question

I am trying to use a nested ifelse statement within a for loop to create a new variable whose values are based on the frequency of occurrence of a factor variable (a list of postcodes).

The new variable should return a predefined series of numbers based on the frequency of a postcode (frequencies range between 1 and 4). Each of these number series must end in 800 and increase in increments of 200, and its starting point depends on the frequency of the postcode: the higher the frequency, the lower the starting value.

For this I have defined a for loop in which I first measure the frequency of each postcode, followed by a nested ifelse statement that specifies the series of numbers to be allocated to NewVar based on that frequency.

A small, intuitive example of what I want to achieve is shown below. I want to apply this to a dataframe containing millions of postcodes.

DESIRED RESULT:

Postcode  NewVar
AA        600
AA        800
BB        400
BB        600
BB        800
CC        800
DD        200
DD        400
DD        600
DD        800

CODE:

DF$NewVar <- 0

DF$NewVar <- for (i in levels(DF$Postcode[i]))
ifelse((table(DF$Postcode[i]) == 4), DF$NewVar[i] <- c(200,400,600,800),
  (ifelse ((table(DF$Postcode[i]) == 3), DF$NewVar[i] <- c(400,600,800),
    (ifelse ((table(DF$Postcode[i]) == 2), DF$NewVar[i] <- c(600,800), 
      DF$NewVar[i] <- c(800))))))

PROBLEM 1:

Firstly, when running the entire code, I receive an error stating that the number of rows in the replacement does not match the number of rows in the data. When I check this manually, I cannot find such a mismatch (the reported mismatch is always exactly 1 row).

Error in `$<-.data.frame`(`*tmp*`, NewVar, value = c("0", "0", "0",  : 
replacement has 11 rows, data has 10.

PROBLEM 2:

TESTING IF AN IFELSE WORKS ON ITS OWN (OUT OF THE LOOP):

When verifying whether the ifelse clause works on its own (outside of the loop), I see that only the starting increment of 200 is copied to each line of NewVar, so it does not increment up to 800. This is not what I want to achieve either:

CODE TESTING ONE IFELSE:

DF$NewVar[1:2] <- ifelse((sum(table(DF$Postcode)) == 2),                       
  DF$NewVar[1:2] <- c(600,800), "NA")

RESULT (not desired):

Postcode  NewVar
AA        200
AA        200

DESIRED RESULT:

Postcode  NewVar
AA        200
AA        400

Note: I predefined the NewVar column before trying to allocate the values, and I have already checked for NAs as well.

Thank you in advance for your time.

Answer 1

One way, if you're willing to use dplyr:

library(dplyr)
DF <- structure(list(Postcode = c("AA", "AA", "BB", "BB", "BB", "CC", 
"DD", "DD", "DD", "DD")), class = "data.frame", row.names = c(NA, 
-10L))

vals <- c(200,400,600,800)
DF %>% group_by(Postcode) %>% mutate(NewVar = tail(vals,n()))
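As a rough illustration of why tail() does the job where the ifelse() attempt does not, here is a minimal base R sketch. It assumes the same vals vector as above; the names single and series are made up for this example. ifelse() returns a result shaped like its test, so a length-one test yields a single value, which is then recycled on assignment. tail(vals, k) returns the last k elements, so each series ends in 800 and starts lower as the group size grows.

```r
vals <- c(200, 400, 600, 800)

# ifelse() returns one element per element of its test, so a
# length-one test picks only the first element of c(600, 800).
single <- ifelse(TRUE, c(600, 800), NA)
single
# → 600

# tail(vals, k) keeps the last k elements of vals, so every series
# ends in 800 regardless of the group size k.
series <- lapply(1:4, function(k) tail(vals, k))
series[[2]]
# → 600 800
series[[4]]
# → 200 400 600 800
```

Inside mutate(), n() supplies the group size k, which is how each postcode gets the series matching its frequency.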

Answer 2

For the sake of completeness, here is a base R solution which uses the ave() function.

Let's assume Postcode is a vector of postcodes in random order:

Postcode

  [1] "BB" "CC" "CC" "BB" "BB" "AA" "CC" "BB" "AA" "DD"

The code below creates a data.frame containing Postcode and NewVar:

data.frame(
  Postcode, 
  NewVar = ave(Postcode, Postcode, 
               FUN = function(x) seq(to = 800, by = 200, length.out = length(x)))
)

   Postcode NewVar
1        BB    200
2        CC    400
3        CC    600
4        BB    400
5        BB    600
6        AA    600
7        CC    800
8        BB    800
9        AA    800
10       DD    800
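One detail worth noting: ave() writes the group results back into its first argument, so passing the character vector Postcode as that argument means NewVar comes back as character strings rather than numbers. A possible variant, sketched here under the assumption that a numeric column is wanted, passes seq_along(Postcode) as the first argument instead. The name DF2 is made up for this example, and Postcode is retyped from the printed vector above.

```r
# Postcode vector as printed above
Postcode <- c("BB", "CC", "CC", "BB", "BB", "AA", "CC", "BB", "AA", "DD")

# Use a numeric placeholder (row indices) as the first argument of ave()
# so the per-group series are stored as numbers, not as character strings.
DF2 <- data.frame(
  Postcode,
  NewVar = ave(seq_along(Postcode), Postcode,
               FUN = function(x) seq(to = 800, by = 200, length.out = length(x)))
)

is.numeric(DF2$NewVar)
# → TRUE
```

The values are the same as in the printed result; only the column type differs. Alternatively, as.numeric() could be applied to the character result.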

Data

# create data
library(magrittr)   # only used to improve readability
n_codes <- 4L
set.seed(1L)
Postcode <- 
  stringr::str_dup(LETTERS[1:n_codes], 2L) %>% # create codes
  rep(times = sample(n_codes)) %>%             # replicate randomly
  sample()                                     # re-order randomly

2 answers

Answer 1
1, accepted, 2019-01-10 20:47:52

Answer 2
0, 2019-01-13 10:43:53
