I have a data frame like this:
#dt
# one two row MAX_row three four
#1: a 1 0 2 yes yes
#2: a 2 2 2 yes yes
#3: a 3 0 2 no yes
#4: b 1 0 5 yes no
#5: b 2 5 5 no no
#6: b 3 0 5 no no
To create the variables row and MAX_row, I wrote the following code:
dt$row <-ifelse(dt$two == 2,rownames(dt), 0)
dt <- dt %>% group_by(one) %>% mutate(MAX_row=max(row))
What I'm trying to do now is fill column four with values taken from column three, using the row numbers given in column MAX_row. So, for rows with 'a' in column one, column four should hold the value from row 2 of column three, as shown in dt. I thought the following code would work, but it produces odd values:
dt$four <- ifelse(dt$one=='a',dt$three[dt$MAX_row],0)
Any ideas?
If I understand correctly, you start with three columns, one, two and three, and row and MAX_row are temporary variables created only to arrive at four.
We can get the expected output without the need to create those variables.
library(dplyr)

# Within each group of one, take the value of three at the first row where two == 2
df %>%
  group_by(one) %>%
  mutate(four = three[which.max(two == 2)])
# one two three four
# <fct> <int> <fct> <fct>
#1 a 1 yes yes
#2 a 2 yes yes
#3 a 3 no yes
#4 b 1 yes no
#5 b 2 no no
#6 b 3 no no
This still gives your expected output without creating row and MAX_row.
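One thing worth keeping in mind about the which.max(two == 2) trick: it returns the position of the first TRUE, but if a group has no row with two == 2 it quietly returns 1 instead of failing. Below is a minimal base R sketch of that behaviour, with match() as one possible alternative that yields NA in that case. The vectors two, two_none and three are made up here for illustration and are not part of the original data.

```r
# Hypothetical single-group vectors, for illustration only
two   <- c(1L, 2L, 3L)
three <- c("yes", "yes", "no")

# two == 2 is FALSE TRUE FALSE, so which.max() gives the first TRUE
which.max(two == 2)            # 2
three[which.max(two == 2)]     # "yes"

# A group that contains no 2 at all (assumed edge case)
two_none <- c(1L, 3L, 4L)

# All FALSE: which.max() still returns position 1
which.max(two_none == 2)       # 1

# match() returns NA when the value is absent, so the lookup is NA too
match(2L, two_none)            # NA
three[match(2L, two_none)]     # NA
```

If every group is guaranteed to have exactly one row with two == 2, as in the example data, both approaches give the same result.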
data
df <- structure(list(one = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label
= c("a",
"b"), class = "factor"), two = c(1L, 2L, 3L, 1L, 2L, 3L), three =
structure(c(2L,
2L, 1L, 2L, 1L, 1L), .Label = c("no", "yes"), class = "factor")),
row.names = c("1:",
"2:", "3:", "4:", "5:", "6:"), class = "data.frame")
Best not to mix data.table and dplyr syntax. Since dt appears to be a data.table, here is a data.table solution:
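# Record the row number where two == 2 (0 elsewhere), take the maximum per group,
# then look up three at that row. The ifelse(one == "a", ..., 0) wrapper is dropped
# so that group b is filled as well, matching the output below.
dt[
  , row := ifelse(two == 2, .I, 0)][
  , MAX_row := max(row), by = one][
  , four := three[MAX_row]]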
# one two row MAX_row three four
#1: a 1 0 2 yes yes
#2: a 2 2 2 yes yes
#3: a 3 0 2 no yes
#4: b 1 0 5 yes no
#5: b 2 5 5 no no
#6: b 3 0 5 no no
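As for why the original attempt gave odd values, one likely contributor (I can't be certain without seeing the actual output) is that rownames() returns character, so row and MAX_row end up as character and the lookup into three happens by name rather than by position. Here is a small base R sketch of that effect. It assumes plain character columns and a toy data frame rebuilt from the question; row_chr, row_int and max_row are names made up for illustration.

```r
# Toy data rebuilt from the question (assumed to be plain character columns)
dt <- data.frame(one   = rep(c("a", "b"), each = 3),
                 two   = rep(1:3, times = 2),
                 three = c("yes", "yes", "no", "yes", "no", "no"),
                 stringsAsFactors = FALSE)

# rownames() is character, so ifelse() returns a character vector ("0", "2", ...)
row_chr <- ifelse(dt$two == 2, rownames(dt), 0)
class(row_chr)                 # "character"

# Indexing an unnamed vector by character looks up names, which gives NA
dt$three[row_chr]

# Integer row positions index by position instead
row_int <- ifelse(dt$two == 2, seq_len(nrow(dt)), 0L)
max_row <- ave(row_int, dt$one, FUN = max)   # per-group maximum
four    <- dt$three[max_row]
four                           # "yes" "yes" "yes" "no" "no" "no"
```

This is why .I, which is integer, behaves better than rownames(dt) in the data.table version.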
Or all in one step, avoiding the creation of row and MAX_row (as highlighted by Ronak):
dt[, four := three[which.max(two == 2)], by = one]
# one two row MAX_row three four
#1: a 1 0 2 yes yes
#2: a 2 2 2 yes yes
#3: a 3 0 2 no yes
#4: b 1 0 5 yes no
#5: b 2 5 5 no no
#6: b 3 0 5 no no