R: extract a value from a given row and column of a data frame

I've got a data frame like this:

#dt
#   one two row MAX_row three four
#1: a   1   0   2       yes   yes
#2: a   2   2   2       yes   yes
#3: a   3   0   2       no    yes
#4: b   1   0   5       yes   no
#5: b   2   5   5       no    no
#6: b   3   0   5       no    no

To create the variables row and MAX_row, I used the following code:

dt$row <-ifelse(dt$two == 2,rownames(dt), 0)
dt <- dt %>% group_by(one) %>% mutate(MAX_row=max(row))

What I'm trying to do now is fill column four with values from column three, taken from the row whose number is given in column MAX_row. So for the rows with 'a' in column one, column four should hold the value of column three at row 2, as shown in dt above. I thought the following code would work, but it produces odd values:

dt$four <- ifelse(dt$one=='a',dt$three[dt$MAX_row],0)

Any ideas?

If I understand correctly, you start with three columns, one, two and three, and row and MAX_row are temporary variables created only to arrive at four.

We can get the expected output without the need to create those variables.

library(dplyr)

df %>%
  group_by(one) %>%
  mutate(four = three[which.max(two == 2)])

#  one     two three four 
#  <fct> <int> <fct> <fct>
#1  a         1 yes   yes  
#2  a         2 yes   yes  
#3  a         3 no    yes  
#4  b         1 yes   no   
#5  b         2 no    no   
#6  b         3 no    no   

This still gives your expected output without creating row and MAX_row .
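
To see why three[which.max(two == 2)] picks the right value, it may help to look at a single group in isolation. The following is a minimal base-R sketch, assuming the values of group "a" are held in plain vectors (two_a, three_a and the other names are illustrative only, not part of the original data). two == 2 gives a logical vector, and which.max() returns the position of its first TRUE.

```r
# Values for group "a" only (hypothetical stand-alone vectors for illustration)
two_a   <- c(1L, 2L, 3L)
three_a <- c("yes", "yes", "no")

# Logical vector marking where two equals 2
hit <- two_a == 2

# which.max() returns the index of the first maximum;
# for a logical vector that is the first TRUE
idx <- which.max(hit)

# mutate() recycles the single looked-up value over the group;
# rep() mimics that here
four_a <- rep(three_a[idx], length(two_a))

# Caveat: if no element equals 2, every value is FALSE and which.max()
# returns 1, whereas match() returns NA in that case
idx_none   <- which.max(c(1L, 3L, 4L) == 2)
match_none <- match(2L, c(1L, 3L, 4L))
```

If a group might contain no row with two == 2, match(2, two) may be the safer lookup, since it yields NA instead of silently pointing at the first row.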

data

df <- structure(list(
  one = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("a", "b"), class = "factor"),
  two = c(1L, 2L, 3L, 1L, 2L, 3L),
  three = structure(c(2L, 2L, 1L, 2L, 1L, 1L), .Label = c("no", "yes"), class = "factor")),
  row.names = c("1:", "2:", "3:", "4:", "5:", "6:"), class = "data.frame")

It is best not to mix data.table and dplyr syntax. Since dt appears to be a data.table, here is a data.table solution:

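# Chain three by-reference updates; the leading comma in each [ ] leaves i empty
dt[
    , row := ifelse(two == 2, .I, 0)][      # .I is the integer row number
    , MAX_row := max(row), by = one][       # largest marked row within each group
    , four := three[MAX_row]]               # look up three at that row, for every group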
#   one two row MAX_row three four
#1:   a   1   0       2   yes  yes
#2:   a   2   2       2   yes  yes
#3:   a   3   0       2    no  yes
#4:   b   1   0       5   yes   no
#5:   b   2   5       5    no   no
#6:   b   3   0       5    no   no

Or all in one step, without generating row and MAX_row (as highlighted by Ronak):

dt[, four := three[which.max(two == 2)], by = one]
#   one two row MAX_row three four
#1:   a   1   0       2   yes  yes
#2:   a   2   2       2   yes  yes
#3:   a   3   0       2    no  yes
#4:   b   1   0       5   yes   no
#5:   b   2   5       5    no   no
#6:   b   3   0       5    no   no
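
As for why the attempt in the question returns odd values, one likely cause (an assumption, since the question does not show the actual result) is that rownames(dt) is a character vector, so row and MAX_row end up as character too. Indexing three with a character vector looks up element names rather than positions, which gives NA for an unnamed vector. A minimal base-R sketch with hypothetical stand-alone vectors:

```r
# Hypothetical stand-alone vectors mirroring group "a" of the question
two   <- c(1L, 2L, 3L)
three <- c("yes", "yes", "no")

# rownames() returns character, so ifelse() yields a character vector here;
# as.character(seq_along(two)) stands in for rownames(dt)
row_chr <- ifelse(two == 2, as.character(seq_along(two)), 0)

# max() on character compares strings, and the result is still character
max_chr <- max(row_chr)

# Character index: matched against names, not positions,
# so an unnamed vector gives NA
by_name <- three[max_chr]

# Integer index: picks the element at that position
by_pos <- three[as.integer(max_chr)]
```

The data.table answer above uses .I, which is an integer, so this issue does not arise there.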
