
R iterate over a data frame to add a new column with sequential values

Here is my data frame "data.frame"

    X   Y
1   10  12
2   20  22
3   30  32

Here is what I want:
1) add a new column named "New_col"
2) for each row, New_col holds the sequence from the X value to the Y value (step of 1).

    X   Y   New_col
1   10  12  10
            11
            12
2   20  22  20
            21
            22
3   30  32  30
            31
            32

Then fill the empty cells

    X   Y   New_col
1   10  12  10
1   10  12  11
1   10  12  12
2   20  22  20
2   20  22  21
2   20  22  22
3   30  32  30
3   30  32  31
3   30  32  32

I tried the following:

  New_col<-seq(from = data.frame$X, to = data.frame$Y, by = 1)

The problem is that this code computes the sequence only for the first row. Then I tried a loop:

for (i in 1: length(data.frame$X))
{
  New_col <-seq(from = data.frame$X, to = data.frame$Y, by = 1)
}

This is the error I got:

Error in seq.default(from = data.frame$X, to = data.frame$Y, by = 1) :
'from' must be of length 1

Thank you for your help.

You can use apply:

do.call(rbind, apply(dat, 1, function(x) 
                      data.frame(X = x[1], Y = x[2], New_col = seq(x[1], x[2]))))

where dat is the name of your data frame. The warnings (about row names being discarded) can be ignored.

     X  Y New_col
1.1 10 12      10
1.2 10 12      11
1.3 10 12      12
2.1 20 22      20
2.2 20 22      21
2.3 20 22      22
3.1 30 32      30
3.2 30 32      31
3.3 30 32      32
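For comparison, here is a minimal base R sketch that avoids the warnings. It first repairs the loop from the question: seq() needs a single 'from' and a single 'to', so each iteration has to index row i. It then shows a vectorized equivalent. The names pieces, res_loop, res_vec and n are made up for this illustration, and the vectorized part assumes whole-number values with Y >= X in every row.

```r
# Example data from the question (named dat, as in the answer above)
dat <- data.frame(X = c(10, 20, 30), Y = c(12, 22, 32))

# Loop version: index row i so seq() gets one 'from' and one 'to' per call
pieces <- vector("list", nrow(dat))
for (i in seq_len(nrow(dat))) {
  pieces[[i]] <- data.frame(
    X = dat$X[i],
    Y = dat$Y[i],
    New_col = seq(from = dat$X[i], to = dat$Y[i], by = 1)
  )
}
res_loop <- do.call(rbind, pieces)

# Vectorized version: repeat each row once per element of its sequence
# (assumes whole numbers and Y >= X, so each sequence has Y - X + 1 elements)
n <- dat$Y - dat$X + 1
res_vec <- dat[rep(seq_len(nrow(dat)), times = n), ]
res_vec$New_col <- unlist(Map(function(a, b) seq(a, b, by = 1), dat$X, dat$Y))
rownames(res_vec) <- NULL

print(res_loop)
```

Both results should contain the same nine rows as the output above, with plain row names 1 to 9 instead of 1.1, 1.2 and so on.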

This is a good use case for the data.table package (which you would have to install first):

dat = read.table(text="    X   Y
1   10  12
2   20  22
3   30  32")

library(data.table)
dt = as.data.table(dat)

Once you've got your data table set up, by makes this operation easy:

dt2 = dt[, list(New_col=seq(X, Y)), by=c("X", "Y")]
#     X  Y New_col
# 1: 10 12      10
# 2: 10 12      11
# 3: 10 12      12
# 4: 20 22      20
# 5: 20 22      21
# 6: 20 22      22
# 7: 30 32      30
# 8: 30 32      31
# 9: 30 32      32

(The only caveat is that this will not work as intended if there are duplicate (X, Y) pairs in your original data frame: by puts duplicates into the same group, so each distinct pair is expanded only once.)
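If duplicate pairs are a concern, one possible workaround is to group by a row identifier as well. This is only a sketch: the id column and the three-row example data (with a deliberately duplicated pair) are assumptions added for illustration, not part of the original data.

```r
library(data.table)

# Hypothetical data with a duplicated (X, Y) pair in rows 1 and 3
dt <- data.table(X = c(10, 20, 10), Y = c(12, 22, 12))

# Add a row id (hypothetical column name) so every original row is its own group
dt[, id := .I]

# Grouping by id as well as X and Y keeps duplicate pairs separate
dt2 <- dt[, list(New_col = seq(X, Y)), by = c("id", "X", "Y")]
print(dt2)
```

The result should have nine rows, three per original row, and the id column can be dropped afterwards with dt2[, id := NULL].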

