
create multiple columns with each column as a sequence of numbers in R

Problem: I want to add three columns to my data frame, each holding a sequence of numbers, such that the columns vary against one another and every combination of their values appears. Here's an example data frame:

data <- read.table(text="
group1  group2  rate
A     D     0.01     
A     D     0.001
A     D     0.0001  
B     D     0.01    
B     D     0.001      
B     D     0.0001
D     A     0.01     
D     A     0.001
D     A     0.0001  
D     B     0.01    
D     B     0.001      
D     B     0.0001",
                   header=TRUE)

First, I extended my data frame to accommodate the combinations of numbers that I want for the 3 columns. I repeated it 125 times because each sequence has 5 values, which gives 5^3 = 125 combinations.

dataext <- data[rep(seq_len(nrow(data)), 125), ]

Then I created the new columns using the sequence of numbers that I want:

dataext$var1 <- rep_len (seq(0,1, 0.25), length.out=125)
dataext$var2 <- rep_len (seq(0,1, 0.25), length.out=125)
dataext$var3 <- rep_len (seq(0,1, 0.25), length.out=125)
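A short sketch of why this attempt falls short, assuming the same 12-row data frame (rebuilt here with data.frame() so the snippet is self-contained; the name n_combos is only illustrative). Because all three columns are filled from the same recycled vector, they are identical row by row, so only 5 of the 125 possible combinations ever appear.

```r
# Rebuild the example data so the snippet runs on its own
data <- data.frame(
  group1 = rep(c("A", "B", "D", "D"), each = 3),
  group2 = rep(c("D", "D", "A", "B"), each = 3),
  rate   = rep(c(0.01, 0.001, 0.0001), times = 4)
)
dataext <- data[rep(seq_len(nrow(data)), 125), ]

# Same assignments as in the attempt above; the 125 values are
# recycled to fill all 1500 rows
dataext$var1 <- rep_len(seq(0, 1, 0.25), length.out = 125)
dataext$var2 <- rep_len(seq(0, 1, 0.25), length.out = 125)
dataext$var3 <- rep_len(seq(0, 1, 0.25), length.out = 125)

# The three columns are copies of each other
identical(dataext$var1, dataext$var2)

# Number of distinct (var1, var2, var3) combinations actually present
n_combos <- nrow(unique(dataext[c("var1", "var2", "var3")]))
n_combos
```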

An example of my desired output is:

group1  group2  rate    var1    var2    var3
    A     D     0.01     0      0       0           
    A     D     0.001    0      0       0               
    A     D     0.0001   0      0       0
    A     D     0.01     0.25   0       0           
    A     D     0.001    0.25   0       0               
    A     D     0.0001   0.25   0       0
    A     D     0.01     0.25   0.25    0           
    A     D     0.001    0.25   0.25    0               
    A     D     0.0001   0.25   0.25    0
    A     D     0.01     0.25   0.25    0.25            
    A     D     0.001    0.25   0.25    0.25                
    A     D     0.0001   0.25   0.25    0.25

I hope this is clear enough. Any leads on how to do it right are greatly appreciated. Thanks!

I cannot comment yet to ask for clarification, but it appears that you want every combination between group1, group2, rate, var1, var2, and var3.

You can use expand.grid to achieve this.

data <- read.table(text="
group1  group2  rate
                   A     D     0.01     
                   A     D     0.001
                   A     D     0.0001  
                   B     D     0.01    
                   B     D     0.001      
                   B     D     0.0001
                   D     A     0.01     
                   D     A     0.001
                   D     A     0.0001  
                   D     B     0.01    
                   D     B     0.001      
                   D     B     0.0001",
                   header=TRUE)

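# factor() is needed because read.table() no longer creates factors by default (R >= 4.0)
g1 <- levels(factor(data$group1))
g2 <- levels(factor(data$group2))
r <- levels(factor(data$rate))
var1 <- var2 <- var3 <- factor(seq(0,1,0.25))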

dataout <- expand.grid(g1,g2,r,var1,var2,var3)

colnames(dataout) <- c("group1", "group2", "rate","var1","var2","var3")

View(dataout)
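One thing worth checking with this approach is the size of the result. The sketch below hard-codes the distinct values from the example data, which is an assumption made only to keep the snippet self-contained. It shows that crossing every column yields 3 x 3 x 3 x 125 rows and includes group pairs that never occur in the original data.

```r
# Distinct values taken from the example data (hard-coded for illustration)
g1 <- c("A", "B", "D")
g2 <- c("A", "B", "D")
r  <- c(0.0001, 0.001, 0.01)
vals <- seq(0, 1, 0.25)

# Naming the arguments sets the column names directly
dataout <- expand.grid(group1 = g1, group2 = g2, rate = r,
                       var1 = vals, var2 = vals, var3 = vals,
                       stringsAsFactors = FALSE)

nrow(dataout)  # 3 * 3 * 3 * 125 = 3375

# The full cross also contains pairs such as A/A, which the
# original data does not have
any(dataout$group1 == dataout$group2)
```

If only the 12 existing group1/group2/rate rows should be kept, the full cross is larger than needed.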

If you only want the specific combinations of group1, group2, and rate that already exist in your data, you can add a key column identifying each of those rows and run expand.grid on the key instead:

data <- read.table(text="
group1  group2  rate
                   A     D     0.01     
                   A     D     0.001
                   A     D     0.0001  
                   B     D     0.01    
                   B     D     0.001      
                   B     D     0.0001
                   D     A     0.01     
                   D     A     0.001
                   D     A     0.0001  
                   D     B     0.01    
                   D     B     0.001      
                   D     B     0.0001",
                   header=TRUE)
dataext <- data[rep(seq_len(nrow(data)), 125), ]

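# Row index identifying each existing group1/group2/rate combination
data$key <- seq_len(nrow(data))

# var1, var2, var3 are the factors defined in the previous snippet
dataout2 <- expand.grid(data$key,var1,var2,var3)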
colnames(dataout2) <- c("key","var1","var2","var3")

datafin <- cbind(dataext,dataout2[2:4])

View(datafin)
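As an alternative sketch (not part of the answer above), base R's merge() returns the Cartesian product of two data frames when there are no columns to join on, which pairs each existing row with every combination in one step. The names combos and vals are illustrative, and the data frame is rebuilt so the snippet runs on its own.

```r
# Rebuild the example data so the snippet runs on its own
data <- data.frame(
  group1 = rep(c("A", "B", "D", "D"), each = 3),
  group2 = rep(c("D", "D", "A", "B"), each = 3),
  rate   = rep(c(0.01, 0.001, 0.0001), times = 4)
)

# All 5^3 = 125 combinations of the three sequences
vals <- seq(0, 1, 0.25)
combos <- expand.grid(var1 = vals, var2 = vals, var3 = vals)

# by = NULL means "no join columns", so merge() crosses every row of
# data with every row of combos
datafin <- merge(data, combos, by = NULL)

nrow(datafin)  # 12 * 125 = 1500
```

This keeps var1, var2, and var3 numeric rather than factors, which may be more convenient for plotting or modelling.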

So, I forgot that expand.grid can generate all combinations of column values. Here's how I got the data frame that I want.

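# All 5^3 = 125 combinations of the three sequences
a <- list(var1 = seq(0, 1, 0.25), var2 = seq(0, 1, 0.25), var3 = seq(0, 1, 0.25))
combos <- expand.grid(a)
# Repeat the combinations 12 times, once per row of the original data
expv <- combos[rep(seq_len(nrow(combos)), 12), ]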

dataext$var1 <- expv$var1
dataext$var2 <- expv$var2
dataext$var3 <- expv$var3

I checked the resulting data frame manually, and I also plotted it.

library(reshape2)  # provides melt()
library(ggplot2)

datamelt <- melt(dataext, id.vars = c("group1", "group2", "rate"), value.name = "val", variable.name = "varsname")

ggplot(datamelt, aes(x=as.factor(rate), y=val, color=varsname)) + geom_point(position=position_jitterdodge()) + facet_grid(group1~group2)

I think it worked. :)
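A programmatic check can complement the plot. The sketch below rebuilds the same objects (with illustrative names combos and counts) and confirms that every original row is paired with each of the 125 combinations exactly once. Note that the two rep() calls cycle at different lengths (12 and 125), and the pairing is complete here because those two numbers share no common factor. With, say, 10 data rows the same code would likely produce duplicated pairs and miss others.

```r
# Rebuild the example data so the snippet runs on its own
data <- data.frame(
  group1 = rep(c("A", "B", "D", "D"), each = 3),
  group2 = rep(c("D", "D", "A", "B"), each = 3),
  rate   = rep(c(0.01, 0.001, 0.0001), times = 4)
)
dataext <- data[rep(seq_len(nrow(data)), 125), ]

vals <- seq(0, 1, 0.25)
combos <- expand.grid(var1 = vals, var2 = vals, var3 = vals)
expv <- combos[rep(seq_len(nrow(combos)), nrow(data)), ]

dataext$var1 <- expv$var1
dataext$var2 <- expv$var2
dataext$var3 <- expv$var3

# No row should appear twice (anyDuplicated() returns 0 if none do)
anyDuplicated(dataext)

# Each original group1/group2/rate row should occur 125 times
counts <- table(paste(dataext$group1, dataext$group2, dataext$rate))
all(counts == 125)
```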
