I would like to be able to create a new data frame with 6 columns from an existing data frame with 4 columns. The two extra columns should hold the values of the loop counters (i and j) at the point each row was added.
My draft code is as follows, where
a is binary,
b is categorical,
c is a number (in this case 1 to 200),
d is a number (in this example 1 to 5, in real life 1 to 2500).
#### make an example of mydata
a<- c(0,0,0,0,0,0,0,0,0,0,1,1,0,1)
b<- c("a","b","a","b","b","c","a","e","c","a","a","b","d","f")
c<- c(20,30,40,40,54,76,23,23,78,23,34,1,88,1)
d<- c(1,1,1,2,2,2,3,3,4,5,5,5,5,5)
mydata<-data.frame(a,b,c,d)
## generate random site numbers, used to randomly
## select rows to bind together later
set.seed(1)
choose.test<- data.frame(matrix(NA, nrow = 20, ncol = 30))
for (i in 1:20)
{
choose.test[,i]<-sample(5, 20, replace = TRUE, prob = NULL)
# random selection of sites WITH replacement
}
# this is the bit I am having trouble with
data<- NULL
for( j in 1:10){
for (i in choose.test[,j])
{ data <- rbind(data, mydata[mydata[,4]== i,])
data[,5]<-j
data[,6]<-i
}}
It would also be acceptable to create separate data frames at each loop iteration (in the second loop, using i as a counter), and I am open to better suggestions as I am new to R. I also tried using assign to do this, with no luck.
At each iteration I need to rbind together all the rows whose value in column 4 equals a random number between 1 and 5 (in this example; in real life it will be between 1 and 2500 sites). These random numbers are stored in a data frame called choose.test, where the random numbers in each column are used only once, and then the next iteration moves on to the next column.
Without the lines data[,5]<-j and data[,6]<-i it does almost what I want, but I would really like to have a 5th and 6th column that identify which iteration of the i and j loops each row came from, so that I can analyse the data at each iteration (I am bootstrapping with this data). Clearly the code above does not work, but I am not sure how to get it to do what I want. In the current version it just writes the last counter values to all rows in columns 5 and 6.
Many thanks, Ben
The following code fixed my problem:
data<- NULL
for( j in 1:10){
for (i in choose.test[,j])
{ data <- rbind(data, cbind(mydata[mydata[,4]== i,], i=i, j=j))}}
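As far as I understand it, the original attempt failed because an assignment such as data[,5]<-j writes the current counter into every row of that column, including rows bound in earlier iterations, so only the last value survives. Tagging each chunk with cbind before it is bound avoids that. A minimal sketch on toy data (df, out and k are made-up names for illustration, not part of my real code):

```r
# Whole-column assignment: each pass overwrites column 2 for ALL rows
df <- data.frame(x = 1:3)
for (k in 1:3) {
  df[, 2] <- k          # replaces the values written in earlier passes
}
# every row of column 2 now holds the last value of k

# Tagging before binding: each chunk keeps the counter it was created with
out <- NULL
for (k in 1:3) {
  chunk <- data.frame(x = k * 10)
  out <- rbind(out, cbind(chunk, k = k))   # k is attached to this chunk only
}
# out$k now differs per row, one value per pass
```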
Credit goes to MrFlick for providing a useful comment!
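Since the real data will have up to 2500 sites, growing a data frame with rbind inside the loop may become slow. One possible alternative, which I have only sketched on the example data, is to collect the tagged chunks in a list and bind them once at the end. The names pieces and rows are my own, and the sketch assumes every sampled site number occurs at least once in column d (otherwise cbind would fail on an empty selection).

```r
# Rebuild the example inputs so this block runs on its own
a <- c(0,0,0,0,0,0,0,0,0,0,1,1,0,1)
b <- c("a","b","a","b","b","c","a","e","c","a","a","b","d","f")
c <- c(20,30,40,40,54,76,23,23,78,23,34,1,88,1)
d <- c(1,1,1,2,2,2,3,3,4,5,5,5,5,5)
mydata <- data.frame(a, b, c, d)

set.seed(1)
choose.test <- data.frame(matrix(NA, nrow = 20, ncol = 30))
for (i in 1:20) {
  choose.test[, i] <- sample(5, 20, replace = TRUE)
}

# Collect each tagged chunk in a list instead of growing `data` directly
pieces <- list()
for (j in 1:10) {
  for (i in choose.test[, j]) {
    rows <- mydata[mydata$d == i, ]                        # rows for site i
    pieces[[length(pieces) + 1]] <- cbind(rows, i = i, j = j)
  }
}

# Bind all chunks in a single call
data <- do.call(rbind, pieces)
```

The result should have the same six columns as the loop above (a, b, c, d, i, j), so something like split(data, data$j) can then be used to analyse each bootstrap replicate separately.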