Random sampling of XY coordinates in R (or in Matlab)

Question

My data frame has the following four columns: type ("A" or "B"), xvar, longitude, and latitude. It looks like this:

      type    xvar    longitude    latitude
[1,]   A       20      -87.81        40.11
[2,]   A       12      -87.82        40.12
[3,]   A       50      -87.85        40.22
....
[21,]  B       24      -87.79        40.04
[22,]  B       30      -87.88        40.10
[23,]  B       12      -87.67        40.32
[24,]  B       66      -87.66        40.44
....

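I have 20 rows with type = "A" and 25,000 rows with type = "B". My task is to randomly assign the xvar values of the 20 "A" data points to the XY locations of type "B" rows, without replacement. For example, xvar = 20 from the first type = "A" observation could be randomly placed at [22,], i.e. (-87.88, 40.10). Because I am doing this without replacement, I could in theory perform 25,000 / 20 = 1,250 replications. I want 1,000 replications.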

I have a function (say, myfunc(xvar, longitude, latitude)) that returns a single statistic from one random sample. I first create an empty 1,000 x 1 matrix (say, myresult).

myresult <- array(0,dim=c(1000,1))

Then, for each random sample, I apply my function (myfunc) to compute the statistic.

for (i in seq_len(1000)) {
  # draw one sample that has three variables: xvar, longitude, latitude
  # apply my function to this selected sample
  # store the calculated statistic in myresult[i, ]
}

I would like to know how to do this in R (and possibly in Matlab?). Thanks!

================================================== ===========

Update: @user. Borrowing your idea, the following is what I wanted:

dd1 <- df[df$type == "B" ,] 
dd2 <- df[df$type == "A" ,]
v   <- dd2[sample(nrow(dd2), nrow(dd2)), ]
randomXvarOfA <- as.matrix(v[,c("xvar")])  
cols <- c("longitude","latitude")
B_shuffled_XY <- dd1[,cols][sample(nrow(dd1), nrow(dd2)), ]
dimnames(randomXvarOfA)=list(NULL,c("xvar"))
sampledData <- cbind(randomXvarOfA,B_shuffled_XY)
sampledData

   xvar longitude latitude
4   20    -87.79    40.04
7   12    -87.66    40.44
5   50    -87.88    40.10

Answer 1

I think the function you are looking for is the 'sample' function. It would work like this (using the loop approach):

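# Draw 1000 * 20 distinct row indices from the type "B" rows (assumed to be rows 21:25000)
drawn_Sample <- sample(21:25000, 20000, replace = FALSE)
myresult <- numeric(1000)

for (i in seq_len(1000)) {
  # Positions in drawn_Sample that belong to replicate i
  index_Values <- (1 + (i - 1) * 20):(20 + (i - 1) * 20)
  myresult[i] <- myfun(my_Data$xvar[1:20],
                       my_Data$longitude[drawn_Sample[index_Values]],
                       my_Data$latitude[drawn_Sample[index_Values]])
}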

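In this case, I randomly assign rows 1:20 (the rows with value "A") to groups of 20 randomly chosen rows from 21:25000, and then apply the function to each group.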

This feels unnecessarily complicated; if we knew more about your function ('myfun'), I think we could condense it. I am assuming it is vectorized.
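If 'myfun' only needs the three vectors for one replicate, the loop can be condensed roughly as follows. This is a minimal sketch on made-up data: the toy my_Data, the placeholder myfun (an xvar-weighted mean latitude that ignores longitude), and the names n_A, n_B, n_rep, and draws are all assumptions for illustration, not part of the original post.

```r
set.seed(1)  # only so this sketch is reproducible

# Toy data standing in for the real data frame (hypothetical values):
# the first 20 rows are type "A", the remaining 25,000 rows are type "B".
n_A <- 20
n_B <- 25000
my_Data <- data.frame(
  type      = rep(c("A", "B"), times = c(n_A, n_B)),
  xvar      = sample(1:100, n_A + n_B, replace = TRUE),
  longitude = runif(n_A + n_B, -88, -87),
  latitude  = runif(n_A + n_B, 40, 41)
)

# Placeholder statistic (an assumption): xvar-weighted mean latitude.
myfun <- function(xvar, longitude, latitude) {
  weighted.mean(latitude, xvar)
}

n_rep  <- 1000
A_xvar <- my_Data$xvar[my_Data$type == "A"]
B_idx  <- which(my_Data$type == "B")

# One column per replicate. All indices are drawn in a single call without
# replacement, so no B row is used in more than one replicate.
draws <- matrix(sample(B_idx, n_rep * n_A, replace = FALSE), nrow = n_A)

# Apply the statistic to each column of indices; vapply returns a numeric vector.
myresult <- vapply(
  seq_len(n_rep),
  function(i) myfun(A_xvar,
                    my_Data$longitude[draws[, i]],
                    my_Data$latitude[draws[, i]]),
  numeric(1)
)
```

Storing the indices as a matrix replaces the manual index arithmetic, and selecting B rows by type rather than by position means the rows do not have to be sorted.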

Update: At the OP's request, I am adding how to modify this answer for data frames that are not so conveniently sorted.

repetitions <- 1000 # Change this as necessary

A_data <- my_Data[my_Data$type=="A",]
B_data <- my_Data[my_Data$type=="B",]

A_rows <- nrow(A_data)
B_rows <- nrow(B_data)

drawn_Sample <- sample(1:B_rows, repetitions * A_rows, replace = FALSE)
myresult <- numeric(repetitions)

for (i in seq_len(repetitions)) {
  # Positions in drawn_Sample that belong to replicate i
  index_Values <- (1 + (i - 1) * A_rows):(A_rows + (i - 1) * A_rows)
  myresult[i] <- myfun(A_data$xvar,
                       B_data$longitude[drawn_Sample[index_Values]],
                       B_data$latitude[drawn_Sample[index_Values]])
}
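One thing to watch: 'sample' without replacement stops with an error when asked for more indices than there are B rows, so repetitions * A_rows must not exceed B_rows. A small guard can make that limit explicit. The helper below, draw_disjoint_indices, is a hypothetical name introduced only for this sketch.

```r
# Hypothetical helper: returns an A_rows x repetitions matrix of B row indices,
# one column per replicate, with no index appearing more than once.
draw_disjoint_indices <- function(B_rows, A_rows, repetitions) {
  max_reps <- B_rows %/% A_rows  # e.g. 25000 %/% 20 = 1250
  if (repetitions > max_reps) {
    stop("at most ", max_reps, " disjoint replicates are possible")
  }
  matrix(sample.int(B_rows, repetitions * A_rows, replace = FALSE),
         nrow = A_rows)
}

idx <- draw_disjoint_indices(B_rows = 25000, A_rows = 20, repetitions = 1000)
dim(idx)                       # 20 rows, 1000 columns
anyDuplicated(as.vector(idx))  # 0 means no B row was reused
```

Column i of idx can then be used in place of drawn_Sample[index_Values] inside the loop.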

Answer 2

Read in your data:

  df<- read.table( text="
      type    xvar    longitude    latitude
      A       20      -87.81        40.11
      A       12      -87.82        40.12
      A       50      -87.85        40.22
      B       24      -87.79        40.04
      B       30      -87.88        40.10
       B       12      -87.67        40.32
      B       66      -87.66        40.44", header = TRUE)

I first wrote this without splitting, and it looked messy. So I decided to just split your data.frame.

    dd1 <- df[df$type == "B" ,]  # get all rows of just type B
    dd2 <- df[df$type == "A" ,]  # get all rows of just type A

    v   <- dd2[sample(nrow(dd2), 2), ] #sample two rows at random that are type A
    # if you want to sample 20 rows change the 2 to a 20

    cols <- c("longitude", "latitude")
    dd1[,cols][sample(nrow(dd1), 2), ] <- v[,cols] 
    #Add the random long/lat selected from type As into 2 random long/lat of B


# put the As and Bs back together
rbind(dd2,dd1)
#  type xvar longitude latitude
# 1    A   20    -87.81    40.11
# 2    A   12    -87.82    40.12
# 3    A   50    -87.85    40.22
# 4    B   24    -87.79    40.04
# 5    B   30    -87.85    40.22
# 6    B   12    -87.81    40.11
# 7    B   66    -87.66    40.44

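As you can see, rows 5 and 6 of B now have lat and long values randomly selected from type A. I did not change the xvar values; I don't know whether you want that. If you do want to change the xvars, you can change cols to cols <- c("xvar", "longitude", "latitude").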

Inside a function, it would look like:

changestuff <- function(x) {

        dd1 <- x[x$type == "B" ,]  # get just B
        dd2 <- x[x$type == "A" ,]  # get just A
        v   <- dd2[sample(nrow(dd2), 2), ]
        cols <- c("longitude", "latitude")
        dd1[,cols][sample(nrow(dd1), 2), ] <- v[,cols] 
        rbind(dd2,dd1)
}

changestuff(df)
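To get several independent reshuffles rather than one, the function could be called repeatedly with 'replicate'. The sketch below is only an illustration: it redefines the sample data and the function so it runs on its own, and the extra argument n (how many rows to swap) is an addition of this sketch, not part of the function above.

```r
df <- read.table(text = "
type xvar longitude latitude
A 20 -87.81 40.11
A 12 -87.82 40.12
A 50 -87.85 40.22
B 24 -87.79 40.04
B 30 -87.88 40.10
B 12 -87.67 40.32
B 66 -87.66 40.44", header = TRUE)

# Same idea as changestuff above; n is a hypothetical extra argument.
changestuff <- function(x, n = 2) {
  dd1 <- x[x$type == "B", ]            # type B rows
  dd2 <- x[x$type == "A", ]            # type A rows
  v   <- dd2[sample(nrow(dd2), n), ]   # n random type A rows
  cols <- c("longitude", "latitude")
  # Overwrite the coordinates of n random type B rows with those of v
  dd1[sample(nrow(dd1), n), cols] <- v[, cols]
  rbind(dd2, dd1)
}

set.seed(42)  # only so this sketch is reproducible
# simplify = FALSE keeps each reshuffled data frame as one list element
reshuffled <- replicate(5, changestuff(df), simplify = FALSE)
length(reshuffled)
```

Each element of reshuffled is a full data frame, so a statistic can be computed per replicate with sapply or vapply.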

Answer 1: 1, 2013-01-30 18:10:51
Answer 2: 1, accepted, 2013-01-30 18:32:49
