Row-wise operations inside a for loop using dplyr

Question

I have some transport data on which I want to perform a row-wise if comparison inside a for loop. The data looks something like this.

# Using the iris dataset 
> iris <- as.data.frame(iris)
> head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

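The result should record, within each species, the instances of sepal lengths whose petal widths are equal. In other words, we record pairs of sepal lengths that share the same petal width (this is only an illustration with no scientific meaning). It would produce something like this: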

Species Petal.Width Sepal.Length1 Sepal.Length2
setosa          0.2         5.1             4.9
setosa          0.2         5.1             4.7
setosa          0.2         4.9             4.7
setosa          0.2         5.1             4.6
...

My initial Python-ish idea was to nest for loops, something like this:

for s in unique(Species):
  for i in 1:nrow(iris):
    for j in 1:nrow(iris):
      if iris$Petal.Width[i,] == iris$Petal.Width[j,]:
        Output$Species = iris$Species[i,]
        Output$Petal.Width = iris$Petal.Width[i,]
        Output$Sepal.Length1= iris$Sepal.Length[i,]
        Output$Sepal.Length2= iris$Sepal.Length[j,]
    end
  end
end

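I considered using group_by to split by Species first, which would take care of the outer loop, for s in unique(Species):. But I don't know how to compare every observation in the dataset row by row and store the result as in the second code block. I have seen questions about for loops in dplyr and about row-wise operations. Apologies if the code above is not very clear; this is my first time asking a question here.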
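For reference, the nested-loop idea can be written in valid R roughly as follows. This is only a sketch of the pseudocode above, not a recommended solution: iris$Petal.Width is a plain vector, so it is indexed with [i] rather than [i,], and the names rows and output are made up for illustration. The inner loop starts at i + 1 so that each pair is recorded once and no row is paired with itself.

```r
# Collect matching pairs in a list and bind them once at the end
# (growing a data frame inside the loop would be much slower).
rows <- list()
n <- nrow(iris)

for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    same_species <- iris$Species[i] == iris$Species[j]
    same_width <- iris$Petal.Width[i] == iris$Petal.Width[j]
    if (same_species && same_width) {
      rows[[length(rows) + 1]] <- data.frame(
        Species = iris$Species[i],
        Petal.Width = iris$Petal.Width[i],
        Sepal.Length1 = iris$Sepal.Length[i],
        Sepal.Length2 = iris$Sepal.Length[j]
      )
    }
  }
}

# One row per unordered pair of flowers with the same Species and Petal.Width
output <- do.call(rbind, rows)
head(output)
```

The loop visits every pair of rows, so it scales quadratically with the number of rows; that is fine for 150 rows but is the main reason a grouped or join-based approach is usually preferred.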

Answer 1

Using dplyr:

library(dplyr)    

iris %>%
      group_by(Species,Petal.Width) %>%
      mutate(n = n()) %>%
      filter(n > 1) %>%
      mutate(Sepal.Length1 = Sepal.Length,
             Sepal.Length2 = Sepal.Length1 - Petal.Width) %>%
      arrange(Petal.Width) %>%
      select(Species, Petal.Width, Sepal.Length1, Sepal.Length2)

This groups by Species and Petal.Width, counts the rows in each group, keeps only the groups with more than one row, then exposes Sepal.Length as Sepal.Length1 and creates a new variable Sepal.Length2 = Sepal.Length1 - Petal.Width.
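If what is wanted is literally the pair table shown in the question (two sepal lengths side by side for flowers that share a Species and Petal.Width), one possible approach is a self-join. The sketch below is an alternative to the pipeline above, not the answer's own method; it assumes every unordered pair should appear exactly once, and the names flowers, pairs and id are hypothetical.

```r
library(dplyr)

# Tag each row with an id so that the pair (i, j) can be told apart from (j, i).
flowers <- iris %>%
  mutate(id = row_number()) %>%
  select(id, Species, Petal.Width, Sepal.Length)

# Join the table to itself on the grouping keys. Base merge() is used for the
# join; the suffixes turn the duplicated columns into id1/id2 and
# Sepal.Length1/Sepal.Length2.
pairs <- merge(flowers, flowers,
               by = c("Species", "Petal.Width"),
               suffixes = c("1", "2")) %>%
  filter(id1 < id2) %>%   # drop self-pairs and mirrored duplicates
  arrange(Species, Petal.Width, id1, id2) %>%
  select(Species, Petal.Width, Sepal.Length1, Sepal.Length2)

head(pairs)
```

A group of n matching rows contributes n * (n - 1) / 2 rows to pairs, so the result can be considerably longer than the input.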

To record the Sepal.Length of each Species within defined Petal.Width ranges:

# Petal.Width is a column of iris, so it has to be referenced through the data frame
minpw <- min(iris$Petal.Width)
maxpw <- max(iris$Petal.Width)

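iris %>%
  # include.lowest = TRUE keeps rows equal to minpw in the first bin instead of NA
  group_by(Sepal.Length, Species,
           petal_width_range = cut(Petal.Width, breaks = seq(minpw, maxpw, by = 0.2), include.lowest = TRUE)) %>%
  summarise(count = n())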
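How cut() treats interval boundaries matters for this kind of binning. By default the intervals are open on the left and closed on the right, so a value equal to the lowest break falls into no bin and becomes NA. The toy values below are made up purely to illustrate this; they are not taken from iris.

```r
# Hypothetical breaks and values, chosen so that one value sits on the lowest break
breaks <- c(0, 1, 2, 3)
x <- c(0, 0.2, 1, 1.5, 2.5)

# Default: intervals are (0,1], (1,2], (2,3]; the value 0 is not covered
default_bins <- cut(x, breaks = breaks)

# include.lowest = TRUE closes the first interval on the left: [0,1]
closed_bins <- cut(x, breaks = breaks, include.lowest = TRUE)

as.character(default_bins)
as.character(closed_bins)
```

When the breaks start at the minimum of the data, as with seq(minpw, maxpw, by = 0.2), the smallest observations are exactly the ones affected.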
