
How to reorder dataframe rows based on conditions in more than 1 column in R?

The Problem

I am trying to reorder rows based on the conditions in 2 other columns. Specifically, I have a sequential ID for hundreds of randomly generated sampling transects called "ID_First", and for each transect there is a corresponding "ID_Next" that represents the next transect that should be sampled. I am trying to reorder the rows so that the sampling transects are in order of execution rather than the original order based on "ID_First".

I know that data frames can be arranged based on one or more columns, for numerical variables in either an ascending or descending way and, for factors, in an "ordered" way (e.g., high, medium, low). Is it possible to arrange the order of the rows based on the sequence of ID_First and then ID_Next? I have not been able to figure out how to do this, so I have been doing it manually.

Simplified Reproducible Example

Data

# sequential ID for a small number of randomly generated transects 
ID_First <- seq(1,10,1)

# represents the next transect that should be sampled following ID_First
ID_Next <- c(4,5,8,7,10,2,9,6,3,NA)


# make a dataframe
df <- cbind.data.frame(ID_First, ID_Next)

# look at the df
df

>    ID_First ID_Next
> 1         1       4
> 2         2       5
> 3         3       8
> 4         4       7
> 5         5      10
> 6         6       2
> 7         7       9
> 8         8       6
> 9         9       3
> 10       10      NA

So, if you start with ID_First equal to 1 and look at the corresponding ID_Next, this indicates that the next transect to sample is 4. Then you go to ID_First equal to 4, where the corresponding ID_Next is 7, and so on. For this example, the order of sampling would progress as follows: 1, 4, 7, 9, 3, 8, 6, 2, 5, 10.

Ideal Outcome

Here is what I am trying to accomplish:

>    ID_First ID_Next
> 1         1       4
> 4         4       7
> 7         7       9
> 9         9       3
> 3         3       8
> 8         8       6
> 6         6       2
> 2         2       5
> 5         5      10
> 10       10      NA

Now the transects follow the order needed for sampling (e.g., 1 to 4, 4 to 7, 7 to 9, 9 to 3, etc. through 10) rather than the ascending ID_First.
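The property described above (each row's ID_Next equals the ID_First of the row that follows it) can be checked mechanically. The following is a minimal sketch in base R; the helper name `is_chain` is hypothetical and not part of the original post, and the row order passed to it is the sampling order stated in the question.

```r
df <- data.frame(ID_First = 1:10, ID_Next = c(4, 5, 8, 7, 10, 2, 9, 6, 3, NA))

# Rows arranged in the sampling order stated in the question
ordered <- df[c(1, 4, 7, 9, 3, 8, 6, 2, 5, 10), ]

# Hypothetical helper: TRUE when every row's ID_Next matches the next row's
# ID_First and the final row has no successor (ID_Next is NA)
is_chain <- function(d) {
  n <- nrow(d)
  links_ok <- all(d$ID_Next[-n] == d$ID_First[-1])
  isTRUE(links_ok) && is.na(d$ID_Next[n])
}

is_chain(ordered)  # the reordered frame
is_chain(df)       # the original ascending frame
```

A check like this may be useful for confirming any reordering approach before using the result in the field.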

Question

Is there an easy way to reorder the original data frame using ID_First equal to 1 as the starting point and then following the progression from ID_Next to ID_First to ID_Next to arrange the remainder of the transects?

You can accomplish this for your specific example using a while loop and the match() function in R. I also used list.append() from the rlist package.

library(rlist)

# sequential ID for a small number of randomly generated transects 
ID_First <- seq(1,10,1)

# represents the next transect that should be sampled following ID_First
ID_Next <- c(4,5,8,7,10,2,9,6,3,NA)


# make a dataframe
df <- cbind.data.frame(ID_First, ID_Next)

#create while loop to define target order
i = 1
order = c(i)

n = 1
while (n < length(df$ID_Next)){
  j = df[df$ID_First == i, 2]
  order = list.append(order, j)
  i = j
  n = n+1
}

#match df order to target order
df2 = df[match(order, df$ID_First),]
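If you would rather avoid the rlist dependency, the same idea can be sketched in base R. This is only one possible variant, not the original answer's method: the function name `order_by_chain` and its `start` argument are made up for illustration. It stops when ID_Next is NA or has no matching ID_First, and also when a row would be visited twice, so a cyclic chain does not loop forever.

```r
# Hypothetical helper: follow ID_Next pointers starting from ID_First == start
order_by_chain <- function(df, start = 1) {
  visit <- integer(0)                 # row indices in sampling order
  row <- match(start, df$ID_First)    # row holding the starting transect
  while (!is.na(row) && !(row %in% visit)) {
    visit <- c(visit, row)
    # Move to the row whose ID_First equals the current ID_Next;
    # match() returns NA when ID_Next is NA or absent from ID_First
    row <- match(df$ID_Next[row], df$ID_First)
  }
  df[visit, ]
}

df <- data.frame(ID_First = 1:10, ID_Next = c(4, 5, 8, 7, 10, 2, 9, 6, 3, NA))
df2 <- order_by_chain(df)
df2
```

Note that any rows not reachable from the starting transect are left out of the result, which may or may not be what you want for your real data.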

You can use Reduce with match to follow the chain from ID_First to ID_Next.

df[Reduce(function(i,j) match(df$ID_Next[i], df$ID_First)
 , seq_len(nrow(df)), accumulate = TRUE),]
#   ID_First ID_Next
#1         1       4
#4         4       7
#7         7       9
#9         9       3
#3         3       8
#8         8       6
#6         6       2
#2         2       5
#5         5      10
#10       10      NA
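To make the one-liner easier to follow, here is the same call unpacked into steps. The names `next_row` and `idx` are introduced only for illustration. The second argument of the function is ignored; `seq_len(nrow(df))` serves to supply the starting row (its first element, 1) and to fix the number of steps.

```r
df <- data.frame(ID_First = 1:10, ID_Next = c(4, 5, 8, 7, 10, 2, 9, 6, 3, NA))

# One step along the chain: from row index i to the row whose ID_First
# equals ID_Next[i]. The second argument is required by Reduce but unused.
next_row <- function(i, unused) match(df$ID_Next[i], df$ID_First)

# accumulate = TRUE keeps every intermediate row index, beginning with row 1
idx <- Reduce(next_row, seq_len(nrow(df)), accumulate = TRUE)
idx

# Use the visited row indices to reorder the data frame
df[idx, ]
```

This approach assumes the chain begins at row 1 and passes through every row. If the chain ended early, the remaining entries of `idx` would be NA and the result would contain NA rows.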

Data:

df <- data.frame(ID_First = 1:10, ID_Next = c(4,5,8,7,10,2,9,6,3,NA))
df
#   ID_First ID_Next
#1         1       4
#2         2       5
#3         3       8
#4         4       7
#5         5      10
#6         6       2
#7         7       9
#8         8       6
#9         9       3
#10       10      NA

