Apply function iteratively across a dataframe

I have a two-part question about applying a function across a dataset in R.

(i) First, I have two data frames that I would like to combine so that their columns are interleaved: something like a cbind that lines up the first columns of each data frame next to each other, then the second columns, and so on. In the example below, I would like an output combining df1 and df2 with the column order eg1, eg4, eg2, eg5, eg3, eg6.

eg1 <- as.data.frame(matrix(sample(0:1000, 36*10, replace=TRUE), ncol=1))
eg2 <- as.data.frame(matrix(sample(0:500, 36*10, replace=TRUE), ncol=1))
eg3 <- as.data.frame(matrix(sample(0:750, 36*10, replace=TRUE), ncol=1))
df1 <- cbind(eg1,eg2,eg3)
eg4 <- as.data.frame(matrix(sample(0:200, 36*10, replace=TRUE), ncol=1))
eg5 <- as.data.frame(matrix(sample(0:100, 36*10, replace=TRUE), ncol=1))
eg6 <- as.data.frame(matrix(sample(0:350, 36*10, replace=TRUE), ncol=1))
df2 <- cbind(eg4,eg5,eg6)

I know a manual way of doing this (below), but it is not practical for much larger datasets. Is there a more efficient way of achieving this?

df3 <- cbind(df1,df2)
df3 <- df3[,c(1,4,2,5,3,6)]

(ii) Following this, I would like to output seven values from each odd column, chosen by the seven highest values in the corresponding even column. As an example, for the first two columns:

df4 <- df3[,1:2]
High_7 <- tail(df4[order(df4[,2]),],7) # Rows with the 7 highest values in the even column
High_7 <- High_7[,1] # Keep the odd-column values for those rows

An approach that applies this across the whole dataset, perhaps through some form of apply function, would be much more effective.

For your first question, combining the columns of both data frames iteratively (note that this only works if the column names of both data frames are unique, which they are NOT in your OP):

df3 <- Reduce(cbind,
       Map(function(x, y) cbind(df1[x], df2[y]), names(df1), names(df2))) 
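
Since the approach above selects columns by name, a position-based variant may be handy when names are not unique. The following is only a sketch under assumptions: the toy data and the column names a1..a3 and b1..b3 are made up for illustration, and both data frames are assumed to have the same number of columns.

```r
# Toy data with hypothetical column names (not from the original post)
df1 <- data.frame(a1 = 1:3, a2 = 4:6, a3 = 7:9)
df2 <- data.frame(b1 = 11:13, b2 = 14:16, b3 = 17:19)

# Interleaving pairs columns one-to-one, so the counts must match
stopifnot(ncol(df1) == ncol(df2))
n <- ncol(df1)

# After cbind, df1 occupies positions 1..n and df2 occupies n+1..2n.
# rbind() stacks the two position vectors as rows; c() then reads the
# matrix column by column, giving 1, n+1, 2, n+2, ...
idx <- c(rbind(seq_len(n), n + seq_len(n)))

df3 <- cbind(df1, df2)[, idx]
names(df3)
```

Because the index is built from positions only, it scales to any number of columns without typing the order by hand.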

For the second part I would use this:

results <- sapply(seq(1,ncol(df3),2),
                        function(i) df3[order(df3[,i+1], decreasing = TRUE), ][1:7,i])

If you want the results as a data.frame, just do:

results <- data.frame(results)
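
The same idea can be wrapped in a small reusable function. This is a sketch rather than a definitive implementation: the helper name top_n_by_pair and the toy data are hypothetical, and the input is assumed to hold (odd, even) column pairs as in df3. It orders only the even column instead of reordering the whole data frame, which should select the same rows.

```r
# Hypothetical helper: for each (odd, even) column pair, return the
# odd-column values from the n rows with the highest even-column values.
top_n_by_pair <- function(df, n = 7) {
  stopifnot(ncol(df) %% 2 == 0, nrow(df) >= n)
  odd <- seq(1, ncol(df), by = 2)
  out <- lapply(odd, function(i) {
    # Row indices of the n largest values in the even column
    keep <- order(df[[i + 1]], decreasing = TRUE)[seq_len(n)]
    df[[i]][keep]
  })
  # Label each result column after the odd column it came from
  names(out) <- names(df)[odd]
  as.data.frame(out)
}

# Small deterministic example: the two largest y values are in rows 3 and 1
toy <- data.frame(x = c(10, 20, 30, 40), y = c(3, 1, 4, 2))
res <- top_n_by_pair(toy, n = 2)
res
```

With the question's data, the call would be top_n_by_pair(df3, n = 7).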
