
Creating a data.frame with rows from a function that returns vectors

I have data in a number of separate csv files, and I want to create a data.frame with one row for each file. The function below delivers the data to be used for each row. I don't want to change this code, e.g. to include the farmid in the output vector.

vectorfromfile <- function(farmid) {
    # Reads data from a file named farm{id}.csv, e.g.
    # farm001.csv, and returns one named vector
    # of length two with class numeric and names 'apples'
    # and 'oranges'. An example could be c(apples=4, oranges=6)

    # The line below is a dummy for test purposes
    c(apples=farmid+1000, oranges=farmid+2000)
}

I then have a vector, farmids, e.g. farmids <- c(1,3,5). I need to create a data frame with three columns (id, apples and oranges) and one row for each of the farmids. It should look like the data.frame defined below.

> data.frame(id=c(1,3,5), apples=c(4,2,3), oranges=c(6,5,2) )
  id apples oranges
1  1      4       6
2  3      2       5
3  5      3       2

I have found several ways of doing this, all of them quite ugly and taking up many lines. I would like to do it in the most elegant way, using the split-apply-combine approach. So I hope I can simply apply a function to (iterate over) a vector and get a data.frame as the result. Something like:

apply(farmids, ???? ) # farmids is  a vector

Is that possible? If not, then perhaps by iterating over a list with the same values? And if even that is not possible, what would be the most elegant way?

My ugly attempts are below:

vect2df_v1 <- function(farmids=c(1,3,5)) {
    df <- data.frame(id=farmids, apples=rep(NA, length(farmids)), oranges=rep(NA, length(farmids)))
    for (i in seq_along(farmids)) {
       df[i, c('apples', 'oranges')] <- vectorfromfile(df[i, 'id'])
    }
    df
}

vect2df_v2 <- function(farmids=c(1,3,5)) {
    # Obviously it could be written into one (even uglier) line
    farmrow <- function(farmid) { c(farmid, vectorfromfile(farmid)) }
    lst <- lapply(farmids, farmrow)
    mtrx <- matrix(unlist(lst), ncol=3, byrow=TRUE, dimnames=list(NULL,c('id', 'apples','oranges')))
    data.frame(mtrx)
}

This is simple with do.call(rbind, ...).

You can write your vect2df like this:

vect2df <- function(vec) {
  data.frame(id = vec, do.call(rbind, lapply(vec, vectorfromfile)))
} 

Demo:

vect2df(c(1, 3, 5))
#   id apples oranges
# 1  1   1001    2001
# 2  3   1003    2003
# 3  5   1005    2005
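
To see why the one-liner works, it may help to split it into its intermediate steps. The sketch below reuses the dummy vectorfromfile from the question; the names rows, mat and result are only illustrative and do not appear in the original code.

```r
# Dummy stand-in copied from the question; the real function would read farm{id}.csv
vectorfromfile <- function(farmid) {
  c(apples = farmid + 1000, oranges = farmid + 2000)
}

farmids <- c(1, 3, 5)

# Step 1: call the function once per id; the result is a list of named numeric vectors
rows <- lapply(farmids, vectorfromfile)

# Step 2: do.call(rbind, rows) behaves like rbind(rows[[1]], rows[[2]], rows[[3]]),
# stacking the vectors into a 3 x 2 numeric matrix whose column names
# come from the names of the vectors
mat <- do.call(rbind, rows)

# Step 3: data.frame() expands the matrix columns into data frame columns next to id
result <- data.frame(id = farmids, mat)
```

Because rbind builds a matrix, this pattern assumes every call returns values of the same type and length, which is what the question describes.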

Of course, this could all be done quite directly using within, provided vectorfromfile is not essential and its computation can be written inline.

Example:

within(data.frame(id = c(1, 3, 5)), {
  oranges <- id + 2000
  apples <- id + 1000
})
#   id apples oranges
# 1  1   1001    2001
# 2  3   1003    2003
# 3  5   1005    2005
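
If vectorfromfile has to stay untouched, as the question requires, a vapply-based variant is another option worth considering. This is only a sketch: vect2df_checked is a hypothetical name, and the template passed as FUN.VALUE assumes each call returns a numeric vector of length two, as the question states.

```r
# Dummy stand-in copied from the question; the real function would read farm{id}.csv
vectorfromfile <- function(farmid) {
  c(apples = farmid + 1000, oranges = farmid + 2000)
}

# Hypothetical helper, not part of the original answer
vect2df_checked <- function(farmids) {
  # vapply() stops with an error unless every call returns
  # a numeric vector of length 2, matching the FUN.VALUE template
  vals <- vapply(farmids, vectorfromfile,
                 FUN.VALUE = c(apples = 0, oranges = 0))
  # vals has one column per farm id, so transpose it to get one row per id
  data.frame(id = farmids, t(vals))
}

res <- vect2df_checked(c(1, 3, 5))
```

Compared with lapply, the main difference is that a file producing a result of unexpected length or type fails immediately instead of silently yielding a malformed matrix.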
