lapply() and spline() on two data frames in R, without merging

Question

I have two data frames (df, df5) that share a factor, "Auction_ID". df has num.bidders, res.bid, and Auction_ID; df5 has bid.points and Auction_ID.

I used the smooth.spline() function to get spline estimates, and I saved the result as a new column in df (I am not sure whether I should save it in df5 instead):

    spline  <- smooth.spline(df$c_bidders,df$res.bid)

The question is how to use the predict() function on df$spline1 and df5$bid.points for each level. I tried to use lapply and pass df and df5 as input data for the function, but it seems I can't do it that way. For example:

    lapply(df, df5, function(t, t1) {
        tt <- predict(t$spline, t1$bid.points, deriv = 0)$y
        return(tt)
    })

Would introducing a list variable help? I don't know.

If I use merge(df, df5, by = "Auction_ID"), I end up with a very large data frame:

    str(df1)
    Classes ‘tbl_df’, ‘tbl’ and 'data.frame':    3967 obs. of  17 variables:

    str(df5)
    'data.frame':    18338 obs. of  2 variables:

    x <- merge(df5, df1, by = "Auction_ID")
    str(x)
    'data.frame':   501367 obs. of  19 variables:

(I have already tried merge() with the "all" options, such as all.y = TRUE. They give the same number of observations, which is not good for my purpose.)

Answer 1

Is the issue that you don't want to deal with the large merged data frame of roughly 500k rows?

Maybe a merge (aka join) isn't what you need. Perhaps you just need the match() function to perform what is essentially a vlookup, pairing each value of df$spline1 with the corresponding value of df5$bid.points (based on Auction_ID). Note that match() returns only the first matching position, so this works as a lookup when Auction_ID is unique in the data frame being looked up.

See if this works for your purposes:

    # assuming df5 is the target df (look up each df5 row's Auction_ID in df):
    df5$spline1 <- df$spline1[match(df5$Auction_ID, df$Auction_ID)]

    ## OR

    # assuming df is the target df (look up each df row's Auction_ID in df5):
    df$bid.points <- df5$bid.points[match(df$Auction_ID, df5$Auction_ID)]
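
To see how such a lookup behaves, here is a minimal sketch on made-up data. The frames `auctions` and `pts` and all of their values are hypothetical, not the asker's data. It suggests that match() keeps the row count of the target frame (unlike a many-to-many merge) and yields NA for an ID that has no counterpart.

```r
# Toy data; every name and value here is invented for illustration.
auctions <- data.frame(Auction_ID = c("a1", "a2", "a3"),
                       res.bid    = c(10, 20, 30))
pts      <- data.frame(Auction_ID = c("a2", "a2", "a1", "a3", "a9"),
                       bid.points = c(1.5, 2.5, 0.7, 3.1, 4.0))

# For each row of `pts`, find the position of the same ID in `auctions`.
idx <- match(pts$Auction_ID, auctions$Auction_ID)

# Use those positions to pull the matching res.bid into `pts`.
pts$res.bid <- auctions$res.bid[idx]

nrow(pts)    # the target frame keeps its 5 rows
pts$res.bid  # "a9" has no counterpart, so its value is NA
```

If IDs repeat in the frame being looked up, only the first occurrence is used, so this fits best when that frame has one row per Auction_ID.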

Answer 2

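Consider using Map() to pass both inputs; it returns a list of the values returned from predict(). Map() iterates over its arguments in parallel, one element at a time, so df and df5 below should be lists with one element per Auction_ID (for example, from split()) rather than the raw data frames, which would be iterated column by column:

List return

    Map(function(t, t1) predict(t$spline, t1$bid.points, deriv = 0)$y, df, df5)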
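
One possible way to get per-auction pairs, sketched here on made-up data, is to split() both data frames by Auction_ID and fit a separate spline inside the function. This assumes the goal is one smooth.spline() fit per auction, which the question does not state outright; the toy values and the names `df_by_auction`, `df5_by_auction`, `ids`, and `preds` are illustrative only.

```r
set.seed(1)  # toy data only; all values below are invented

# Stand-in for df: several (c_bidders, res.bid) observations per auction.
df <- data.frame(
  Auction_ID = rep(c("a1", "a2"), each = 8),
  c_bidders  = rep(1:8, times = 2),
  res.bid    = c(10 + 2 * (1:8), 50 - 3 * (1:8)) + rnorm(16, sd = 0.1)
)

# Stand-in for df5: a different number of bid points per auction.
df5 <- data.frame(
  Auction_ID = c("a1", "a1", "a1", "a2", "a2"),
  bid.points = c(1.5, 4.0, 6.5, 2.0, 7.0)
)

# One list element per auction, named by Auction_ID.
df_by_auction  <- split(df,  df$Auction_ID)
df5_by_auction <- split(df5, df5$Auction_ID)

# Keep only auctions present in both lists so the pairs line up.
ids <- intersect(names(df_by_auction), names(df5_by_auction))

# Fit one spline per auction, then predict at that auction's bid points.
preds <- Map(function(t, t1) {
  fit <- smooth.spline(t$c_bidders, t$res.bid)
  predict(fit, t1$bid.points, deriv = 0)$y
}, df_by_auction[ids], df5_by_auction[ids])

str(preds)  # a named list with one numeric vector per auction
```

Because nothing is merged, the result has one prediction per row of df5 in total, grouped by auction.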

A related form passes the second data frame as an extra argument after the function in lapply(). Note that this is not pairwise: lapply() iterates over the first argument only and hands the whole of df5 to every call:

    lapply(df, function(t, t1) {
        predict(t$spline, t1$bid.points, deriv = 0)$y
    }, df5)
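
The difference between the two calling styles can be seen in a small sketch with made-up lists (`xs`, `ys`, and the multiplying function are hypothetical stand-ins for the per-auction data and the predict() call):

```r
xs <- list(a = 1:2, b = 3:4)
ys <- list(a = 10,  b = 100)

# Map() pairs elements: f(xs$a, ys$a), then f(xs$b, ys$b).
paired <- Map(function(x, y) x * y, xs, ys)

# lapply() passes its trailing argument unchanged to every call:
# f(xs$a, 10), then f(xs$b, 10).
fixed <- lapply(xs, function(x, y) x * y, 10)

paired$b  # uses ys$b
fixed$b   # uses the same 10 as every other element
```

So the lapply() form fits when every group should be evaluated against the same second input, and Map() fits when each group has its own.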

Matrix return

Alternatively, use sapply(), which returns a matrix when every call returns a vector of the same length (otherwise it falls back to a list):

    sapply(df, function(t, t1) {
        predict(t$spline, t1$bid.points, deriv = 0)$y
    }, df5)

Or use mapply(), the function that Map() wraps (Map() is mapply() with SIMPLIFY = FALSE), which simplifies the result when it can:

    mapply(function(t, t1) predict(t$spline, t1$bid.points, deriv = 0)$y, df, df5)
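
As a rough illustration of when simplification to a matrix happens, the sketch below uses made-up inputs (the lists and the doubling and adding functions are hypothetical, standing in for per-auction predictions):

```r
# Equal-length results: sapply() binds them into a matrix,
# one column per input element.
same_len <- sapply(list(a = 1:3, b = 4:6), function(x) x * 2)

# Unequal lengths: nothing can be bound together, so a list comes back.
diff_len <- sapply(list(a = 1:3, b = 4:5), function(x) x * 2)

# mapply() simplifies in the same way by default ...
m1 <- mapply(function(x, y) x + y, list(a = 1:3, b = 4:6), list(10, 20))

# ... and SIMPLIFY = FALSE keeps the list, which is what Map() does.
m2 <- mapply(function(x, y) x + y, list(a = 1:3, b = 4:6), list(10, 20),
             SIMPLIFY = FALSE)

dim(same_len)      # rows = result length, columns = input elements
is.list(diff_len)  # TRUE
```

With a different number of bid points per auction, the list-returning forms are likely the safer choice.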


Answer 1: 0, accepted, 2016-12-26 01:51:00
Answer 2: 0, 2016-12-26 16:30:36
