
ddply + summarise function column name input

I am trying to use ddply and summarise together from the plyr package, but I am having difficulty passing in column names that keep changing. In my example, I would like to pass in X1 programmatically rather than hard-coding X1 into the ddply call.

Setting up an example:

require(xts)
require(plyr)
require(reshape2)
require(lubridate)
t <- xts(matrix(rnorm(10000),ncol=10), Sys.Date()-1000:1)
t.df <- data.frame(coredata(t))
t.df <- cbind(day=wday(index(t), label=TRUE, abbr=TRUE), t.df)
t.df.l <- melt(t.df, id.vars=c("day",colnames(t.df)[2]), measure.vars=colnames(t.df)[3:ncol(t.df)])

This is the bit I am struggling with:

cor.vars <- ddply(t.df.l, c("day","variable"), summarise, cor(X1, value))

I do not want to use the term X1 and would like to use something like:

cor.vars <- ddply(t.df.l, c("day","variable"), summarise, cor(colnames(t.df)[2], value))

but that fails with the error: Error in cor(colnames(t.df)[2], value) : 'x' must be numeric

I also tried various other combinations that pass the vector of values as the x argument of cor, but none of them seem to work.

Any ideas?

Although this is probably not the intended usage of summarise, and there are likely better approaches to your problem, the direct answer to your question is to use get:

ddply(t.df.l, c("day","variable"), summarise, cor(get(colnames(t.df)[2]), value))
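To see why get helps: colnames(t.df)[2] is only the character string "X1", which cor rejects, whereas get looks up the object with that name among the columns that are in scope. The sketch below uses base R's with() as a stand-in for the way summarise exposes columns; the toy data frame df and the variable col_name are made up for illustration and are not part of the original example.

```r
# Toy data (hypothetical), standing in for one piece of t.df.l
df <- data.frame(X1 = c(1, 2, 3, 4), value = c(2, 4, 6, 9))

# Hypothetical variable holding the column name
col_name <- "X1"

# Passing the name itself fails, because it is a string, not a numeric vector
string_attempt <- try(cor(col_name, df$value), silent = TRUE)

# Hard-coded column, for reference
r_direct <- cor(df$X1, df$value)

# get() resolves the string to the column while the columns are in scope
r_get <- with(df, cor(get(col_name), value))

# Indexing the data frame by name gives the same vector without get()
r_index <- cor(df[[col_name]], df$value)
```

All three of r_direct, r_get and r_index should hold the same correlation, while string_attempt holds the "'x' must be numeric" error.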

Edit: here is, for example, one approach that is in my opinion better suited to your problem:

ddply(t.df.l, c("day", "variable"), function(x) cor(x["X1"], x["value"]))

Above, "X1" can also be replaced by 2, or by the name of a variable holding "X1", etc. It depends on how you want to programmatically access the column.
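As a minimal sketch of the "variable holding the name" variant, the function-based approach can be wrapped in a small helper. The helper cor_by_group, its arguments, and the toy data are hypothetical names introduced here for illustration. It uses [[ ]] rather than [ ] so that cor receives plain numeric vectors, and it returns a one-column data frame so that the result column gets an explicit name.

```r
library(plyr)

# Toy long-format data (hypothetical), shaped like t.df.l
set.seed(1)
toy <- data.frame(
  day      = rep(c("Mon", "Tue"), each = 10),
  variable = rep(rep(c("X2", "X3"), each = 5), times = 2),
  X1       = rnorm(20),
  value    = rnorm(20)
)

# Hypothetical helper: correlate the column named by x_col with y_col
# within each day/variable group
cor_by_group <- function(data, x_col, y_col = "value") {
  ddply(data, c("day", "variable"), function(piece) {
    data.frame(correlation = cor(piece[[x_col]], piece[[y_col]]))
  })
}

# The column name is now ordinary data, e.g. colnames(t.df)[2] in the question
x_col <- "X1"
res <- cor_by_group(toy, x_col)
```

Because the column name is passed as a normal function argument, this form also works when the call sits inside another function, where relying on summarise to find outside variables can be less predictable.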
