Applying function with two inputs to groups of data in dataframe

I have written a function, pearson, that takes two vectors (vlow and vhigh), and I would like to apply it to each group of data (grouped by dist) in my data frame ldf. I tried the following code:

ldf %>% group_by(dist) %>% summarize(pearson(vlow,vhigh))

This is the output:

pearson(vlow, vhigh)
    1            0.5686079

With 5 groups I should be getting 5 results, but for some reason it's not identifying the groups and returns a single value instead. The structure of the data frame is shown below. Any suggestions on how I could fix this?

'data.frame':   157 obs. of  5 variables:
 $ dlow : num  24 33 45 123 30 33 126 84 87 81 ...
 $ dhigh: num  27 36 48 126 33 36 129 87 90 84 ...
 $ vlow : num  251 249 251 254 251 ...
 $ vhigh: num  248 250 251 254 250 ...
 $ dist : chr  "3" "3" "3" "3" ...

Best, Thomas

I found the answer to my own question. If anybody is curious, this is the code that makes it work:

# Split data by lag
sp <- split(ldf, ldf$dist)

# Calculate Pearson Correlation (p)
p <- lapply(names(sp), function(x) pearson(sp[[x]][["vlow"]],sp[[x]][["vhigh"]]))
p <- unlist(p)
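
For anyone who wants the results labelled by group, here is a minimal sketch of the same split-and-apply idea that keeps the dist values as names. The toy ldf and the pearson() definition below are assumptions made up for illustration, since neither the real data nor the original function appears in the post. The dplyr call is included because a single ungrouped row from summarize is often a sign that plyr was attached after dplyr, so that plyr's summarize masks dplyr's; whether that happened here cannot be confirmed from the post, but calling the function through its namespace rules it out.

```r
# Toy stand-in for ldf; the values are invented for illustration only.
set.seed(1)
ldf <- data.frame(
  vlow  = rnorm(30, mean = 250),
  vhigh = rnorm(30, mean = 250),
  dist  = rep(c("3", "6", "9"), each = 10),
  stringsAsFactors = FALSE
)

# Hypothetical pearson(); the original definition is not shown in the post.
pearson <- function(x, y) cor(x, y, method = "pearson")

# Base R: split by group, apply the two-argument function to each piece.
# sapply() keeps the list names, so each result is labelled with its dist value.
sp <- split(ldf, ldf$dist)
p_base <- sapply(sp, function(g) pearson(g$vlow, g$vhigh))
print(p_base)

# dplyr, only if it is installed: call the verbs through the dplyr namespace
# so a same-named function from another attached package cannot mask them.
if (requireNamespace("dplyr", quietly = TRUE)) {
  p_dplyr <- dplyr::summarise(
    dplyr::group_by(ldf, dist),
    p = pearson(vlow, vhigh)
  )
  print(p_dplyr)
}
```

Both versions should return one value per group. If the namespaced dplyr call gives one row per group while the plain summarize call does not, a masked summarize is the likely culprit.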

