Here is the sample of the data I'm using in the analysis. What I need to do is extract the top 3 values in each row, together with their column names. For example, this would be the output for the first 3 rows:
id, group1, weight1, group2, weight2, group3, weight3
1, V4, 0.277991043, V10, 0.050863724, V2, 0.033589251
2, V5, 0.164107486, V4, 0.119961612, V3, 0.098208573
3, V3, 0.124760077, V5, 0.089891235, V2, 0.071337172
What would be the easiest way to do so?
Here's another idea that would keep the data in a tidy format:
library(dplyr)
library(tidyr)
sample %>%
  gather(key, value, -node) %>%
  group_by(node) %>%
  # name the ranking column explicitly instead of relying on the last column
  top_n(3, value) %>%
  # here we use arrange() to sort by node and value
  arrange(node, desc(value))
Which gives:
#Source: local data frame [75 x 3]
#Groups: node [25]
#
# node key value
# <int> <chr> <dbl>
#1 1 V4 0.27799104
#2 1 V10 0.05086372
#3 1 V2 0.03358925
#4 2 V5 0.16410749
#5 2 V4 0.11996161
#6 2 V3 0.09820857
#7 3 V3 0.12476008
#8 3 V5 0.08989123
#9 3 V2 0.07133717
#10 4 V6 0.20665387
#.. ... ... ...
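If you really want the wide layout from your question, you could do the following (note that in this result the group* columns hold the values and the weight* columns hold the column names):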
sample %>%
  gather(key, value, -node) %>%
  group_by(node) %>%
  top_n(3, value) %>%
  arrange(node, desc(value)) %>%
  # label each of the top 3 rows per node with its rank
  mutate(group = paste0("group", row_number()),
         weight = paste0("weight", row_number())) %>%
  spread(group, value) %>%
  spread(weight, key) %>%
  # collapse the rows of each node into one, dropping the NAs left by spread()
  summarise_each(funs(max(., na.rm = TRUE)))
Which gives:
#Source: local data frame [25 x 7]
#
# node group1 group2 group3 weight1 weight2 weight3
# <int> <dbl> <dbl> <dbl> <chr> <chr> <chr>
#1 1 0.2779910 0.05086372 0.033589251 V4 V10 V2
#2 2 0.1641075 0.11996161 0.098208573 V5 V4 V3
#3 3 0.1247601 0.08989123 0.071337172 V3 V5 V2
#4 4 0.2066539 0.14747281 0.121561100 V6 V2 V10
#5 5 0.2773512 0.21849008 0.158989123 V1 V8 V3
#6 6 0.1509917 0.11964171 0.117722329 V9 V3 V10
#7 7 0.2415227 0.13595649 0.130838132 V9 V7 V8
#8 8 0.1090851 0.10588612 0.088611644 V9 V7 V5
#9 9 0.1868202 0.11548305 0.089571337 V10 V1 V6
#10 10 0.3429303 0.12955854 0.003838772 V5 V6 V11
#.. ... ... ... ... ... ... ...
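On current versions of dplyr and tidyr, gather(), spread() and top_n() are superseded, and summarise_each() and funs() are deprecated. A rough equivalent of the pipeline above might look like the sketch below. It assumes dplyr >= 1.0.0 and tidyr >= 1.0.0, and it runs on a small made-up data frame (toy is hypothetical, not the question's data). Unlike the output above, it puts the column names in group* and the values in weight*, as the question asked.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for the question's data frame
toy <- data.frame(
  node = 1:2,
  V1 = c(0.10, 0.50),
  V2 = c(0.40, 0.20),
  V3 = c(0.30, 0.05),
  V4 = c(0.20, 0.25)
)

wide <- toy %>%
  # long format: one row per (node, column name, value)
  pivot_longer(-node, names_to = "group", values_to = "weight") %>%
  group_by(node) %>%
  # keep exactly 3 rows per node, even if values are tied
  slice_max(weight, n = 3, with_ties = FALSE) %>%
  arrange(desc(weight), .by_group = TRUE) %>%
  # rank within each node: 1 = largest value
  mutate(rank = row_number()) %>%
  ungroup() %>%
  # back to one row per node: group1..group3, weight1..weight3
  pivot_wider(names_from = rank, values_from = c(group, weight), names_sep = "")

print(wide)
```

The columns come out as node, group1, group2, group3, weight1, weight2, weight3. If the interleaved order from the question matters, they can be reordered afterwards with select().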
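We can use apply:
res <- cbind(df1[1], t(apply(df1[-1], 1, function(x) {
  # positions of the row's values, largest first
  i1 <- order(-x)
  # interleave the top 3 column names with their values: name1, value1, name2, ...
  c(rbind(names(df1)[-1][i1][1:3], x[i1][1:3]))
})))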
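Then we can do the type conversion (rbind() above turned the values into strings) and name the columns:
# as.is = TRUE keeps text columns as character, so no factor check is needed
res[] <- lapply(res, function(x) type.convert(as.character(x), as.is = TRUE))
# make.unique() yields: group, weight, group.1, weight.1, group.2, weight.2
names(res)[-1] <- make.unique(rep(c("group", "weight"), (ncol(res)-1)/2))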
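To check the apply() approach end to end, here is a self-contained base R sketch on a small made-up data frame (toy, top_k, value_cols and top_pairs are hypothetical names, not from the question). It names the columns with paste0() so that they come out as group1, weight1, and so on, rather than with the make.unique() suffixes.

```r
# Hypothetical toy data standing in for the question's data frame
toy <- data.frame(
  node = 1:2,
  V1 = c(0.10, 0.50),
  V2 = c(0.40, 0.20),
  V3 = c(0.30, 0.05),
  V4 = c(0.20, 0.25)
)

top_k <- 3
value_cols <- names(toy)[-1]

# For each row: take the positions of the top_k largest values,
# then interleave column name and value (name1, value1, name2, ...)
top_pairs <- t(apply(toy[-1], 1, function(x) {
  idx <- order(-x)[seq_len(top_k)]
  c(rbind(value_cols[idx], x[idx]))
}))

res <- cbind(toy[1], top_pairs, stringsAsFactors = FALSE)

# The values passed through character strings, so convert them back
res[] <- lapply(res, function(x) type.convert(as.character(x), as.is = TRUE))
names(res)[-1] <- paste0(c("group", "weight"), rep(seq_len(top_k), each = 2))

print(res)
```

Because the numbers are converted to text and back, very long decimals may lose a few trailing digits. If that matters, it may be safer to build the name columns and the value columns separately.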