Given a table with a column of non-unique items (here, Name) and two more columns, one with labels (Group) that are unique only within each Name and one containing values (Delay), I'm trying to end up with a table containing Name, the Group, and the Group that has the next-highest Delay value within the same Name. If there is no next-highest Delay, no row is produced.
So for a dataset such as:
df = data.frame(
Name = c('lorem', 'lorem', 'lorem', 'lorem', 'lorem', 'ipsum', 'ipsum', 'ipsum', 'ipsum', 'ipsum', 'dolor', 'dolor', 'dolor', 'dolor', 'dolor', 'dolor', 'amet', 'amet', 'amet', 'amet', 'amet', 'amet'),
Group = c('E', 'D', 'C', 'B', 'A', 'E', 'D', 'C', 'B', 'A', 'C', 'A', 'B', 'D', 'F', 'E', 'C', 'A', 'B', 'D', 'F', 'E'),
Delay = c(5, 32, 59, 86, 113, 0, 27, 54, 81, 108, 10, 37, 64, 91, 111, 118, 0, 27, 54, 81, 101, 108)
)
    Name Group Delay
1  lorem     E     5
2  lorem     D    32
3  lorem     C    59
4  lorem     B    86
5  lorem     A   113
6  ipsum     E     0
7  ipsum     D    27
8  ipsum     C    54
9  ipsum     B    81
10 ipsum     A   108
11 dolor     C    10
12 dolor     A    37
13 dolor     B    64
14 dolor     D    91
15 dolor     F   111
16 dolor     E   118
17  amet     C     0
18  amet     A    27
19  amet     B    54
20  amet     D    81
21  amet     F   101
22  amet     E   108
The desired output would be (although keeping the higher Delay value for each pair wouldn't hurt):
Name  Source Target
lorem E      D
lorem D      C
lorem C      B
lorem B      A
ipsum E      D
ipsum D      C
ipsum C      B
ipsum B      A
dolor C      A
dolor A      B
dolor B      D
dolor D      F
dolor F      E
amet  C      A
amet  A      B
amet  B      D
amet  D      F
amet  F      E
Ultimately, this will go into a sankeyNetwork graph using the networkD3 package.
I tried the following, which looks at the next row for a matching Name (after sorting). It didn't work as expected on my actual data, and it does nothing on the dummy data either:
l = data.frame(Name = character(), From = character(), Target = character())
for(i in 1:(nrow(df) - 1)){
if(df$Name[i] == df$Name[i + 1])
{
From = as.character(df$Group[i])
Target = as.character(df$Group[i + 1])
Name = as.character(df$Name[i])
}
links = rbind(l, list(Name = as.character(Name), From = as.character(From), Target = as.character(Target)))
}
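The loop above likely fails to accumulate anything for two reasons: the `rbind()` call sits outside the `if`, and it always binds onto the empty `l` while assigning the result to `links`, so `l` never grows. A minimal corrected sketch follows. It assumes the rows should first be sorted by Name and Delay, and it uses a small subset of the sample data; both the sort step and the subset are assumptions, not part of the original attempt.

```r
# Small sample in the same shape as the question's df (a subset of its rows).
df <- data.frame(
  Name  = c("lorem", "lorem", "lorem", "ipsum", "ipsum"),
  Group = c("E", "D", "C", "E", "D"),
  Delay = c(5, 32, 59, 0, 27),
  stringsAsFactors = FALSE
)

# Sort so that consecutive rows within a Name are in increasing Delay.
df <- df[order(df$Name, df$Delay), ]

links <- data.frame(Name = character(), From = character(), Target = character(),
                    stringsAsFactors = FALSE)

for (i in seq_len(nrow(df) - 1)) {
  if (df$Name[i] == df$Name[i + 1]) {
    # Bind onto the accumulating data frame, and only when the Names match.
    links <- rbind(links, data.frame(
      Name   = df$Name[i],
      From   = df$Group[i],
      Target = df$Group[i + 1],
      stringsAsFactors = FALSE
    ))
  }
}

links
```

Growing a data frame with `rbind()` inside a loop is fine for small inputs but gets slow on large ones, which is one reason a grouped, vectorised approach is usually preferred.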
We can group by Name and take the lead of Group, ordered by Delay:
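library(dplyr)

df %>%
  group_by(Name) %>%
  # lead() with order_by picks the row with the next-highest Delay in each Name
  transmute(Source = Group,
            Target = lead(Group, order_by = Delay),
            Value = lead(Delay, order_by = Delay)) %>%
  # drop rows that have no next-highest Delay
  na.omit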
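Since the question mentions feeding the result into `sankeyNetwork`, here is a hedged sketch of that last step. It rebuilds the Source/Target/Value table in base R (using `split` and `order` in place of the dplyr pipeline, purely so the example is self-contained) on a subset of the sample data, then converts the labels to the zero-based node indices that networkD3 expects. The names `pairs`, `nodes`, `source_id` and `target_id` are illustrative choices, not from the original answer.

```r
# Subset of the question's data, enough to show two Names.
df <- data.frame(
  Name  = c("lorem", "lorem", "lorem", "dolor", "dolor", "dolor"),
  Group = c("E", "D", "C", "C", "A", "B"),
  Delay = c(5, 32, 59, 10, 37, 64),
  stringsAsFactors = FALSE
)

# Per Name: sort by Delay and pair each Group with the next one.
pairs <- lapply(split(df, df$Name), function(d) {
  d <- d[order(d$Delay), ]
  n <- nrow(d)
  if (n < 2) return(NULL)          # no next-highest Delay, so no row
  data.frame(
    Name   = d$Name[-n],
    Source = d$Group[-n],
    Target = d$Group[-1],
    Value  = d$Delay[-1],          # Delay of the Target row, as in the answer
    stringsAsFactors = FALSE
  )
})
links <- do.call(rbind, pairs)
rownames(links) <- NULL

# networkD3 wants a node table plus zero-based integer indices into it.
nodes <- data.frame(name = unique(c(links$Source, links$Target)),
                    stringsAsFactors = FALSE)
links$source_id <- match(links$Source, nodes$name) - 1L
links$target_id <- match(links$Target, nodes$name) - 1L

# Build the diagram only if networkD3 is installed; print(p) displays it.
if (requireNamespace("networkD3", quietly = TRUE)) {
  p <- networkD3::sankeyNetwork(
    Links = links, Nodes = nodes,
    Source = "source_id", Target = "target_id",
    Value = "Value", NodeID = "name"
  )
}
```

Here `Value` carries the Delay of the target row, matching the answer's `lead(Delay, order_by = Delay)`; whether that is the right link width for the diagram depends on what the Sankey is meant to show.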