[英]How can I calculate network measures separately for different periods using igraph in r?
This is my transaction data:这是我的交易数据:
data:
id from_id to_id amount date_trx
<fctr> <fctr> <fctr> <dbl> <date>
0 7468 5695 700.0 2005-01-04
1 6213 9379 11832.0 2005-01-08
2 7517 8170 1000.0 2005-01-10
3 6143 9845 4276.0 2005-01-12
4 6254 9640 200.0 2005-01-14
5 6669 5815 200.0 2005-01-20
6 6934 8583 49752.0 2005-01-24
7 9240 8314 19961.0 2005-01-26
8 6374 8865 1000.0 2005-01-30
9 6143 6530 13.4 2005-01-31
...
I formed the network where the edges are formed between the nodes(accounts) from_id
's and to_id
's, and the weights of the edges determined by the amounts they transact.我形成了在from_id
和to_id
的节点(帐户)之间形成边的网络,边的权重由它们交易的数量决定。 Then I calculated the network's measures such as degree centrality, betweenness centrality, closeness centrality etc.然后我计算了网络的度量,例如度中心性、介数中心性、接近中心性等。
That is:那是:
relations <- data.frame(from = data$from_id,
to = data$to_id)
network <- graph_from_data_frame(relations, directed = T)
E(network)$weight <- data$amount
V(network)$degree <- degree(network, normalized=TRUE)
V(network)$betweenness <- betweenness(network, normalized=TRUE)
V(network)$closeness <- closeness(network, normalized=TRUE)
But now I want to calculate these measures periodically.但现在我想定期计算这些措施。 For example, I want to divide my data by weeks(starting from the very first transaction date) and calculate the network measures for each account for corresponding weeks.例如,我想将我的数据除以周(从第一个交易日期开始),并计算每个账户对应周的网络度量。
data$week <- unsplit(tapply(data$date_trx, data$from_id, function(x) (as.numeric(x-min(data$trx_date)) %/% 7)+1),data$from_id)
select(data, from_id, to_id, date_trx, week, amount) %>% arrange(date_trx)
from_id to_id date_trx week amount
<fctr> <fctr> <date> <dbl> <dbl>
6644 6934 2005-01-01 1 700
6753 8456 2005-01-01 1 600
9242 9333 2005-01-01 1 1000
9843 9115 2005-01-01 1 900
7075 6510 2005-01-02 1 400
8685 7207 2005-01-02 1 1100
... ... ... ... ...
9866 6697 2010-12-31 313 95.8
9866 5992 2010-12-31 313 139.1
9866 5797 2010-12-31 313 72.1
9866 9736 2010-12-31 313 278.9
9868 8644 2010-12-31 313 242.8
9869 8399 2010-12-31 313 372.2
As I divided my data into weekly periods, now I need to form networks of accounts for each week separately and so that I can calculate network measures for accounts in weekly periods.由于我将数据划分为每周期间,现在我需要分别形成每周的账户网络,以便我可以计算每周期间账户的网络度量。 How can I do that for 313 weeks and at once?我怎样才能一次完成 313 周?
One possibility is splitting your data according to week, transform each week into an igraph object and then add the centralities and degree to all graphs at once, using lapply.一种可能性是根据周拆分数据,将每周转换为 igraph object,然后使用 lapply 一次将中心性和度数添加到所有图表。 My initial data.frame is named d (see below):我的初始 data.frame 被命名为 d (见下文):
library(igraph)
head(d)
from_id to_id weight date_trx
1 D I 8 1999-09-12
2 E H 10 1999-10-20
3 A G 10 1999-09-10
4 C G 13 1999-04-15
5 E J 9 1999-06-26
6 B F 15 1999-04-30
First get the week:首先获取星期:
d$week <- strftime(d$date_trx, format = "%V")
Now split by week:现在按周划分:
dd <- split(d, d$week )
Transform each week into an igraph
将每周转换为igraph
dd <- lapply(dd, function(x) graph_from_data_frame(x, directed = T))
Write a function that does all the operations you want to carry out, and then apply it to each graph:编写一个 function 来执行您要执行的所有操作,然后将其应用于每个图形:
my.funct <- function(x) {
V(x)$degree <- degree(x, normalized=TRUE)
V(x)$betweenness <- betweenness(x, normalized=TRUE)
V(x)$closeness <- closeness(x, normalized=TRUE)
return(x)
}
dd <- lapply(dd, my.funct)
For instance, for the first week:例如,第一周:
dd[[1]]
IGRAPH f515e52 DN-- 4 2 --
+ attr: name (v/c), degree (v/n), betweenness (v/n), closeness (v/n), weigth (e/n), date_trx
| (e/n), week (e/c)
+ edges from f515e52 (vertex names):
[1] B->F C->G
get.vertex.attribute(dd[[1]])
$name
[1] "B" "C" "F" "G"
$degree
[1] 0.3333333 0.3333333 0.3333333 0.3333333
$betweenness
[1] 0 0 0 0
$closeness
[1] 0.3333333 0.3333333 0.2500000 0.2500000
get.edge.attribute(dd[[1]])
$weight
[1] 9 7
$date_trx
[1] 10595 10601
$week
[1] "01" "01"
You can then retrieve all centralities and degree for all weeks:然后,您可以检索所有周的所有中心性和度数:
ddd <- lapply(dd, function(x) igraph::as_data_frame(x, what = "vertices") )
# keep in mind that `split` names the objects in the list according to
# the value it used to split, therefore the name of the data.frames in
# the list is the name of the week.
library(dplyr)
ddd <- bind_rows(ddd, .id="week")
head(ddd)
week name degree betweenness closeness
1 01 E 1.4444444 0 0.2000000
2 01 D 1.5555556 0 0.1666667
3 01 B 0.7777778 0 0.2000000
4 01 A 1.0000000 0 0.2000000
5 01 C 0.7777778 0 0.1666667
6 01 F 1.0000000 0 0.1000000
In case, you can use this to merge back to the original edges list.万一,您可以使用它来合并回原始边缘列表。
Data used in this example:本例中使用的数据:
set.seed(123)
d <- data.frame(from_id = sample(LETTERS[1:5], 2000, replace = T),
to_id = sample(LETTERS[6:10], 2000, replace = T),
weight = rpois(2000, 10),
date_trx = sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"), 2000, replace = T))
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.