Finding Matches in a Directed Network in R
I am looking at a directed data set using igraph, based on people who follow one another on Twitter. I have a data table:
follower_user_id | followed_user_id | gender_of_follower | gender_of_followed
1 | 2 | F | M
2 | 3 | M | M
3 | 2 | M | M
and so on.
I would like to assess the matches, that is, the situations in which users mutually follow one another, so that I can then look at who was not followed back by anyone and whether, for example, men are more likely to be followed back than women. But I am not sure how to filter down to the matches in the first place.
So far, I think the best way to do this is to build a matrix of all user ids against all user ids and count the number of times each pair appears together.
M <- table (df$follower_user_id, df$followed_user_id)
follower.followed.matrix <- M %*% t(M)
XXXXX 1 2 3
1 0 1 0
2 0 0 1
3 0 1 0
But I am unsure how to merge the entries where a pair and its converse are combined (e.g. where the 2-3 pairing = 2). Is it possible to use the 'reshape2' package to melt a directed network?
I think the best approach is to do it this way and then merge the gender data into a new data.table of matches, but I am open to suggestions for filtering this data more efficiently. I am still new to R, so any help is appreciated.
Getting the matrix that you need (the adjacency matrix) is built into igraph. I will use a slightly bigger example than yours to make sure that the solution handles all cases.
FOL = read.table(text="follower_user_id followed_user_id gender_of_follower gender_of_followed
1 2 F M
2 3 M M
3 2 M M
4 2 M M
2 4 M M
2 5 M F
4 5 M F",
header=TRUE)
library(igraph)
## turn it into a graph and compute the adjacency matrix
g = graph_from_edgelist(as.matrix(FOL[,1:2]))
AM = as.matrix(as_adjacency_matrix(g))
Now with the adjacency matrix you can quickly compute the pairs for which A follows B and B follows A.
sapply(1:5, function(x) { AM[x,] * AM[,x] })
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 0 0 0
[2,] 0 0 1 1 0
[3,] 0 1 0 0 0
[4,] 0 1 0 0 0
[5,] 0 0 0 0 0
An entry [i, j] is 1 exactly when i follows j and j follows i. You can see that the needed pairs are (2,3), (2,4), (3,2) and (4,2), i.e. each mutual pair shows up once in each direction.
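To get from that matrix to the follow-back question, one possible continuation is sketched below. It is not part of the original answer. It assumes the same toy data, builds the adjacency matrix in base R so that it runs without igraph (for this data it should equal AM from above), and the names `mutual`, `mutual_pairs`, `followed_back` and `followback_rate` are made up for illustration.

```r
# Same toy edge list as in the answer above.
FOL <- read.table(text = "follower_user_id followed_user_id gender_of_follower gender_of_followed
1 2 F M
2 3 M M
3 2 M M
4 2 M M
2 4 M M
2 5 M F
4 5 M F", header = TRUE)

# 0/1 adjacency matrix built in base R (assumption: equivalent to AM above).
ids <- sort(unique(c(FOL$follower_user_id, FOL$followed_user_id)))
from <- match(FOL$follower_user_id, ids)
to   <- match(FOL$followed_user_id, ids)
AM <- matrix(0, length(ids), length(ids), dimnames = list(ids, ids))
AM[cbind(from, to)] <- 1

# Element-wise product with the transpose: 1 only where i -> j and j -> i.
# This is the vectorised form of the sapply() call above.
mutual <- AM * t(AM)

# List each mutual pair once by looking at the upper triangle only.
idx <- which(mutual == 1 & upper.tri(mutual), arr.ind = TRUE)
mutual_pairs <- data.frame(user_a = ids[idx[, "row"]],
                           user_b = ids[idx[, "col"]])

# Flag every row of the original table: was this follow reciprocated?
FOL$followed_back <- mutual[cbind(from, to)] == 1

# Share of follows that were reciprocated, by gender of the follower.
followback_rate <- tapply(FOL$followed_back, FOL$gender_of_follower, mean)

print(mutual_pairs)
print(followback_rate)
```

Because the flag is stored on the original rows, the gender columns are already alongside it and no separate merge is needed. If you stay in igraph, `which_mutual(g)` should give a comparable per-edge flag, but check that the edge order matches your table before relying on it.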