I have a graph, have computed the PageRank of its vertices, and would now like to compute clusters for the 20 nodes with the highest PageRank. I am using graph-tool and networkx so far.
Is there a known and practical way to do this?
Since your question is a bit vague, I'll try to answer assuming that you are looking for a way to get the central cluster of your document collection. In this picture, the central 5-item cluster would be [B, C, E, F, D].
In slightly Pythonic pseudocode, would it be something like this?
n = 0
center = node.with_highest_rank()
cluster = {center: {}}
current_connexion = center
while n < 20:
    # best-ranked node citing current_connexion that is not in the cluster yet
    main_connexion = node.citing_node_with_higher_rank(current_connexion).filter(not in cluster.keys())
    # record the new member so the filter above skips it on the next pass
    cluster[main_connexion] = {}
    n += 1
    # if ranks are higher on connexion level 2 than the next node on level 1, look down
    if node.citing_node_with_higher_rank(main_connexion).rank > node.citing_node_with_higher_rank(current_connexion).rank:
        current_connexion = main_connexion
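For something you can actually run, here is a minimal sketch of the same greedy idea in plain Python. It is a simplified variant, not an established clustering algorithm: instead of tracking a single current_connexion, it looks at every node citing any current member and adds the best-ranked one. The function name central_cluster, its parameters, and the toy ranks and links are all made up for illustration. With networkx, the ranks dict could come from nx.pagerank(G) and the citing nodes from G.predecessors(node) on a DiGraph.

```python
def central_cluster(ranks, citing, size=20):
    """Greedily grow a cluster around the highest-ranked node.

    ranks:  dict mapping node -> PageRank score
    citing: dict mapping node -> iterable of nodes that link to it
    (Both names are assumptions made for this sketch.)
    """
    center = max(ranks, key=ranks.get)
    cluster = [center]      # members in the order they were added
    members = {center}      # same members, for fast lookup
    while len(cluster) < size:
        # Candidates: nodes outside the cluster that cite any current member.
        frontier = {
            src
            for node in cluster
            for src in citing.get(node, ())
            if src not in members
        }
        if not frontier:
            break  # nothing left that links into the cluster
        best = max(frontier, key=lambda n: ranks.get(n, 0.0))
        cluster.append(best)
        members.add(best)
    return cluster


# Toy data, invented for the example (not taken from the picture).
ranks = {"A": 0.03, "B": 0.38, "C": 0.34, "D": 0.04, "E": 0.08, "F": 0.05}
citing = {
    "A": ["D"],
    "B": ["C", "D", "E", "F"],
    "C": ["B"],
    "D": ["E"],
    "E": ["F"],
    "F": ["E"],
}

print(central_cluster(ranks, citing, size=5))  # ['B', 'C', 'E', 'F', 'D']
```

Ties in rank are broken arbitrarily here, so if that matters for your data you would need an explicit tie-breaking rule.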
Advice: on Stack Overflow, the audience is mostly developers. Developers need a concrete use case, concrete code, and precise definitions. If you have a more general, theoretical or scientific question (here, graph theory), have a look at other communities such as Computer Science.