简体   繁体   中英

Using a graph-tool efficiently

After a long thought, I finally decided to post this question here. Few days back I started using graph-tool to do various things. I have been using Networkx before that. I have already seen the impressive performance comparision and thought that everything would be simple enough. However, I immediately ran into the speed issue and asked a question related to a particular aspect of it. I got a quick answer that satisfied me. However, now this speed issue is bugging me every now and then and I can't find any documentation about graph-tool that is related to efficiently using it. For example, from the answer to my last question, I came to realize that it is better to add all edges together instead of one by one which is a very important point to note but hasn't been mentioned anywhere! I am now having two more similar issues:

(1) How do I choose a random neighbour of a given node? I can only see the following solution:

nbr = np.random.choice(list(v.all_neighbours()))

since v.all_neighbours() is a generator, I must convert it into the list to choose a random element. This slows down the code but I don't see any better way.

(2) I want to assign a 1d vector (is list okay?) to each of the vertices in the graph and later I am exchanging and modifying them in a particular way. This is simply a property map and I would like to see some documentation about how to use this efficiently. However, I can't locate anything.

(3) I am trying to simulate a triadic closure in some network which itself is changing with time. Thus, at every time step, I need an information about the neighbours of each vertex in the graph. Again, I must create a list (or numpy array):

nbrs = [w for w in v.neighbours()]

which decreases the speed of my code substantially. This means that I am not doing this correctly but I couldn't find any documentation that would tell me how to use neighbours efficiently in graph-tool.

Somehow Networkx programs that I have written for same tasks have completely outperformed the graph-tool codes I simply can't buy this.

This list might increase and hence I would be very glad if somebody could point me to some documentation about using graph-tool efficiently apart from answering the above mentioned specific questions.

Thanks in advance.

I'll try to make more " graph-tool -specific" answers:

1) well actually this one is related to python, so you might want to use the solution from this thread using random.shuffle However, if you are going to do this repeatedly, (and if you have enough available memory), I think it might be better to get the adjacency matrix as a scipy sparse matrix then use that matrix:

import graph_tool
from graph_tool.spectral import adjacency
import numpy as np

g = graph_tool.Graph()
g.add_vertex(100)
edges = np.random.randint(0, 100, (500,2))
g.add_edge_list(edges)

mat = adjacency(g).tolil()
rows = mat.rows.copy() # mat.rows = in-neighbours for each vertex
map(np.random.shuffle, rows)
rnd_neighbours = { i:row[0] for i, row in enumerate(rows) if row }

Where rnd_neighbours contains the index of one random neighbour for each vertex of nonzero in-degree.

2) reading the graph-tool documentation regarding PropertyMaps and the detailed version , list s are accepted as python::object or simply object . You can then access them as elements in the PropertyMap .

3) For triadic closure, look at the clustering module and at this thread on the mailing list.

EDIT: by the way, I forgot to mention it, but you can access and change the number of OpenMP threads with openmp_enabled , openmp_get_num_threads , and openmp_set_num_threads in graph-tool . Can't find it in the doc, though... But here is the source .

You can access neighbours and vertices as arrays, which will speedup your code, as described in the documentation: https://graph-tool.skewed.de/static/doc/quickstart.html#fast-iteration-over-vertices-and-edges

For example, instead of doing:

nbr = np.random.choice(list(v.out_neighbours()))

you should do:

nbr = np.random.choice(g.get_out_neighbours(v))

which should be substantially faster, as arrays are used instead of lists.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM