Pandas to bipartite graph

I have already added the nodes to my graph, but I can't figure out how to add the edges. Each edge corresponds to a value of 1 in my pivot table, which has the following form:

movie_id  1     2     3     4     5     ...  500
user_id                                 ...                              
501       1.0   0.0   1.0   0.0   0.0  ...   0.0  
502       1.0   0.0   0.0   0.0   0.0  ...   0.0   
503       0.0   0.0   0.0   0.0   0.0  ...   1.0   
504       0.0   0.0   0.0   1.0   0.0  ...   0.0  
...       ...
1200      ...

This is the code I used for my nodes:

B = nx.Graph()
B.add_nodes_from(user_rating_pivoted.index, bipartite=0)
B.add_nodes_from(user_rating_pivoted.columns, bipartite=1)

And I imagine the edges should be added in a similar way:

add_edges_from(...) for idx, row in user_rating_pivoted.iterrows())

Let's add prefixes to those indices and columns, and use them as nodes to more easily associate the connections:

print(df)

          movie_1  movie_2  movie_3  movie_4  movie_5  movie_6
user_1      1.0      1.0      1.0      1.0      0.0      0.0
user_2      1.0      0.0      0.0      0.0      0.0      0.0
user_3      0.0      1.0      0.0      0.0      0.0      1.0
user_4      1.0      0.0      1.0      0.0      1.0      0.0
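
The prefixing step itself is not shown, so here is a minimal sketch of one way to do it. It assumes the pivot table is named user_rating_pivoted as in the question. The small frame is only a stand-in that reproduces the values printed above, and the user_ and movie_ prefixes are arbitrary labels:

```python
import pandas as pd

# Stand-in for the real pivot table: same values as the printed example
user_rating_pivoted = pd.DataFrame(
    [[1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0, 0.0, 1.0],
     [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]],
    index=pd.Index([1, 2, 3, 4], name="user_id"),
    columns=pd.Index([1, 2, 3, 4, 5, 6], name="movie_id"),
)

# Prefix both axes so user and movie labels can never collide as node names
df = user_rating_pivoted.copy()
df.index = ["user_" + str(i) for i in df.index]
df.columns = ["movie_" + str(c) for c in df.columns]
```

With distinct prefixes, a user and a movie that share the same numeric id still end up as two separate nodes.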

To get the edges (and keep the node names), we can reshape the dataframe with pandas. stack gives a Series with a MultiIndex of (user, movie) pairs, which we can filter down to the entries equal to 1. add_edges_from then adds all of those pairs as edges:

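import networkx as nx

B = nx.Graph()
# Users form one node set (bipartite=0), movies the other (bipartite=1)
B.add_nodes_from(df.index, bipartite=0)
B.add_nodes_from(df.columns, bipartite=1)

# stack() yields a Series indexed by (user, movie) pairs; keep the pairs rated 1
s = df.stack()
B.add_edges_from(s[s==1].index)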
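
For comparison, the iterrows idea from the question could look roughly like the sketch below. It rebuilds the small example frame so that it runs on its own, and the names user, movie and value are only illustrative. It should produce the same edges as the stack version, just with an explicit Python loop, which is likely to be slower on a large table:

```python
import networkx as nx
import pandas as pd

# Same example values as the printed dataframe
df = pd.DataFrame(
    [[1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
     [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0, 0.0, 1.0],
     [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]],
    index=["user_1", "user_2", "user_3", "user_4"],
    columns=["movie_1", "movie_2", "movie_3", "movie_4", "movie_5", "movie_6"],
)

B = nx.Graph()
B.add_nodes_from(df.index, bipartite=0)
B.add_nodes_from(df.columns, bipartite=1)

# One (user, movie) edge for every cell equal to 1
B.add_edges_from(
    (user, movie)
    for user, row in df.iterrows()
    for movie, value in row.items()
    if value == 1
)

# Edge list from the stack-based approach, kept for comparison
s = df.stack()
stacked_edges = set(s[s == 1].index)
```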

We can use bipartite_layout for a nice layout of the bipartite graph:

top = nx.bipartite.sets(B)[0]
pos = nx.bipartite_layout(B, top)

nx.draw(B, pos=pos, 
        node_color='lightgreen', 
        node_size=2500,
        with_labels=True)


Note that such highly sparse matrices are likely to produce disconnected graphs, i.e. graphs in which some nodes cannot be reached from others. In that case, attempting to obtain both sets will raise an error, as specified in the nx.bipartite.sets documentation:

AmbiguousSolution – Raised if the input bipartite graph is disconnected and no container with all nodes in one bipartite set is provided. When determining the nodes in each bipartite set more than one valid solution is possible if the input graph is disconnected.

In that case you can just plot it as a regular graph with:

from matplotlib import rcParams

rcParams['figure.figsize'] = 10, 8
nx.draw(B, 
        node_color='lightgreen', 
        node_size=2000,
        with_labels=True)
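
If you would rather keep the two-column layout for a disconnected graph, one option is to build the top set from the bipartite node attribute instead of calling nx.bipartite.sets. The sketch below assumes the nodes were added with bipartite=0 and bipartite=1 as above. The tiny graph in it is made up purely to have some isolated nodes:

```python
import networkx as nx

# Hypothetical small graph in which user_3 and movie_3 have no edges
B = nx.Graph()
B.add_nodes_from(["user_1", "user_2", "user_3"], bipartite=0)
B.add_nodes_from(["movie_1", "movie_2", "movie_3"], bipartite=1)
B.add_edges_from([("user_1", "movie_1"), ("user_2", "movie_2")])

# Read the node set from the attribute, so no two-colouring has to be inferred
top = {n for n, d in B.nodes(data=True) if d["bipartite"] == 0}
pos = nx.bipartite_layout(B, top)

# Drawing needs matplotlib; uncomment to plot
# nx.draw(B, pos=pos, node_color='lightgreen', node_size=2000, with_labels=True)
```

Because the node set is passed explicitly, the layout does not depend on the graph being connected.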

