
How to get rid of duplicates in a graph

I'm building a social graph from a list of tuples 'friends' like this:

(4118181 {'last_name': 'Belousov', 'first_name': 'Mikhail'})

Here's the function:

def addToGraph (g, start, friends):
    g.add_nodes_from(friends)
    egdes_to_add = [(start, entry[0]) for entry in friends]
    g.add_edges_from(edges_to_add)
    return g

As a result I get a graph with twice as many nodes as expected. The first set, with attributes, comes from

g.add_nodes_from(friends)

and the second is from

g.add_edges_from(edges_to_add)

I read the docs, but I can't figure out how to add both the nodes with their attributes and the edges between those nodes.

Your function adds an edge between the node start and every node in friends. I tried your code and I don't get any duplicate nodes. Here is my full example (note that I corrected a couple of errors in your code: the misspelled variable egdes_to_add and the missing comma in the tuple).

import networkx as nx

friends = [
    (4118181, {'last_name': 'Belousov', 'first_name': 'Mikhail'}),
    (1111111, {'last_name': 'A', 'first_name': 'B'}),
    (2222222, {'last_name': 'C', 'first_name': 'D'}),
    (3333333, {'last_name': 'E', 'first_name': 'F'})
]

def addToGraph(g, start, friends):
    g.add_nodes_from(friends)
    edges_to_add = [(start, entry[0]) for entry in friends]
    g.add_edges_from(edges_to_add)

G = nx.Graph()
addToGraph(G, 4118181, friends)

print('Nodes:', G.nodes())
print('Edges:', G.edges())

Output:

Nodes: [3333333, 4118181, 2222222, 1111111]
Edges: [(3333333, 4118181), (4118181, 4118181), (4118181, 2222222), (4118181, 1111111)]
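One way to confirm that no duplicates were created is to compare the node count with the number of entries in friends and to check that the attributes are still attached. This is only a minimal sketch with a shortened friends list. The variable names node_count and attrs are mine, not from the question.

```python
import networkx as nx

# Shortened version of the friends list used above
friends = [
    (4118181, {'last_name': 'Belousov', 'first_name': 'Mikhail'}),
    (1111111, {'last_name': 'A', 'first_name': 'B'}),
]

G = nx.Graph()
G.add_nodes_from(friends)
G.add_edges_from((4118181, entry[0]) for entry in friends)

# One node per friend: adding the edges reused the existing integer nodes
node_count = G.number_of_nodes()

# Map each node to its attribute dict to check the attributes survived
attrs = dict(G.nodes(data=True))

print(node_count == len(friends))
print(attrs[1111111])
```

Because the start node is itself in friends, the edge list includes the self-loop (4118181, 4118181), which is why it shows up in the output above.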

Your nodes are integers, but the endpoints of your edges are strings. When you add the nodes, the graph gets a set of nodes whose names are integers. When you add an edge, it sees a new edge between the strings '4118181' and '340559596'. Python treats those as distinct from the integers 4118181 and 340559596, so networkx creates new nodes with the string names and puts the edge between them.

To fix this, you'll need to convert the strings to integers before adding the edges.
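Here is a small sketch of both the problem and the fix. I'm assuming the IDs arrive as strings from somewhere (a file, JSON, an API response). The raw_edges list and the names bad and good are made up for illustration.

```python
import networkx as nx

friends = [
    (4118181, {'last_name': 'Belousov', 'first_name': 'Mikhail'}),
    (1111111, {'last_name': 'A', 'first_name': 'B'}),
]

# Hypothetical input: the same IDs, but as strings
raw_edges = [('4118181', '1111111')]

# Problem: string endpoints do not match the integer nodes,
# so the graph ends up with 2 integer nodes plus 2 string nodes
bad = nx.Graph()
bad.add_nodes_from(friends)
bad.add_edges_from(raw_edges)
bad_count = bad.number_of_nodes()

# Fix: convert each endpoint to int before adding the edge
good = nx.Graph()
good.add_nodes_from(friends)
good.add_edges_from((int(u), int(v)) for u, v in raw_edges)
good_count = good.number_of_nodes()

print(bad_count, good_count)  # prints: 4 2
```

Converting at the point where the edges are built keeps a single type for node names throughout the graph, so the attributes added by add_nodes_from stay attached to the nodes the edges refer to.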
