Storing objects in class instance dictionary has unexpected results

Apologies in advance for the lengthy post and thanks to anyone who has some time to take a look. Complete working example at end of post.

I would like some help understanding the behavior of my code. I wrote two simple graph-oriented classes, one for nodes and one for the graph itself. The graph has a dictionary, self.nodes, that maps each index to its node instance, and each node keeps a list of its neighbors, self.neighbors (these selfs belong to Graph and Node respectively).

What's strange is that I can always get a complete list of a node's neighbors by going through the Graph instance's nodes dictionary, but if I try to get neighbors of neighbors by accessing a node through another node's neighbors list, I often get a node with no neighbors, which is incorrect. For example, once I read in and process a graph, I can print out each node and its neighbors perfectly by calling a Graph instance's listNodes(), which gives me this for one example graph:

(i = 1, neighbors: 5 2 4 3)
(i = 2, neighbors: 1)
(i = 3, neighbors: 1 8 9)
(i = 4, neighbors: 1 9 6)
(i = 5, neighbors: 1)
(i = 6, neighbors: 4 7)
(i = 7, neighbors: 6)
(i = 8, neighbors: 3)
(i = 9, neighbors: 4 3)

So I can access the neighbors of a node when I access it directly from the self.nodes dictionary in a graph instance. However, I cannot access the neighbors of a node's neighbors via the node's list of neighbors. For example, when I run printNeighborsNeighbors(3), implemented like so:

def printNeighborsNeighbors(self, start_num):
    node = self.nodes[start_num]
    print(node.neighbors)

here is the output:

[(i = 1, neighbors: ), (i = 8, neighbors: 3), (i = 9, neighbors: )]

This indicates that node 1 and 9 have zero neighbors, but that's entirely wrong. The graph looks like this:

[image: graphImage]

Here is the ordering for the input of the neighbors:

5 1
1 2
1 4
1 3
3 8
4 9
3 9
4 6
6 7

Here are the class implementations:

class Node:   
    def __init__(self, i):
        self.index = i
        self.neighbors = []

    def createNeighbor(self, neighbor):
        self.neighbors.append(neighbor)

    def __str__(self):
        neighbors = [str(n.index) for n in self.neighbors]
        return "(i = %d, neighbors: %s)"%(self.index, " ".join(neighbors))

    def __repr__(self):
        return str(self)

and

class Graph:
    def __init__(self):
        self.nodes = defaultdict(lambda: False)

    def neighborNodes(self, node, neighbor): 
        if not self.nodes[node.index]:
            self.nodes[node.index] = node

        if not self.nodes[neighbor.index]:
            self.nodes[neighbor.index] = neighbor

        self.nodes[node.index].createNeighbor(neighbor)
        self.nodes[neighbor.index].createNeighbor(node)

    def printNeighborsNeighbors(self, start_num):
        node = self.nodes[start_num]
        print(node.neighbors)
        for n in node.neighbors:
            print(n.neighbors)

    def listNodes(self):
        for node in self.nodes.values():
            print(node)

Here's what I'm thinking:

  • This does not solely relate to the left-right ordering within each line of the input text file, because 3 has two "bad" neighbors (where info is lost), and one was input as 1 3 and the other as 3 9.
  • This does not solely relate to the ordering of the lines in the input text file, because the good neighbor for 3 was input before one bad neighbor of 3 but after the other bad neighbor.
  • When I run printNeighborsNeighbors(4), 9 and 6 have their neighbors listed correctly but 1 has nothing listed. So it seems to be an all-or-nothing error: either you've got all the true neighbors or you don't have a list of neighbors at all. This part is the most confusing. It's not a matter of overwriting an object; this feels more like some kind of C++-style object slicing.

I can easily get around this problem by always going through the graph dictionary, but I'd like to know what's going on here. Seems like I am misunderstanding something important about how Python handles these objects.

Thanks for any corrections or suggestions of what to try.


Following MK's suggestion, here's a working example:

input.txt

1
9 9
5 1
1 2
1 4
1 3
3 8
4 9
3 9
4 6
6 7
8

and I just ran this .py so it should work:

import copy
from collections import defaultdict


class Node:   
    def __init__(self, i):
        self.index = i
        self.neighbors = []
        self.currentPath = []

    def createNeighbor(self, neighbor):
        self.neighbors.append(neighbor)

    def __str__(self):
        neighbors = [str(n.index) for n in self.neighbors]
        return "(i = %d, neighbors: %s)"%(self.index, " ".join(neighbors))

    def __repr__(self):
        return str(self)

class Graph:
    def __init__(self):
        self.nodes = defaultdict(lambda: False)

    def neighborNodes(self, node, neighbor): 
        if not self.nodes[node.index]:
            self.nodes[node.index] = node

        if not self.nodes[neighbor.index]:
            self.nodes[neighbor.index] = neighbor

        self.nodes[node.index].createNeighbor(neighbor)
        self.nodes[neighbor.index].createNeighbor(node)

    def printNeighborsNeighbors(self, start_num):
        node = self.nodes[start_num]
        print(node.neighbors)
        #for n in node.neighbors:
         #   print(n.neighbors)

    def listNodes(self):
        for node in self.nodes.values():
            print(node)


f = open('input.txt', 'r')
t = int(f.readline())


for _ in range(t):

    graph = Graph()
    n, m = f.readline().split()
    n = int(n)
    m = int(m)
    for _ in range(m):
        x, y = f.readline().split()
        x = int(x)
        y = int(y)
        nodeX = Node(x)
        nodeY = Node(y)
        graph.neighborNodes(nodeX, nodeY)
    s = int(f.readline())

    print("running graph.listNodes")
    graph.listNodes()
    print("running print neighbors neighbors")
    graph.printNeighborsNeighbors(4)

The problem is that you are enforcing uniqueness of node objects by index. When the neighborNodes() method is called, it gets a newly created Node instance, which you only add to self.nodes if needed. That part is "correct": it records only one instance of Node per index. But you are still creating a new instance of Node that you will throw away, except that you first pass it to the Node.createNeighbor() method, so that throw-away instance gets recorded as a node's neighbor. As a result, only one direction of the neighborhood relationship is recorded: the throw-away copy never has its own neighbors list updated, which is why it prints as having no neighbors.
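To see the throw-away instance concretely, here is a stripped-down sketch of the question's classes, fed only the first two edges of the input. It is an illustration rather than the asker's exact code: createNeighbor is inlined as a plain append, and the names registered and seen_from_2 are made up for this example.

```python
from collections import defaultdict


class Node:
    def __init__(self, i):
        self.index = i
        self.neighbors = []


class Graph:
    def __init__(self):
        self.nodes = defaultdict(lambda: False)

    def neighborNodes(self, node, neighbor):
        # Same logic as in the question: register each node only once...
        if not self.nodes[node.index]:
            self.nodes[node.index] = node
        if not self.nodes[neighbor.index]:
            self.nodes[neighbor.index] = neighbor
        # ...but append the objects that were passed in, which may be
        # fresh copies that were never registered.
        self.nodes[node.index].neighbors.append(neighbor)
        self.nodes[neighbor.index].neighbors.append(node)


graph = Graph()
graph.neighborNodes(Node(5), Node(1))  # edge "5 1": this Node(1) is registered
graph.neighborNodes(Node(1), Node(2))  # edge "1 2": this Node(1) is a throw-away

registered = graph.nodes[1]                # hypothetical name: the registered node 1
seen_from_2 = graph.nodes[2].neighbors[0]  # hypothetical name: node 1 as node 2 sees it

print(registered is seen_from_2)                 # False: two different objects
print([n.index for n in registered.neighbors])   # [5, 2]
print([n.index for n in seen_from_2.neighbors])  # []
```

Both objects have index 1, but only the registered one ever has neighbors appended to it, which matches the all-or-nothing pattern described in the question.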

Here is one possible fix:

if not self.nodes[node.index]:
    self.nodes[node.index] = node
else:
    node = self.nodes[node.index]

if not self.nodes[neighbor.index]:
    self.nodes[neighbor.index] = neighbor
else:
    neighbor = self.nodes[neighbor.index]

But I don't like it. What you really need is to stop creating throw-away instances; they are bad for memory, performance, readability, and correctness. You could add a method called getNode(n) to Graph, which returns the node object if it already exists, or creates (and registers) a new one if it doesn't exist yet. Then you would make the Node constructor private (probably not possible in Python) so that nothing but Graph can create nodes.
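A minimal sketch of the getNode(n) idea might look like the following. It rests on a few assumptions that are not in the original code: neighborNodes() here takes two integer indices instead of two Node objects, self.nodes is a plain dict rather than a defaultdict, and the edge list is hard-coded from the question's input instead of being read from input.txt.

```python
class Node:
    def __init__(self, i):
        self.index = i
        self.neighbors = []

    def __repr__(self):
        neighbors = " ".join(str(n.index) for n in self.neighbors)
        return "(i = %d, neighbors: %s)" % (self.index, neighbors)


class Graph:
    def __init__(self):
        # Plain dict (assumption): a missing key means "no such node yet".
        self.nodes = {}

    def getNode(self, index):
        # Return the registered node for this index, creating and
        # registering it first if it does not exist yet.
        if index not in self.nodes:
            self.nodes[index] = Node(index)
        return self.nodes[index]

    def neighborNodes(self, x, y):
        # Assumed signature: takes indices, so callers never build Nodes.
        node = self.getNode(x)
        neighbor = self.getNode(y)
        node.neighbors.append(neighbor)
        neighbor.neighbors.append(node)


# Edges from the question's input, in the same order.
edges = [(5, 1), (1, 2), (1, 4), (1, 3), (3, 8), (4, 9), (3, 9), (4, 6), (6, 7)]

graph = Graph()
for x, y in edges:
    graph.neighborNodes(x, y)

print(graph.nodes[3].neighbors)
# [(i = 1, neighbors: 5 2 4 3), (i = 8, neighbors: 3), (i = 9, neighbors: 4 3)]
```

Because every Node is created inside getNode(), each neighbors list holds the same objects that the dictionary holds, so walking from neighbor to neighbor gives the same result as looking nodes up in graph.nodes.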
