Which data structure to use when implementing a graph representation using an adjacency list

I have a very big graph, with about 1,000,000 nodes and many edges. I want to know which data structure is best suited for implementing an adjacency list. Here are the objects that I keep track of:

  • Edge list
  • Node-to-node connection list

I am coding in Python, so I used a set for the edge list (because according to this it has O(1) average insertion time) and a dictionary for the node-to-node connection list (making the objects properly hashable as described in "How to make an object properly hashable?"). Here is my code:

class node:
    def __init__(self, name = ""):
        self.__name = name

    def getName(self):
        return self.__name

    def __str__(self):
        return self.__name

    def __hash__(self):
        return hash(self.__name)

    def __lt__(self, other):
        if(type(self) != type(other)):
            return NotImplemented
        return self.__name.__lt__(other.__name)

    def __eq__(self, other):
        if(type(self)) != type(other):
            return NotImplemented
        return self.__name  == other.__name

class Edge:
    def __init__(self, name = "", node1 = None, node2 = None, weight = 0):
        self.__name = name
        self.__firstNode = node1
        self.__secondNode = node2
        self.__weight = weight

    def getName(self):
        return self.__name

    def getFirstNode(self):
        return self.__firstNode

    def getSecondNode(self):
        return self.__secondNode

    def getWeight(self):
        return self.__weight

    def __lt__(self, other):
        if(type(self) != type(other)):
            return NotImplemented
        # compare field by field (lexicographically) so the ordering is consistent
        return (self.__name, self.__firstNode, self.__secondNode, self.__weight) < (other.__name, other.__firstNode, other.__secondNode, other.__weight)

    def __eq__(self, other):
        if(type(self) != type(other)):
            return NotImplemented
        return self.__name == other.__name and self.__firstNode == other.__firstNode and self.__secondNode == other.__secondNode and self.__weight == other.__weight

    def __str__(self):
        return self.__name + " " + str(self.__firstNode) + " " + str(self.__secondNode) + " " + str(self.__weight)

    def __hash__(self):
        # hash the same fields that __eq__ compares, as a tuple
        return hash((self.__name, self.__firstNode, self.__secondNode, self.__weight))

class graph:
    def __init__(self):
        self.__nodeToNode = {}
        self.__edgeList = set()

    def addEdge(self, edge):
        if not isinstance(edge, Edge):
            return False

        self.__edgeList.add(edge)
        if(not edge.getFirstNode() in self.__nodeToNode):
            self.__nodeToNode[edge.getFirstNode()] = set()

        self.__nodeToNode[edge.getFirstNode()].add(edge.getSecondNode())
        if(not edge.getSecondNode() in self.__nodeToNode):
            self.__nodeToNode[edge.getSecondNode()] = set()

        # undirected: also link the second node back to the first
        self.__nodeToNode[edge.getSecondNode()].add(edge.getFirstNode())
        return True
    def getNodes(self):
        return dict(self.__nodeToNode)
    def getEdges(self):
        return set(self.__edgeList)


import string
import random
import time

grp = graph()
nodes = [None] * 20000
for i in range(20000):
    st = ''.join(random.SystemRandom().choice(string.ascii_letters) for i in range(10))
    node1 = node(st)
    nodes[i] = node1

current = time.time()
for i in range(3000000):
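    # pick endpoints from the whole node list (randint is inclusive at both ends)
    rdm = random.randint(0, len(nodes) - 1)
    rdm2 = random.randint(0, len(nodes) - 1)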
    st = ''.join(random.SystemRandom().choice(string.ascii_letters) for i in range(10))
    eg = Edge(st, nodes[rdm], nodes[rdm2])
    grp.addEdge(eg)

last = time.time()

print((last - current))

nodes = grp.getNodes()
edges = grp.getEdges()

But this code runs very slowly. Can I make it faster? If so, which data structure should I use?

Let me introduce a way to create an adjacency list:

Suppose you have input like this:

4 4
1 2
3 2
4 3
1 4

The first line contains two numbers, V and E (the number of vertices and the number of edges); each of the next E lines defines an edge between two vertices.

You can either create a .txt file and read the input from it, or type the input in directly and read it via sys.stdin.read():

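import sys

text = sys.stdin.read()  # named text to avoid shadowing the built-in input()
data = list(map(int, text.split()))
n, m = data[0:2]  # n vertices (V), m edges (E)
data = data[2:]
# pair up consecutive numbers into (a, b) edges
edges = list(zip(data[0:(2 * m):2], data[1:(2 * m):2]))
adj = [[] for _ in range(n)]
for (a, b) in edges:
    # vertices are 1-based in the input and 0-based in adj;
    # the graph is undirected, so store both directions
    adj[a - 1].append(b - 1)
    adj[b - 1].append(a - 1)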

Let's print the adjacency list adj:

>>> print(adj)
[[1, 3], [0, 2], [1, 3], [2, 0]]

adj[0] has two adjacent nodes: 1 and 3. Since the indices are 0-based, this means that node 1 has two adjacent nodes: 2 and 4.
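
The shift between 1-based node labels and 0-based list indices can be sketched with two small helpers. The names build_adjacency and neighbors_1_based are chosen here for illustration and are not part of the code above; the edge list is hard-coded from the example input instead of being read from stdin.

```python
def build_adjacency(n, edges):
    """Build an undirected adjacency list from 1-based (a, b) edge pairs."""
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a - 1].append(b - 1)
        adj[b - 1].append(a - 1)
    return adj


def neighbors_1_based(adj, node):
    """Return the neighbors of a 1-based node label, also as 1-based labels."""
    return [v + 1 for v in adj[node - 1]]


# Edges taken from the example input above.
edges = [(1, 2), (3, 2), (4, 3), (1, 4)]
adj = build_adjacency(4, edges)

print(adj)                        # [[1, 3], [0, 2], [1, 3], [2, 0]]
print(neighbors_1_based(adj, 1))  # [2, 4]
```

Keeping the conversion in one place means the rest of the program can work with either convention without scattering "- 1" and "+ 1" through the code.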

Now, if you want a directed, weighted graph, you just need to modify the input like this and read three numbers per edge:

4 4
1 2 3 # edge(1, 2) has the weight of 3
3 2 1
4 3 1
1 4 2

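import sys

text = sys.stdin.read()  # named text to avoid shadowing the built-in input()
data = list(map(int, text.split()))
n, m = data[0:2]
data = data[2:]
# group every three numbers into ((a, b), w)
edges = list(zip(zip(data[0:(3 * m):3], data[1:(3 * m):3]), data[2:(3 * m):3]))
adj = [[] for _ in range(n)]
cost = [[] for _ in range(n)]  # cost[u][i] is the weight of the edge u -> adj[u][i]
for ((a, b), w) in edges:
    adj[a - 1].append(b - 1)  # directed: only a -> b is stored
    cost[a - 1].append(w)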

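The weights are stored in cost, parallel to adj: cost[u][i] is the weight of the edge from u to adj[u][i]. For example, adj[0] = [1, 3] and cost[0] = [3, 2], so cost[0][0] = 3 is the weight of edge (1, 2) and cost[0][1] = 2 is the weight of edge (1, 4).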
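
Because adj and cost are parallel lists, the weight of an edge sits at the same position as its target in adj. A minimal sketch of such a lookup, assuming the lists produced by the example input (hard-coded here) and a helper name edge_weight chosen purely for illustration:

```python
def edge_weight(adj, cost, u, v):
    """Return the weight of the directed edge u -> v (0-based), or None if absent."""
    for neighbor, w in zip(adj[u], cost[u]):
        if neighbor == v:
            return w
    return None


# Adjacency and cost lists for the directed, weighted example input above.
adj = [[1, 3], [], [1], [2]]
cost = [[3, 2], [], [1], [1]]

print(edge_weight(adj, cost, 0, 1))  # 3, edge (1, 2)
print(edge_weight(adj, cost, 0, 3))  # 2, edge (1, 4)
print(edge_weight(adj, cost, 1, 0))  # None, the graph is directed
```

The lookup scans the neighbors of u, so it takes time proportional to the degree of u. If constant-time weight lookups matter, one alternative is to keep a dictionary per node that maps each neighbor to its weight.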

Hope this helped!
