Algorithm/approach to find a destination node exactly k edges from a source node in an undirected, cycle-free graph of a billion nodes

Consider an adjacency list for a billion nodes, stored in a hash table arranged as follows:

key = source node
value = hash_table { node1, node2, node3 }

The input values come from a text file of the form
from,to
1,2
1,5
1,11
... and so on

E.g. key = '1'
value = {'2','5','11'}
means node 1 is connected to nodes 2, 5 and 11.
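
A minimal sketch of how such a table might be built from the `from,to` lines. The function name `load_adjacency` is hypothetical, and storing each edge in both directions is an assumption based on the graph being undirected; the question itself only shows the source node as key.

```python
from collections import defaultdict

def load_adjacency(lines):
    """Build an adjacency map from 'from,to' text lines (header line excluded)."""
    adjacency = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        src, dst = (part.strip() for part in line.split(","))
        adjacency[src].add(dst)
        adjacency[dst].add(src)  # assumption: undirected, so store both directions
    return adjacency

edges = ["1,2", "1,5", "1,11"]
graph = load_adjacency(edges)
print(sorted(graph["1"], key=int))
```

For a real input file, the same function could be fed the file object directly (after skipping any header line), since it only iterates over lines.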

I want an algorithm or approach to find a destination node that is exactly k edges from a source node in an undirected, cycle-free graph of a billion nodes.

For example, from node 1 I want to find node 50, searching only to depth 3, i.e. at most 3 edges.

I assume the algorithm would find 1 - 2 - 60 - 50, which is the shortest path, but how can the traversal be made efficient with the adjacency list structure above? I do not want to use Hadoop/MapReduce.

I came up with the naive solution below in Python, but it is not efficient. The only advantage is that a hash table looks up a key in O(1), so I can look up the neighbours, and then their neighbours in turn, directly by key. The following algorithm takes a lot of time.

  1. Start with the source node.
  2. Use a hash table lookup to find its key.
  3. Go one level deeper using the hash-table entries of the neighbour nodes, checking their values for the destination node until it is found.
  4. Stop if the node is not found by depth k.
        1
        |
    {2 5 11}
     |  |  |
{3,6,7} {nodes} {nodes} .... connected nodes
   |       |       |
{nodes} {nodes} {nodes} .... million more connected nodes.
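
The numbered steps above can be sketched as a depth-limited, level-by-level search. This is only an illustration, not the original poster's code: the function name `within_k_edges` and the small sample graph are assumptions made for the example. It returns the depth at which the target is first reached, so a caller that needs exactly k edges can compare the result with k.

```python
def within_k_edges(adjacency, source, target, k):
    """Return the depth (<= k) at which target is first reached, else None."""
    if source == target:
        return 0
    visited = {source}   # avoids walking back along an undirected edge
    frontier = [source]  # nodes at the current depth
    for depth in range(1, k + 1):
        next_frontier = []
        for node in frontier:
            for neighbor in adjacency.get(node, ()):
                if neighbor == target:
                    return depth
                if neighbor not in visited:
                    visited.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
        if not frontier:
            break  # nothing left to expand before reaching depth k
    return None

# Hypothetical sample graph containing the path 1 - 2 - 60 - 50.
sample = {
    "1": {"2", "5", "11"}, "2": {"1", "60"}, "5": {"1"},
    "11": {"1"}, "60": {"2", "50"}, "50": {"60"},
}
print(within_k_edges(sample, "1", "50", 3))
print(within_k_edges(sample, "1", "50", 2))
```

Each node is looked up at most once, so the work is bounded by the number of nodes within k edges of the source rather than by the size of the whole graph.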


Please suggest an approach. The algorithm above, implemented similarly to BFS, takes more than 3 hours to search all the possible key-value relationships. Can this be reduced with another search method?

As you've hinted, this will depend a lot on the data access characteristics of your system. If you were restricted to single-element accesses, then you'd be truly stuck, as trincot observes. However, if you can manage block accesses, then you have a chance of parallel operations.

However, I think that would be outside your control: the hash function owns the adjacency characteristics and, in fact, will likely "pessimize" (the opposite of "optimize") that characteristic.

I do see one possible hope: use iteration instead of recursion, maintaining a list of nodes to visit. When you place a new node on the list, get its hash value. If you can organize the nodes clustered by location, you can perhaps do a block transfer, accessing several values in one read operation.
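
A rough sketch of that idea, under several assumptions that do not come from the question: node ids are integers, storage is split into blocks of `BLOCK_SIZE` consecutive ids, and a hypothetical `read_block(block_id)` returns every adjacency entry in one block with a single read. The block store here is simulated in memory purely to show the grouping.

```python
from collections import defaultdict

BLOCK_SIZE = 4  # hypothetical: consecutive node ids stored per block

def make_block_reader(adjacency, block_size=BLOCK_SIZE):
    """Simulate block storage: one call returns all nodes in a block."""
    blocks = defaultdict(dict)
    for node, neighbors in adjacency.items():
        blocks[node // block_size][node] = neighbors
    reads = []  # log of fetched block ids, for illustration only

    def read_block(block_id):
        reads.append(block_id)
        return blocks.get(block_id, {})

    return read_block, reads

def batched_search(read_block, source, target, k, block_size=BLOCK_SIZE):
    """Iterative level-by-level search, one block read per block per level."""
    if source == target:
        return 0
    visited = {source}
    frontier = [source]
    for depth in range(1, k + 1):
        by_block = defaultdict(list)  # group the to-visit list by location
        for node in frontier:
            by_block[node // block_size].append(node)
        next_frontier = []
        for block_id in sorted(by_block):
            block = read_block(block_id)  # one read serves several nodes
            for node in by_block[block_id]:
                for neighbor in block.get(node, ()):
                    if neighbor == target:
                        return depth
                    if neighbor not in visited:
                        visited.add(neighbor)
                        next_frontier.append(neighbor)
        frontier = next_frontier
        if not frontier:
            break
    return None

int_graph = {1: {2, 5, 11}, 2: {1, 60}, 5: {1}, 11: {1}, 60: {2, 50}, 50: {60}}
read_block, reads = make_block_reader(int_graph)
print(batched_search(read_block, 1, 50, 3), reads)
```

Whether this helps in practice depends on whether neighbouring nodes actually end up in the same block; as noted above, a general-purpose hash function tends to scatter them.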
