Library for working with (potentially infinite) graphs defined by neighbor-list functions

Here's a pattern I've used countless times across a variety of programming languages:

  1. Encounter a problem which can be easily reduced to some graph algorithm.
  2. Define an adjacency function: outEdges :: MyNode -> [MyNode].
  3. Code up some general form of said graph algorithm which takes this function as its first argument.

As an example, consider this (purposefully inefficient) method for computing the edit distance between two words. We will count the least number of insertions and deletions necessary to transform one word into another via breadth-first search.

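import Data.Hashable (Hashable)  -- from the hashable package; needed for the constraint below
import Data.List (inits, tails)
import Data.Maybe (fromJust)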

alphabet :: String
alphabet = ['a'..'z']

wordNeighbors :: String -> [String]
wordNeighbors word = deletions ++ insertions where
    insertions = [pre++[c]++suf | (pre,suf) <- splits, c <- alphabet]
    deletions =  [pre++suf      | (pre,_:suf) <- take (length word) splits]

    splits = zip (inits word) (tails word)

shortestDistance :: (Eq a,Hashable a)=> (a -> [a]) -> a -> a -> Maybe Int
shortestDistance edgeFunc source target =
    -- 8 lines of code where I do a breadth-first traversal,
    -- using a HashSet to track previously visited nodes;
    -- yawn...

editDistance :: String -> String -> Int
editDistance a b = fromJust $ shortestDistance wordNeighbors a b

main = print $ editDistance "cat" "can"  -- prints 2

The problem is, I'm getting awfully tired of step 3 (see shortestDistance above).
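
For concreteness, the traversal elided in shortestDistance could look roughly like the sketch below. It is only an illustration, not the exact code I keep rewriting: it swaps the HashSet for Data.Set (so the constraint becomes Ord a instead of Hashable a) in order to depend only on the containers package, and the helper names go and visit are made up for this example.

```haskell
import Data.List (foldl')
import qualified Data.Set as Set

-- Breadth-first search over an implicit graph given by an adjacency function.
-- Returns the number of edges on a shortest path from source to target, or
-- Nothing if the reachable part of the graph is finite and lacks the target.
-- Assumption: Ord a (Data.Set) is used here in place of Hashable a (HashSet).
shortestDistance :: Ord a => (a -> [a]) -> a -> a -> Maybe Int
shortestDistance edgeFunc source target = go 0 (Set.singleton source) [source]
  where
    -- 'frontier' holds the nodes first reached at exactly 'dist' steps.
    go _ _ [] = Nothing
    go dist visited frontier
      | target `elem` frontier = Just dist
      | otherwise              = go (dist + 1) visited' (reverse next)
      where
        (visited', next) =
          foldl' visit (visited, []) (concatMap edgeFunc frontier)
        -- Keep a neighbor only the first time it is seen.
        visit (seen, acc) n
          | n `Set.member` seen = (seen, acc)
          | otherwise           = (Set.insert n seen, n : acc)
```

Because only one layer is expanded at a time, this works on infinite graphs as long as the target is reachable. With wordNeighbors from above, shortestDistance wordNeighbors "cat" "can" evaluates to Just 2.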

I feel like I've written the same algorithms hundreds of times. I'd love it if I could instead just use FGL or Data.Graph and be done with it, but as far as I can tell both ultimately require the construction of some sort of Graph data structure which is strict with respect to the set of all nodes. This is an issue because in many problems the graph is infinite (such as in the example above).

I specifically ask about Haskell because Haskell has such a strong focus on combinators that I somehow expected many of these algorithms to already exist somewhere.


Addendum: Here are other examples of functions I frequently write besides shortest-path:

-- Useful for organizing the computation of a recursively-defined
-- property of the nodes in an acyclic graph, such as nimbers.
dfsPostOrder :: (v -> [v]) -> v -> [v]
dfsPostOrder adjFunc root = ...

-- Find all nodes connected in some manner to the root node.
-- In case I know the components are finite size, but am not sure
-- of a nice way to express their contents.
-- (Note: The API below is only good for undirected graphs)
getComponent :: Ord v => (v -> [v]) -> v -> Set v
getComponent adjFunc root = ...

-- Lazily organize the graph into groups by their minimum distance
-- to any of the nodes in @roots@.
-- One could use this to help incrementalize parts of e.g. a Game
-- of Life or Kinetic Monte Carlo simulation by locating regions
-- invalidated by changes in the state.
groupsByProximity :: Ord v => (v -> [v]) -> Set v -> [Set v]
groupsByProximity adjFunc roots = ...
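
To make the elided bodies concrete, here is one possible sketch of the three functions. It is illustrative only: every function carries an Ord v constraint (an assumption, needed for Data.Set), dfsPostOrder tracks visited nodes so that shared descendants are emitted once, and the helper names visit and go are invented for this example.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set

-- Depth-first post-order: a node is emitted only after everything reachable
-- through its out-edges has been emitted. Needs a finite reachable set.
dfsPostOrder :: Ord v => (v -> [v]) -> v -> [v]
dfsPostOrder adjFunc root = reverse (snd (visit (Set.empty, []) root))
  where
    -- 'out' is accumulated in reverse, most recently finished node first.
    visit (seen, out) v
      | v `Set.member` seen = (seen, out)
      | otherwise =
          let (seen', out') = foldl visit (Set.insert v seen, out) (adjFunc v)
          in  (seen', v : out')

-- Lazily yield layers of nodes by minimum distance to any root:
-- layer 0 is the roots, layer 1 their unseen neighbors, and so on.
groupsByProximity :: Ord v => (v -> [v]) -> Set v -> [Set v]
groupsByProximity adjFunc roots = go roots roots
  where
    go seen layer
      | Set.null layer = []
      | otherwise      = layer : go (Set.union seen next) next
      where
        next = Set.fromList (concatMap adjFunc (Set.toList layer))
                 `Set.difference` seen

-- Everything reachable from the root; terminates only for finite components.
getComponent :: Ord v => (v -> [v]) -> v -> Set v
getComponent adjFunc root =
  Set.unions (groupsByProximity adjFunc (Set.singleton root))
```

Since groupsByProximity builds its result one layer at a time, something like take 3 (groupsByProximity adjFunc roots) is fine on an infinite graph, whereas dfsPostOrder and getComponent only finish when the reachable part is finite.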

TL;DR: Is there any general way to write algorithms that work on potentially infinite, potentially cyclic, directed graphs, such as one defined by an adjacency function (Node -> [Node] or Node -> [(Node, Weight)])?

I think most "breadth-first" search algorithms are really some sort of "best-first" algorithm. That is, the search frontier is placed in a priority queue which determines the order in which the nodes are visited.

I found two packages which implement general best-first algorithms:

Both of these modules have very generic interfaces, i.e. you supply a neighbor function, an inter-node distance function and (in the case of A-star) a heuristic function.

With the appropriate choice of heuristic and distance functions you might be able to map your search into one of these algorithms. For instance, this patent describes a way of employing A-star to solve the edit distance problem.
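
As a rough illustration of that kind of interface, here is a hand-rolled A-star sketch taking the same three functions. It is not the API of either package and not the method from the patent; the names aStar, relax, bestKnown and editDistanceAStar are hypothetical, and a Data.Set of (priority, cost, node) triples stands in for the priority queue. The heuristic used for edit distance, the difference in word lengths, is my own choice: every insertion or deletion changes the length by exactly one, so it never overestimates.

```haskell
import Data.List (inits, tails)
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Best-first search: always expand the frontier entry with the smallest
-- (cost so far + heuristic). A heuristic of (const 0) gives Dijkstra.
aStar :: Ord v
      => (v -> [v])       -- neighbor function
      -> (v -> v -> Int)  -- distance between adjacent nodes (non-negative)
      -> (v -> Int)       -- heuristic: lower bound on the remaining cost
      -> v -> v -> Maybe Int
aStar neighbors dist heuristic source target =
    go (Set.singleton (heuristic source, 0, source)) (Map.singleton source 0)
  where
    go frontier best = case Set.minView frontier of
      Nothing -> Nothing
      Just ((_, cost, node), rest)
        | node == target             -> Just cost
        | cost > bestKnown node best -> go rest best  -- stale queue entry
        | otherwise ->
            uncurry go (foldl (relax node cost) (rest, best) (neighbors node))

    bestKnown n = Map.findWithDefault maxBound n

    -- Queue a neighbor only if this path to it is cheaper than any seen so far.
    relax node cost (f, b) n
      | newCost < bestKnown n b =
          (Set.insert (newCost + heuristic n, newCost, n) f, Map.insert n newCost b)
      | otherwise = (f, b)
      where newCost = cost + dist node n

-- Word graph from the question: single-letter insertions and deletions.
wordNeighbors :: String -> [String]
wordNeighbors word = deletions ++ insertions
  where
    insertions = [pre ++ [c] ++ suf | (pre, suf) <- splits, c <- ['a' .. 'z']]
    deletions  = [pre ++ suf | (pre, _ : suf) <- splits]
    splits     = zip (inits word) (tails word)

-- Every edge costs 1; the heuristic is the difference in word lengths.
editDistanceAStar :: String -> String -> Maybe Int
editDistanceAStar a b = aStar wordNeighbors (\_ _ -> 1) h a b
  where h w = abs (length w - length b)
```

The length-difference heuristic is weak, so this is still far slower than the usual dynamic-programming edit distance; it is only meant to show how the three functions plug together.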
