What is the algorithm for generating a random Deterministic Finite Automaton?

The DFA must have the following four properties:

  • The DFA has N nodes.

  • Each node has 2 outgoing transitions.

  • Each node is reachable from every other node.

  • The DFA is chosen with perfectly uniform randomness from all possibilities.

This is what I have so far:

  1. Start with a collection of N nodes.
  2. Choose a node that has not already been chosen.
  3. Connect its output to 2 other randomly selected nodes.
  4. Label one transition 1 and the other transition 0.
  5. Go to 2, unless all nodes have been chosen.
  6. Determine if there is a node with no incoming connections.
  7. If so, steal an incoming connection from a node with more than 1 incoming connection.
  8. Go to 6, unless there are no nodes with no incoming connections.

However, this algorithm is not correct. Consider the graph where node 1 has its two connections going to node 2 (and vice versa), while node 3 has its two connections going to node 4 (and vice versa). That is something like:

1 <==> 2

3 <==> 4

Where, by <==>, I mean two outgoing connections both ways (so a total of 4 connections). This forms 2 separate cliques, which means that not every state is reachable from every other state.
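A quick way to confirm that this counterexample fails the reachability requirement is to search forwards and backwards from one state. The sketch below is only illustrative: the list-of-pairs representation and the name is_strongly_connected are assumptions of mine, and states are numbered from 0 rather than 1.

```python
from collections import deque


def is_strongly_connected(delta):
    """delta[q] = (target on 0, target on 1); states are 0..len(delta)-1."""
    n = len(delta)
    if n == 0:
        return True
    # Reverse adjacency: for each state, the states that point at it.
    reverse = [[] for _ in range(n)]
    for q, targets in enumerate(delta):
        for t in targets:
            reverse[t].append(q)

    def reaches_all(neighbours):
        # Breadth-first search from state 0 over the given adjacency.
        seen = {0}
        queue = deque([0])
        while queue:
            q = queue.popleft()
            for t in neighbours[q]:
                if t not in seen:
                    seen.add(t)
                    queue.append(t)
        return len(seen) == n

    # Strongly connected iff state 0 reaches everything and everything reaches state 0.
    return reaches_all(delta) and reaches_all(reverse)


# The counterexample above, renumbered 0..3: 0 <==> 1 and 2 <==> 3.
counterexample = [(1, 1), (0, 0), (3, 3), (2, 2)]
# A 4-cycle with one extra arc per state, for contrast.
cycle = [(1, 2), (2, 3), (3, 0), (0, 1)]
```

Every state in the counterexample has an incoming connection, so steps 6 to 8 would leave it unchanged, yet the check still rejects it.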

Does anyone know how to complete the algorithm? Or, does anyone know another algorithm? I seem to vaguely recall that a binary tree can be used to construct this, but I am not sure about that.

Strong connectivity is a difficult constraint. Let's generate uniform random surjective transition functions and then test them with, e.g., Tarjan's linear-time SCC algorithm until we get one that's strongly connected. This process has the right distribution, but it's not clear that it's efficient; my researcher's intuition is that the limiting probability of strong connectivity is less than 1 but greater than 0, which would imply that only O(1) iterations are necessary in expectation.

Generating surjective transition functions is itself nontrivial. Unfortunately, without that constraint it is exponentially unlikely that every state has an incoming transition. Use the algorithm described in the answers to this question to sample a uniform random partition of {(1, a), (1, b), (2, a), (2, b), …, (N, a), (N, b)} with N parts. Permute the nodes randomly and assign them to parts.

For example, let N = 3 and suppose that the random partition is

{{(1, a), (2, a), (3, b)}, {(2, b)}, {(1, b), (3, a)}}.

We choose a random permutation 2, 3, 1, assign the first part to node 2, the second to node 3 and the third to node 1, and derive a transition function

(1, a) |-> 2
(1, b) |-> 1
(2, a) |-> 2
(2, b) |-> 3
(3, a) |-> 1
(3, b) |-> 2
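The partition-then-permute step can be sketched as below. The linked answers are not reproduced here, so this uses the standard Stirling-number recurrence S(n, k) = k S(n-1, k) + S(n-1, k-1) to sample the partition, which may or may not be the method those answers describe. The function names and the 0-based state numbering are assumptions for illustration.

```python
import random


def stirling_table(n, k):
    """S[i][j] = number of partitions of an i-element set into j non-empty blocks."""
    S = [[0] * (k + 1) for _ in range(n + 1)]
    S[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, min(i, k) + 1):
            S[i][j] = j * S[i - 1][j] + S[i - 1][j - 1]
    return S


def random_partition(items, k, rng):
    """Uniform random partition of items into exactly k non-empty blocks (1 <= k <= len(items))."""
    n = len(items)
    S = stirling_table(n, k)
    decisions = []  # decision for item n first, item 1 last
    blocks_left = k
    for m in range(n, 0, -1):
        # Item m is a singleton block in S[m-1][blocks_left-1] of the S[m][blocks_left] cases.
        if rng.randrange(S[m][blocks_left]) < S[m - 1][blocks_left - 1]:
            decisions.append(None)
            blocks_left -= 1
        else:
            # Otherwise it joins one of the blocks formed by the earlier items.
            decisions.append(rng.randrange(blocks_left))
    blocks = []
    for item, d in zip(items, reversed(decisions)):
        if d is None:
            blocks.append([item])
        else:
            blocks[d].append(item)
    return blocks


def random_surjective_transitions(n_states, rng=random):
    """Map each (state, symbol) pair to a target so that every state is hit at least once."""
    items = [(q, s) for q in range(n_states) for s in "ab"]
    blocks = random_partition(items, n_states, rng)
    targets = list(range(n_states))
    rng.shuffle(targets)  # random assignment of nodes to parts
    return {item: t for block, t in zip(blocks, targets) for item in block}
```

The table holds exact big integers, so this is only practical for moderate N; the result still has to pass the strong-connectivity test before it is accepted.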

There is an algorithm with expected running time O(n^{3/2}).

If you generate a uniform random digraph with m vertices such that each vertex has k labelled out-arcs (a k-out digraph), then with high probability the largest SCC (strongly connected component) in this digraph is of size around c_k m, where c_k is a constant depending on k. Actually, there is about 1/\sqrt{m} probability that the size of this SCC is exactly c_k m (rounded to an integer).

So you can generate a uniform random 2-out digraph of size n/c_k, and check the size of the largest SCC. If its size is not exactly n, just try again until success. The expected number of trials needed is about \sqrt{n}, and each digraph can be generated in O(n) time. So in total the algorithm has expected running time O(n^{3/2}). See this paper for more details.
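A rough sketch of that retry loop follows. Several things here are assumptions rather than part of the answer: the constant is passed in as a parameter c because its value is not given above (the 0.8 in the usage line is a placeholder, not a claimed value), the component is additionally required to be closed so that no transition leaves it, and the component search favors brevity over the linear-time SCC computation the running-time bound relies on.

```python
import random
from collections import deque


def reach(start, neighbours):
    """Set of vertices reachable from start, following neighbours[v]."""
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in neighbours[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def closed_scc_of_size(delta, n):
    """Sorted vertices of a strongly connected, closed component with exactly n vertices, or None."""
    m = len(delta)
    reverse = [[] for _ in range(m)]
    for v, targets in enumerate(delta):
        for w in targets:
            reverse[w].append(v)
    for v in range(m):
        forward = reach(v, delta)
        # If everything v reaches can also reach v, the forward set is a closed SCC.
        if len(forward) == n and forward <= reach(v, reverse):
            return sorted(forward)
    return None


def random_strongly_connected(n, c, rng=random, max_trials=100_000):
    """Retry random 2-out digraphs on about n / c vertices until a size-n component appears."""
    m = max(n, round(n / c))
    for _ in range(max_trials):
        delta = [(rng.randrange(m), rng.randrange(m)) for _ in range(m)]
        component = closed_scc_of_size(delta, n)
        if component is not None:
            # Renumber the component's vertices 0..n-1, keeping their relative order.
            index = {v: i for i, v in enumerate(component)}
            return [(index[delta[v][0]], index[delta[v][1]]) for v in component]
    raise RuntimeError("no component of the requested size found")


dfa = random_strongly_connected(5, 0.8, random.Random(0))
```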

In what follows I'll use the basic terminology of graph theory.

You could:

  1. Start with a directed graph with N vertices and no arcs.
  2. Generate a random permutation of the N vertices to produce a random Hamiltonian cycle, and add it to the graph.
  3. For each vertex add one outgoing arc to a randomly chosen vertex.

The result will satisfy the first three requirements.
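The three steps above can be sketched as follows. The function name is hypothetical, states are numbered from 0, and the coin flip that decides which arc is labelled 0 and which is labelled 1 is my own assumption, since the steps do not say how the labels are assigned.

```python
import random


def hamiltonian_cycle_dfa(n, rng=random):
    """Return (delta, order): delta[v] = (target on 0, target on 1), order = the random cycle."""
    order = list(range(n))
    rng.shuffle(order)  # step 2: a random permutation defines the cycle
    delta = [None] * n
    for i, v in enumerate(order):
        successor = order[(i + 1) % n]  # arc along the Hamiltonian cycle
        extra = rng.randrange(n)        # step 3: one more arc to a random vertex
        # Assumed labelling: flip a coin for which arc is the 0-transition.
        delta[v] = (successor, extra) if rng.random() < 0.5 else (extra, successor)
    return delta, order
```

Strong connectivity comes for free from the cycle, but only automata that contain a Hamiltonian cycle can ever be produced, so the output is not uniform over all strongly connected DFAs.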

The following references seem to be relevant to your question:

F. Bassino, J. David and C. Nicaud, Enumeration and random generation of possibly incomplete deterministic automata, Pure Mathematics and Applications 19 (2-3) (2009) 1-16.

F. Bassino and C. Nicaud, Enumeration and random generation of accessible automata, Theor. Comp. Sc. 381 (2007) 86-104.

Just keep growing a set of nodes which are all reachable. Once they're all reachable, fill in the blanks.

Start with a set of N nodes called A.
Choose a node from A and put it in set B.
While there are nodes left in set A
    Choose a node x from set A
    Choose a node y from set B with less than two outgoing transitions.
    Choose a node z from set B
    Add a transition from y to x.
    Add a transition from x to z
    Move x to set B
For each node n in B
    While n has less than two outgoing transitions
         Choose a node m in B
         Add a transition from n to m
Choose a node to be the start node.
Choose some number of nodes to be accepting nodes.

Every node in set B can reach every node in set B. As long as a node can be reached from a node in set B and that node can reach a node in set B, it can be added to the set.
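For anyone who wants to try it, here is one way the pseudocode might look in Python. The function name and the list-of-pairs output are assumptions, the choice of start and accepting nodes is left out, and the order in which transitions are added decides which one ends up as the 0-transition. The sketch makes no attempt to check whether the resulting distribution is uniform.

```python
import random


def grow_reachable_dfa(n, rng=random):
    """Grow a strongly connected set B one node at a time, then fill in missing transitions."""
    out = [[] for _ in range(n)]      # outgoing transitions per node
    set_a = list(range(n))
    rng.shuffle(set_a)
    set_b = [set_a.pop()]             # seed B with one node
    while set_a:
        x = set_a.pop()
        # A node in B with a free outgoing slot always exists:
        # B holds 2 * (len(B) - 1) transitions but has room for 2 * len(B).
        y = rng.choice([v for v in set_b if len(out[v]) < 2])
        z = rng.choice(set_b)
        out[y].append(x)              # B can now reach x
        out[x].append(z)              # x can now reach B
        set_b.append(x)
    for v in set_b:
        while len(out[v]) < 2:        # fill in the blanks
            out[v].append(rng.choice(set_b))
    return [tuple(targets) for targets in out]
```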

The simplest way that I can think of is to (uniformly) generate a random DFA with N nodes and two outgoing edges per node, ignoring the other constraints, and then throw away any that are not strongly connected (which is easy to test using a strongly connected components algorithm). Generating uniform DFAs should be straightforward without the reachability constraint. The one thing that could be problematic performance-wise is how many DFAs you would need to skip before you find one with the reachability property. You should try this algorithm first, though, and see how long it ends up taking to generate an acceptable DFA.
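A minimal sketch of this generate-and-discard loop, which also reports how many candidates were tried so you can measure the cost for your N. The names are hypothetical, states are numbered from 0, and a two-way graph search stands in for a full strongly connected components algorithm. Expect the number of tries to grow quickly with N.

```python
import random


def random_dfa(n, rng):
    """Uniform random transition table: delta[q] = (target on 0, target on 1)."""
    return [(rng.randrange(n), rng.randrange(n)) for _ in range(n)]


def strongly_connected(delta):
    """True if state 0 reaches every state and every state reaches state 0."""
    n = len(delta)
    reverse = [[] for _ in range(n)]
    for q, targets in enumerate(delta):
        for t in targets:
            reverse[t].append(q)
    for adjacency in (delta, reverse):
        seen, stack = {0}, [0]
        while stack:
            q = stack.pop()
            for t in adjacency[q]:
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        if len(seen) != n:
            return False
    return True


def sample_strongly_connected(n, rng=random):
    """Rejection sampling: returns (delta, number of candidates generated)."""
    tries = 0
    while True:
        tries += 1
        delta = random_dfa(n, rng)
        if strongly_connected(delta):
            return delta, tries


dfa, tries = sample_strongly_connected(5, random.Random(42))
```

Because every transition table is equally likely and acceptance depends only on the table itself, the accepted DFA is uniform over the strongly connected ones.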

We can start with a random number of states N1 between N and 2N.

Take state number 1 as the initial state. For each state and each character in the input alphabet, generate a random transition (to a state between 1 and N1).

Then take the connected part of the automaton reachable from the initial state. Check its number of states; after a few tries we get one with N states.

If we also want a minimal automaton, only the assignment of final states remains; there is a good chance that a random assignment yields a minimal automaton as well.
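As a sketch of this procedure (function names are hypothetical, states are numbered from 0 so the initial state is 0, and final states are not assigned): note that it guarantees every state is reachable from the initial state, which is weaker than every state being reachable from every other state.

```python
import random


def accessible_part(delta, start=0):
    """States reachable from start, in breadth-first discovery order."""
    order = [start]
    seen = {start}
    for v in order:  # the list grows while we iterate over it
        for w in delta[v]:
            if w not in seen:
                seen.add(w)
                order.append(w)
    return order


def random_accessible_dfa(n, alphabet_size=2, rng=random):
    """Retry until the part reachable from the initial state has exactly n states."""
    while True:
        n1 = rng.randint(n, 2 * n)  # random number of states between N and 2N
        delta = [tuple(rng.randrange(n1) for _ in range(alphabet_size))
                 for _ in range(n1)]
        order = accessible_part(delta)
        if len(order) == n:
            # Renumber the reachable states 0..n-1 in discovery order.
            index = {v: i for i, v in enumerate(order)}
            return [tuple(index[w] for w in delta[v]) for v in order]
```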
