How to implement Markov's algorithm with variables and markers?

Question

I've been trying to implement Markov's algorithm, but I've only had partial success. The algorithm is fairly simple and can be found here.

However, my project has an added difficulty: I have to use rules that include markers and variables.

A variable represents any letter in the alphabet, and a marker is simply a character used as a reference point for moving the variables around (it has no value of its own).

This example duplicates every character in a string:

Alphabet: {a,b,c}

Markers: {M}

Variables: {x}

Rule 1: Mx -> xxM

Rule 2: xM -> x

Rule 3: x -> Mx

Input: abc

abc //We apply rule 3

Mabc //We apply rule 1

aaMbc //We apply rule 1

aabbMc //We apply rule 1

aabbccM //We apply rule 2

aabbcc

This is my recursive function that implements a Markov algorithm, but it only works with literal string rules, for example: Rule 1: "apple" -> "orange", Input: "apple".

public static String markov(String input, LinkedList<Rule> rules) {
    for (Rule rule : rules) {
        if (!input.equals(input.replace(rule.getFrom(), rule.getTo()))) { //If the rule matches a substring
            if (rule.isTerminating()) { //If the rule is terminating
                input = input.replaceFirst(Pattern.quote(rule.getFrom()), rule.getTo()); //Replace the first instance
                System.out.println(input);
                return input; //return and end the cycle
            } else {
                input = input.replaceFirst(Pattern.quote(rule.getFrom()), rule.getTo());
                System.out.println(input);
                return markov(input, rules); //Start looking again for matching rules
            }
        }
    }
    return input;
}

I can't figure out how to fit variables and markers into this logic. Could someone explain the best way to implement this? Any advice is welcome.

If the question doesn't comply with SO guidelines, please let me know why in the comments so I don't repeat the mistake.

Thank you!

GitHub

Answer 1

I think the easiest way to do this is using Java regular expressions. Once you get your head around those, the following rules should work for your example:

Rule 1: "M([a-c])" -> "$1$1M"
Rule 2: "([a-c])M" -> "$1" (terminating)
Rule 3: "([a-c])"  -> "M$1"
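To illustrate how the question's notation maps onto these regex rules, here is a minimal sketch of a translator. The class `VariableRules` and its methods `toRegex` and `toReplacement` are hypothetical names invented for this example, and it assumes a single variable symbol and an alphabet made only of letters and digits:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the question's code.
final class VariableRules {

    // Turns a left-hand side such as "Mx" into a regex such as "\QM\E([abc])".
    // The first occurrence of the variable becomes capture group 1; any later
    // occurrence becomes the backreference \1 (it must match the same letter).
    static String toRegex(String from, char variable, String alphabet) {
        for (char c : alphabet.toCharArray()) {
            if (!Character.isLetterOrDigit(c)) {
                // Assumption of this sketch: keeps the character class trivially safe.
                throw new IllegalArgumentException("alphabet must be letters or digits");
            }
        }
        StringBuilder regex = new StringBuilder();
        boolean seen = false;
        for (char c : from.toCharArray()) {
            if (c == variable) {
                regex.append(seen ? "\\1" : "([" + alphabet + "])");
                seen = true;
            } else {
                // Markers and other symbols are matched literally.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return regex.toString();
    }

    // Turns a right-hand side such as "xxM" into a replacement such as "$1$1M".
    static String toReplacement(String to, char variable) {
        StringBuilder replacement = new StringBuilder();
        for (char c : to.toCharArray()) {
            replacement.append(c == variable
                    ? "$1"
                    : Matcher.quoteReplacement(String.valueOf(c)));
        }
        return replacement.toString();
    }

    public static void main(String[] args) {
        String from = toRegex("Mx", 'x', "abc");
        String to = toReplacement("xxM", 'x');
        System.out.println("Mabc".replaceFirst(from, to)); // aaMbc
    }
}
```

The variable symbol itself (here `x`) cannot also be a letter of the alphabet in this sketch, since every `x` in a rule is treated as the variable.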

Note that you need a couple of tweaks to your current method to make this work.

replace takes a literal string as its first parameter, whereas replaceFirst takes a regex, so:

replace: if (!input.equals(input.replace(rule.getFrom(), rule.getTo()))) {
with:    if (!input.equals(input.replaceFirst(rule.getFrom(), rule.getTo()))) {

You are quoting the rule.getFrom() string, which will not work with regular expressions, so:

replace: input = input.replaceFirst(Pattern.quote(rule.getFrom()), rule.getTo());
with:    input = input.replaceFirst(rule.getFrom(), rule.getTo());

At that point the code calls replaceFirst twice, which is a bit of duplication, so you could store the result in a temp variable the first time and reuse it:

String next = input.replaceFirst(rule.getFrom(), rule.getTo());
if (!input.equals(next)) {
  ...
  input = next;
  ...
}
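Putting these tweaks together, the whole method might look like the sketch below. The nested `Rule` class is a hypothetical stand-in for the question's own `Rule` (whose source is not shown), and the recursion is swapped for a loop here so that long inputs do not deepen the call stack; neither choice comes from the original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class MarkovRegex {

    // Hypothetical stand-in for the question's Rule class; 'from' is a regex.
    static final class Rule {
        final Pattern from;
        final String to;
        final boolean terminating;

        Rule(String from, String to, boolean terminating) {
            this.from = Pattern.compile(from); // compile once, reuse on every pass
            this.to = to;
            this.terminating = terminating;
        }
    }

    static String markov(String input, List<Rule> rules) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Rule rule : rules) {
                // Compute the rewrite once and reuse it (the temp variable idea above).
                String next = rule.from.matcher(input).replaceFirst(rule.to);
                if (!next.equals(input)) {
                    input = next;
                    if (rule.terminating) {
                        return input; // a terminating rule ends the algorithm
                    }
                    changed = true;
                    break; // start again from the first rule
                }
            }
        }
        return input; // no rule changed the string
    }

    public static void main(String[] args) {
        List<Rule> rules = Arrays.asList(
                new Rule("M([a-c])", "$1$1M", false),
                new Rule("([a-c])M", "$1", true),
                new Rule("([a-c])", "M$1", false));
        System.out.println(markov("abc", rules)); // aabbcc
    }
}
```

As in the original method, a rule counts as applicable only if it actually changes the string, so a rule that rewrites a match to itself is skipped rather than looping forever.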

As you are currently quoting the entire rule.getFrom() string, I'm guessing you have had problems with regular expression special characters before. If so, you'll need to address them individually when creating the rules. I really don't want to get into regular expressions here, as it is a huge area and completely separate from the Markov algorithm, so if you are having problems with these then please do some research online (e.g. Regular Expressions and Capturing Groups), or ask a separate question here focusing on the specific regular expression problem.
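One way to handle special characters at rule-creation time, sketched here as an assumption rather than something the original code does, is to quote only the rules meant to be literal and leave the variable rules as raw regexes. `Pattern.quote` and `Matcher.quoteReplacement` are standard `java.util.regex` methods; the `RuleQuoting` class and its method names are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RuleQuoting {

    // Literal rule: neither side is interpreted, so '.', '$' and '\' are safe.
    static String applyLiteral(String input, String from, String to) {
        return input.replaceFirst(Pattern.quote(from), Matcher.quoteReplacement(to));
    }

    // Regex rule: 'from' is a pattern and 'to' may refer to groups such as $1.
    static String applyRegex(String input, String from, String to) {
        return input.replaceFirst(from, to);
    }

    public static void main(String[] args) {
        // Unquoted, the '.' in "a.b" matches any character, so "axb" is hit first.
        System.out.println(applyRegex("axb a.b", "a.b", "X"));    // X a.b
        // Quoted, only the literal text "a.b" matches, and "$1" is inserted as-is.
        System.out.println(applyLiteral("axb a.b", "a.b", "$1")); // axb $1
    }
}
```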

Note that you can still combine these with the normal rules. After changing the marker character from M to # (so that M can be used in the alphabet), these rules:

"A"             -> "apple"
"B"             -> "bag"
"S"             -> "shop"
"T"             -> "the"
"the shop"      -> "my brother"
"#([a-zA-Z .])" -> "$1$1#"
"([a-zA-Z .])#" -> "$1" (terminating)
"([a-zA-Z .])"  -> "#$1"

Would convert:

from: I bought a B of As from T S.
to:   II  bboouugghhtt  aa  bbaagg  ooff  aapppplleess  ffrroomm  mmyy  bbrrootthheerr..
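The conversion above can be reproduced with a small self-contained sketch. The `CombinedRulesDemo` class and its rule-table layout (a string array per rule, with "T" flagging a terminating rule) are illustrative choices for this example only, not the question's `Rule` class.

```java
final class CombinedRulesDemo {

    // Each row: {regex to find, replacement, "T" if the rule is terminating}.
    static final String[][] RULES = {
        {"A", "apple", ""},
        {"B", "bag", ""},
        {"S", "shop", ""},
        {"T", "the", ""},
        {"the shop", "my brother", ""},
        {"#([a-zA-Z .])", "$1$1#", ""},
        {"([a-zA-Z .])#", "$1", "T"},
        {"([a-zA-Z .])", "#$1", ""},
    };

    static String run(String input) {
        outer:
        while (true) {
            for (String[] rule : RULES) {
                String next = input.replaceFirst(rule[0], rule[1]);
                if (!next.equals(input)) {
                    input = next;
                    if (rule[2].equals("T")) {
                        return input; // terminating rule: stop here
                    }
                    continue outer;   // restart from the first rule
                }
            }
            return input; // no rule applied
        }
    }

    public static void main(String[] args) {
        System.out.println(run("I bought a B of As from T S."));
    }
}
```

The plain substitution rules come first in the table, so they are exhausted before the marker rules start doubling characters; reordering them would change the result.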

Hope this helps.

Answer 1 (accepted, 2015-10-05 04:35:25)