How can I implement Markov's algorithm with variables and markers?
I've been trying to implement Markov's algorithm, but I've only had partial success. The algorithm is fairly simple and can be found here.
However, my project has an added difficulty: I have to use rules that include markers and variables.
A variable represents any letter in the alphabet, and a marker is simply a character that is used as a reference point for moving the variables around (it doesn't have a real value).
This example duplicates every character in a string:
Alphabet: {a,b,c}
Markers: {M}
Variables: {x}
Rule 1: Mx -> xxM
Rule 2: xM -> x
Rule 3: x -> Mx
input: abc
abc //We apply rule 3
Mabc //We apply rule 1
aaMbc //We apply rule 1
aabbMc //We apply rule 1
aabbccM //We apply rule 2
aabbcc
This is my recursive function. It implements a Markov algorithm that only works with plain string rules, for example: Rule 1: "apple" -> "orange", Input: "apple".
public static String markov(String input, LinkedList<Rule> rules) {
    for (Rule rule : rules) {
        if (!input.equals(input.replace(rule.getFrom(), rule.getTo()))) { //If the rule matches a substring
            if (rule.isTerminating()) { //If the rule is terminating
                input = input.replaceFirst(Pattern.quote(rule.getFrom()), rule.getTo()); //Replace the first instance
                System.out.println(input);
                return input; //return and end the cycle
            } else {
                input = input.replaceFirst(Pattern.quote(rule.getFrom()), rule.getTo());
                System.out.println(input);
                return markov(input, rules); //Start looking again for matching rules
            }
        }
    }
    return input;
}
I can't figure out how to fit variables and markers into this logic, so perhaps someone can show me the best way to do it? Any advice is welcome.
If the question doesn't comply with SO guidelines, please let me know why in the comments so I don't repeat the mistake.
Thank you!
I think the easiest way to do this is using Java regular expressions. Once you get your head around those, the following rules should work for your example:
Rule 1: "M([a-c])" -> "$1$1M"
Rule 2: "([a-c])M" -> "$1" (terminating)
Rule 3: "([a-c])" -> "M$1"
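To make the capture-group notation concrete, here is a minimal sketch of what each of those three rules does on its own. The class and method names (CaptureGroupDemo, applyOnce) are made up for illustration; the only real API used is String.replaceFirst, where ( ... ) in the pattern captures the variable and $1 in the replacement refers back to it.

```java
class CaptureGroupDemo {

    // Applies one regex rule once. "$1" in `to` is the text captured by
    // the first ( ... ) group in `from`, i.e. the letter the variable matched.
    public static String applyOnce(String input, String from, String to) {
        return input.replaceFirst(from, to);
    }

    public static void main(String[] args) {
        System.out.println(applyOnce("abc", "([a-c])", "M$1"));      // Rule 3: prints Mabc
        System.out.println(applyOnce("Mabc", "M([a-c])", "$1$1M"));  // Rule 1: prints aaMbc
        System.out.println(applyOnce("aabbccM", "([a-c])M", "$1"));  // Rule 2: prints aabbcc
    }
}
```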
Note that you need a couple of tweaks to your current method to make this work.
replace takes a literal string as its first parameter whereas replaceFirst uses a regex, so:
replace: if (!input.equals(input.replace(rule.getFrom(), rule.getTo()))) {
with: if (!input.equals(input.replaceFirst(rule.getFrom(), rule.getTo()))) {
You are quoting the rule.getFrom() string, which will not work with regular expressions, so:
replace: input = input.replaceFirst(Pattern.quote(rule.getFrom()), rule.getTo());
with: input = input.replaceFirst(rule.getFrom(), rule.getTo());
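If it is not obvious why the quoting gets in the way, this small sketch may help. Pattern.quote turns the whole pattern into a literal, so the brackets and parentheses are no longer treated as a character class and a capture group. The class and method names here are hypothetical.

```java
import java.util.regex.Pattern;

class QuoteDemo {

    // Quoted: the pattern only matches the literal text "M([a-c])",
    // which does not occur in inputs such as "Mabc".
    public static String withQuote(String input) {
        return input.replaceFirst(Pattern.quote("M([a-c])"), "$1$1M");
    }

    // Unquoted: the pattern is a real regex, so "M" followed by a, b or c matches.
    public static String withoutQuote(String input) {
        return input.replaceFirst("M([a-c])", "$1$1M");
    }

    public static void main(String[] args) {
        System.out.println(withQuote("Mabc"));    // prints Mabc (no match, unchanged)
        System.out.println(withoutQuote("Mabc")); // prints aaMbc
    }
}
```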
At that point, you have a bit of duplication in the code, with replaceFirst being called twice, so you could store the result in a temp variable the first time and reuse it:
String next = input.replaceFirst(rule.getFrom(), rule.getTo());
if (!input.equals(next)) {
    ...
    input = next;
    ...
}
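Putting those tweaks together, the whole method might end up looking roughly like the sketch below. The Rule class was not shown in the question, so the one here is a hypothetical stand-in that only keeps the getFrom(), getTo() and isTerminating() accessors used there; its constructor and the exampleRules() helper are assumptions.

```java
import java.util.Arrays;
import java.util.List;

class MarkovSketch {

    // Hypothetical stand-in for the Rule class from the question.
    public static final class Rule {
        private final String from;
        private final String to;
        private final boolean terminating;

        public Rule(String from, String to, boolean terminating) {
            this.from = from;
            this.to = to;
            this.terminating = terminating;
        }

        public String getFrom() { return from; }
        public String getTo() { return to; }
        public boolean isTerminating() { return terminating; }
    }

    public static String markov(String input, List<Rule> rules) {
        for (Rule rule : rules) {
            // getFrom() is treated as a regex, getTo() may contain $1, $2, ...
            String next = input.replaceFirst(rule.getFrom(), rule.getTo());
            if (!input.equals(next)) {           // the rule changed something
                System.out.println(next);
                // A terminating rule ends the run; otherwise start again from the first rule.
                return rule.isTerminating() ? next : markov(next, rules);
            }
        }
        return input;                             // no rule applies any more
    }

    // The three duplication rules from the example, in priority order.
    public static List<Rule> exampleRules() {
        return Arrays.asList(
                new Rule("M([a-c])", "$1$1M", false),
                new Rule("([a-c])M", "$1", true),
                new Rule("([a-c])", "M$1", false));
    }

    public static void main(String[] args) {
        markov("abc", exampleRules()); // last line printed is aabbcc
    }
}
```

As in the original method, a rule counts as matching only if it actually changes the string, so a rule that rewrites a match to identical text is simply skipped.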
As you are currently quoting the entire rule.getFrom() string, I'm guessing you have had problems with regular expression special characters in it before. If so, you'll need to address them individually when creating the rules. I really don't want to get into regular expressions here, as it is a huge area and is completely separate from the Markov algorithm, so if you are having problems with these then please do some research online (e.g. Regular Expressions and Capturing Groups), or ask a separate question here focusing on the specific regular expression problem.
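As a rough pointer only, one way to deal with special characters when creating rules could be to quote just the literal parts with Pattern.quote and to protect literal replacement text with Matcher.quoteReplacement. Both are standard java.util.regex methods; the helper names below are invented for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeDemo {

    // Plain string rule: both sides are taken literally, so "." and "$"
    // lose their special meaning in the pattern and in the replacement.
    public static String replaceLiteral(String input, String from, String to) {
        return input.replaceFirst(Pattern.quote(from), Matcher.quoteReplacement(to));
    }

    // Marker rule: only the marker is quoted, the variable stays a real capture group.
    public static String markerThenVariable(String marker, String variableClass) {
        return Pattern.quote(marker) + "(" + variableClass + ")";
    }

    public static void main(String[] args) {
        // Unquoted, "a.b" would match "axb" first; quoted, it only matches the literal "a.b".
        System.out.println(replaceLiteral("axb a.b", "a.b", "$5")); // prints axb $5
        // "+" is a regex operator, so it is quoted before being used as a marker.
        System.out.println("1+a".replaceFirst(markerThenVariable("+", "[a-c]"), "$1$1")); // prints 1aa
    }
}
```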
Note that you can still combine these with the normal rules, so (changing the marker character from M to # to allow M to be used in the alphabet), these rules:
"A" -> "apple"
"B" -> "bag"
"S" -> "shop"
"T" -> "the"
"the shop" -> "my brother"
"#([a-zA-Z .])" -> "$1$1#"
"([a-zA-Z .])#" -> "$1" (terminating)
"([a-zA-Z .])" -> "#$1"
Would convert:
from: I bought a B of As from T S.
to: II  bboouugghhtt  aa  bbaagg  ooff  aapppplleess  ffrroomm  mmyy  bbrrootthheerr..
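If you want to check that conversion yourself, a self-contained sketch along these lines should do it. It uses a loop instead of recursion, which is a swapped-in variation rather than the method from the question, and the Rule class, run and combinedRules names are all hypothetical. Because the space is part of the character class, spaces get doubled along with the letters.

```java
import java.util.Arrays;
import java.util.List;

class CombinedRulesDemo {

    // Hypothetical minimal rule holder: regex, replacement, terminating flag.
    public static final class Rule {
        final String from;
        final String to;
        final boolean terminating;

        public Rule(String from, String to, boolean terminating) {
            this.from = from;
            this.to = to;
            this.terminating = terminating;
        }
    }

    // Iterative variant: after every successful rewrite, restart from the first rule.
    public static String run(String input, List<Rule> rules) {
        String current = input;
        boolean applied = true;
        while (applied) {
            applied = false;
            for (Rule rule : rules) {
                String next = current.replaceFirst(rule.from, rule.to);
                if (!next.equals(current)) {
                    current = next;
                    if (rule.terminating) {
                        return current;
                    }
                    applied = true;
                    break; // go back to the highest-priority rule
                }
            }
        }
        return current;
    }

    // Plain string rules first, then the three marker rules using '#'.
    public static List<Rule> combinedRules() {
        return Arrays.asList(
                new Rule("A", "apple", false),
                new Rule("B", "bag", false),
                new Rule("S", "shop", false),
                new Rule("T", "the", false),
                new Rule("the shop", "my brother", false),
                new Rule("#([a-zA-Z .])", "$1$1#", false),
                new Rule("([a-zA-Z .])#", "$1", true),
                new Rule("([a-zA-Z .])", "#$1", false));
    }

    public static void main(String[] args) {
        System.out.println(run("I bought a B of As from T S.", combinedRules()));
    }
}
```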
Hope this helps.