How to generate strings from a simple regex?

So if I have a simple regex such as:

"g{1,3}(a|e|i|o|u)"

I want my program to generate the strings:

ga
ge
gi
go
gu
gga
gge
ggi
ggo
ggu
ggga
ggge
gggi
gggo
gggu

I would not use "g*(a|e|i|o|u)" as the regex, because there can be an infinite number of 'g's and therefore an infinite number of strings.

Any recommendations for a simple, efficient algorithm to do this? I think I could generate these strings by brute force using for/while loops, but I'm wondering whether there are any methods I could use to make this algorithm work.

I googled how to create strings from a regex, and many people pointed to https://code.google.com/p/xeger/ as a ready-made library, but I was wondering if I could get some suggestions for writing my own generator for these simple regexes.

Xeger is open source. You could browse their code base for ideas.

EDIT:

Their code base looks very small, so it shouldn't be too hard. It only generates random strings that will match, not all strings. It could still be a good starting point though.

I created Debuggex, which generates random strings to give you an idea of what a regex does.

If you already have a parse tree for your regex, you can use the following logic to generate random matches:

OrTree.random:
    Choose a child randomly, return its random()

ConcatTree.random:
    For every child, call random()
    Return the concatenation of all the results

RepeatTree.random:
    Choose a valid random number of repetitions within min and max
    Call random() on your child that many times
    Return the concatenation of all the results

Literal.random:
    Return the literal
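
The pseudocode above could be sketched in Java roughly as follows. The `Node` interface, the factory methods `literal`, `or`, `concat` and `repeat`, and the hand-built tree in `example()` are names assumed for illustration; they are not Debuggex's or Xeger's actual classes, and no regex parser is included.

```java
import java.util.Random;

class RandomRegexMatch {
    // Hypothetical parse-tree node: each node knows how to produce one random match.
    interface Node {
        String random(Random rng);
    }

    // Literal.random: return the literal itself.
    static Node literal(String text) {
        return rng -> text;
    }

    // OrTree.random: pick one child uniformly and delegate to it.
    static Node or(Node... children) {
        return rng -> children[rng.nextInt(children.length)].random(rng);
    }

    // ConcatTree.random: concatenate one random match from every child.
    static Node concat(Node... children) {
        return rng -> {
            StringBuilder sb = new StringBuilder();
            for (Node child : children) {
                sb.append(child.random(rng));
            }
            return sb.toString();
        };
    }

    // RepeatTree.random: choose a count in [min, max], then repeat the child.
    // Assumes 0 <= min <= max.
    static Node repeat(Node child, int min, int max) {
        return rng -> {
            int count = min + rng.nextInt(max - min + 1);
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < count; i++) {
                sb.append(child.random(rng));
            }
            return sb.toString();
        };
    }

    // Tree for g{1,3}(a|e|i|o|u), built by hand instead of by a parser.
    static Node example() {
        Node vowel = or(literal("a"), literal("e"), literal("i"), literal("o"), literal("u"));
        return concat(repeat(literal("g"), 1, 3), vowel);
    }

    public static void main(String[] args) {
        Random rng = new Random();
        Node tree = example();
        for (int i = 0; i < 5; i++) {
            System.out.println(tree.random(rng));
        }
    }
}
```

Each call to `random` re-draws every choice, so repeated calls on the same tree give independent samples.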

You can generate random strings even if you use the * operator. This is done by choosing a distribution over the integers from 0 to infinity from which to draw the repetition count, just as you would use a uniform distribution for a finite set. For example:

InfiniteRepeatTree.random:
    Flip a coin until it lands tails
    Call random() on your child once for each time the coin landed heads
    Return the concatenation of all the results
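
A minimal Java sketch of the coin-flip idea, assuming a fair coin and, for brevity, a child that is just a fixed string. The names `headsBeforeTails` and `randomStar` are made up for this example. With a fair coin, the count n is drawn with probability 2^-(n+1), so the expected number of repetitions is 1.

```java
import java.util.Random;

class GeometricRepeat {
    // Flip a fair coin until it lands tails; return how many heads came first.
    static int headsBeforeTails(Random rng) {
        int heads = 0;
        while (rng.nextBoolean()) { // true stands for heads
            heads++;
        }
        return heads;
    }

    // Random match for child*, where the child is a fixed string (a simplification).
    static String randomStar(String child, Random rng) {
        int count = headsBeforeTails(rng);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(child);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Random rng = new Random();
        // Sample a few matches of g*a
        for (int i = 0; i < 5; i++) {
            System.out.println(randomStar("g", rng) + "a");
        }
    }
}
```

A biased coin would shift the distribution towards longer or shorter strings without changing the structure of the code.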

Hope that helps :)

char[] vowels = new char[] {'a', 'e', 'i', 'o', 'u'};
// i is the number of leading g's (1 to 3)
for (int i = 1; i <= 3; i++) {
    for (int j = 0; j < vowels.length; j++) {
        // print i copies of 'g', then one vowel
        for (int k = 0; k < i; k++) {
            System.out.print("g");
        }
        System.out.println(vowels[j]);
    }
}

Not a generic solution, but it works for your specific example.
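
For a somewhat more general take on the loops above, one option is to enumerate every match from a small hand-built tree. This is only a sketch under assumptions: the `Node` interface and the helpers `literal`, `or`, `concat`, `repeat` and `product` are hypothetical names, there is no parser, and only bounded repetition is supported so the result stays finite. Ambiguous patterns such as (a|a) would produce duplicates.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class EnumerateRegexMatches {
    // Hypothetical node: returns every string the sub-pattern can match.
    interface Node {
        List<String> all();
    }

    static Node literal(String text) {
        return () -> Collections.singletonList(text);
    }

    // Alternation: the union of all children's matches.
    static Node or(Node... children) {
        return () -> {
            List<String> out = new ArrayList<>();
            for (Node child : children) {
                out.addAll(child.all());
            }
            return out;
        };
    }

    // Every prefix joined with every suffix (Cartesian product).
    static List<String> product(List<String> prefixes, List<String> suffixes) {
        List<String> out = new ArrayList<>();
        for (String p : prefixes) {
            for (String s : suffixes) {
                out.add(p + s);
            }
        }
        return out;
    }

    // Concatenation: fold the product over the children, left to right.
    static Node concat(Node... children) {
        return () -> {
            List<String> out = Collections.singletonList("");
            for (Node child : children) {
                out = product(out, child.all());
            }
            return out;
        };
    }

    // Bounded repetition {min,max}: collect the n-fold products for n in [min, max].
    static Node repeat(Node child, int min, int max) {
        return () -> {
            List<String> once = child.all();
            List<String> current = Collections.singletonList(""); // zero repetitions
            List<String> out = new ArrayList<>();
            for (int n = 0; n <= max; n++) {
                if (n >= min) {
                    out.addAll(current);
                }
                if (n < max) {
                    current = product(current, once);
                }
            }
            return out;
        };
    }

    // Tree for g{1,3}(a|e|i|o|u), built by hand.
    static Node example() {
        Node vowel = or(literal("a"), literal("e"), literal("i"), literal("o"), literal("u"));
        return concat(repeat(literal("g"), 1, 3), vowel);
    }

    public static void main(String[] args) {
        for (String s : example().all()) {
            System.out.println(s);
        }
    }
}
```

For the example tree this yields the 15 strings listed in the question, from ga through gggu. The size of the result grows multiplicatively with each concatenation, so this approach only suits small patterns.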
