Regex-like syntax or CFG for generating cartesian product of concatenated string variables and literals

I am writing a simulator and would like to run studies by invoking many instances of it with different sets of command-line arguments. I have read this question and several others, and they seem close, but I am not looking for random data that satisfies a particular regex; I want the set of all strings that match it. An example input file would look something like this:

myprogram.{version1|version2} -arg1 {1|2|4} {-arg2|}

or:

myprogram.{0} -arg1 {1} {2}
0: "version1" "version2"
1: "1" "2" "4"
2: "-arg2" ""

and would produce:

myprogram.version1 -arg1 1 -arg2
myprogram.version1 -arg1 1
myprogram.version1 -arg1 2 -arg2
myprogram.version1 -arg1 2
myprogram.version1 -arg1 4 -arg2
myprogram.version1 -arg1 4
myprogram.version2 -arg1 1 -arg2
myprogram.version2 -arg1 1
myprogram.version2 -arg1 2 -arg2
myprogram.version2 -arg1 2
myprogram.version2 -arg1 4 -arg2
myprogram.version2 -arg1 4

I would imagine something like this already exists; I just don't know the correct term to search for. Any help would be much appreciated. I can implement an abstract technique or algorithm myself if need be, but if it's a pre-existing tool I would prefer it to be free (at least as in beer) and to run on Linux.

I know I am probably leaving some details out, and I can be more specific where necessary; I would rather not inundate people with a lot of detail up front. It is entirely possible that I am going about this the wrong way, and I welcome all solutions, even if they solve my problem in a different way.

Most importantly, this solution should not require me to write any extra parsing code if I want to add more argument options to the "cross-product" of strings I generate. I already have a Perl script that does this with a set of nested for loops over each "variable", and it must change every time I change the number or nature of the variables.

As long as the braces are not nested, regular expressions will work fine. If you require nesting, you could add some extra recursion in the implementation language.

Here is an example in Python:

import re

def make_choices(template):
    pat = re.compile(r'(.*?)\{([^{}]+)\}',re.S)

    # tokenize the string
    last_end = 0
    choices = []
    for match in pat.finditer(template):
        prefix, alts = match.groups()
        if prefix:
            choices.append((prefix,)) # as a tuple
        choices.append(alts.split("|"))
        last_end = match.end()

    suffix = template[last_end:]
    if suffix:
        choices.append((suffix,))

    # recursive inner function
    def chooser(index):
        if index >= len(choices):
            yield []
        else:
            for alt in choices[index]:
                for result in chooser(index+1):
                    result.insert(0,alt)
                    yield result

    for result in chooser(0):
        yield ''.join(result)

Example:

>>> for result in make_choices('myprogram.{version1|version2} -arg1 {1|2|4} {-arg2|}'):
...     print(result)
...
myprogram.version1 -arg1 1 -arg2
myprogram.version1 -arg1 1
myprogram.version1 -arg1 2 -arg2
myprogram.version1 -arg1 2
myprogram.version1 -arg1 4 -arg2
myprogram.version1 -arg1 4
myprogram.version2 -arg1 1 -arg2
myprogram.version2 -arg1 1
myprogram.version2 -arg1 2 -arg2
myprogram.version2 -arg1 2
myprogram.version2 -arg1 4 -arg2
myprogram.version2 -arg1 4
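
The recursive chooser in make_choices is computing a Cartesian product, so the same step can be handed to the standard library. The following is a minimal sketch, not a replacement for the answer's code: the function name expand is made up for illustration, and it assumes, like the original, that braces are not nested. It relies on re.split returning literal text at even indices and the brace contents at odd indices when the pattern has one capturing group.

```python
import itertools
import re

def expand(template):
    """Yield every string made by picking one alternative per {a|b|c} group."""
    # With a capturing group, re.split alternates between literal text
    # (even indices) and the text found inside braces (odd indices).
    parts = re.split(r'\{([^{}]*)\}', template)
    choices = [part.split('|') if i % 2 else [part]
               for i, part in enumerate(parts)]
    # itertools.product varies the last group fastest, which gives the
    # same ordering as nested for loops written left to right.
    for combo in itertools.product(*choices):
        yield ''.join(combo)

if __name__ == '__main__':
    for line in expand('myprogram.{version1|version2} -arg1 {1|2|4} {-arg2|}'):
        print(line)
```

Adding another option only means adding another {…} group to the template; no parsing code changes. Note that an empty alternative such as {-arg2|} leaves the preceding space in place, so some generated lines end with a trailing space.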

You could use os.system() to execute the commands from within Python:

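#!/usr/bin/env python
import os
import sys

# make_choices() is the generator defined above; it is assumed to live in this same file.

# Everything after the script name forms the template.
template = ' '.join(sys.argv[1:])
failed = 0
total = 0
for command in make_choices(template):
    print(command)
    # os.system() returns a non-zero status when the command fails
    if os.system(command):
        print('FAILED')
        failed += 1
    else:
        print('OK')
    total += 1

print()
print('%d of %d failed.' % (failed, total))

# exit status is 1 if any command failed, 0 otherwise
sys.exit(failed > 0)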

And then on the command line:

user:/home/> template.py 'program.{version1|version2}'
program.version1
OK
program.version2
FAILED

1 of 2 failed.
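
On the nesting remark at the top of this answer: one way to add the "extra recursion" is a small recursive-descent parser instead of a regex. This is only a sketch under assumptions. The names expand_nested, _parse_sequence and _parse_group are invented here, a stray '|' or '}' outside any group is treated as literal text, and there is no escaping mechanism for literal braces.

```python
def expand_nested(template):
    """Return all expansions of a template whose {a|b} groups may nest."""
    results, _ = _parse_sequence(template, 0, top_level=True)
    return results

def _parse_sequence(s, pos, top_level):
    """Expand literals and groups until '|' or '}' (inside a group) or end of input.

    Returns (expanded strings, index of the character that stopped the scan).
    """
    results = ['']
    while pos < len(s):
        ch = s[pos]
        if ch == '{':
            alternatives, pos = _parse_group(s, pos + 1)
            # Cross every string built so far with every alternative.
            results = [r + a for r in results for a in alternatives]
        elif ch in '|}' and not top_level:
            break
        else:
            results = [r + ch for r in results]
            pos += 1
    return results, pos

def _parse_group(s, pos):
    """Parse 'alt|alt|...}' starting just after '{'.

    Returns (all expanded alternatives, index just past the closing '}').
    """
    alternatives = []
    while True:
        expanded, pos = _parse_sequence(s, pos, top_level=False)
        alternatives.extend(expanded)
        if pos >= len(s):
            raise ValueError('unbalanced "{" in template')
        if s[pos] == '}':
            return alternatives, pos + 1
        pos += 1  # skip the '|' separating two alternatives

if __name__ == '__main__':
    for line in expand_nested('prog {-a|-b {1|2}} end'):
        print(line)
```

For templates without nesting this produces the same strings, in the same order, as the flat version; building strings character by character is slow but keeps the sketch short.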

You're not really looking for something that regular expressions were designed for. You're just looking for a tool that generates combinations of discrete options.

Depending on the set of all possible arguments, an exhaustive list of combinations may not actually be necessary. Regardless, you should look into Pairwise Testing. I know for a fact that the PICT tool can generate either the exhaustive list or the pairwise list of test cases you desire.
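
To make the difference between the exhaustive list and a pairwise list concrete, here is a rough sketch of a greedy pairwise selection. It is not PICT's algorithm and does not use PICT at all; the function names pairs_of and greedy_pairwise are invented for illustration, and the greedy heuristic does not guarantee the smallest possible set. It enumerates the full product first, so it only suits small parameter spaces.

```python
import itertools

def pairs_of(combo):
    """All pairs of (position, value) items that one full combination covers."""
    return set(itertools.combinations(enumerate(combo), 2))

def greedy_pairwise(parameters):
    """Choose combinations until every pair of values taken from two
    different parameters appears together in at least one of them."""
    candidates = list(itertools.product(*parameters))
    uncovered = set()
    for combo in candidates:
        uncovered |= pairs_of(combo)
    chosen = []
    while uncovered:
        # Take the combination that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda c: len(pairs_of(c) & uncovered))
        chosen.append(best)
        uncovered -= pairs_of(best)
    return chosen

if __name__ == '__main__':
    parameters = [['version1', 'version2'], ['1', '2', '4'], ['-arg2', '']]
    for version, arg1, arg2 in greedy_pairwise(parameters):
        print('myprogram.%s -arg1 %s %s' % (version, arg1, arg2))
```

With only three small parameters, as in the question, the saving over the 12 exhaustive runs is modest; pairwise selection pays off as the number of parameters grows.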
