Non-greedy matching of Word in pyparsing?

I would like to match a word that ends with either _foo or _bar. I wrote this:

from pyparsing import Suppress, Word, alphanums, oneOf

identifier = Word(alphanums + '_')
string     = identifier + Suppress('_') + oneOf('foo bar')

Unfortunately, I realized that identifier is greedy and consumes the whole keyword, including the suffix.

How do I force identifier to be non-greedy?

$ string.parseString('a_keyword_foo')
ParseException: Expected "_" (at char 13), (line:1, col:14)

Some valid keywords:

a_keyword_foo          # ['a_keyword', 'foo']
foo_bar_foo            # ['foo_bar',   'foo']
bar_bar                # ['bar',       'bar']

Some invalid keywords:

keyword_foo_foobar
2keywords_bar          # The leading number is perhaps another question...
foo _bar 
_foo

Once you know what you're looking for, you can use pp.SkipTo:

In [38]: foo_or_bar = Literal('foo') | Literal('bar')

In [39]: string = SkipTo(Literal('_') + foo_or_bar) + Literal('_') + foo_or_bar

In [42]: string.parseString('frumpy _foo')
Out[42]: (['frumpy ', '_', 'foo'], {})

Unfortunately, when the pattern can appear more than once, SkipTo stops at the first occurrence:

In [44]: string.parseString('frumpy _foo _foo')
Out[44]: (['frumpy ', '_', 'foo'], {})

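The problem is that pyparsing doesn't look ahead or backtrack implicitly the way a regular-expression engine does: Word consumes every character it can, including the trailing _foo, and never gives any back. If you're concerned about the second case too, you'll have to define the expression as one or more chunks, each ending with an underscore plus foo or bar (as above), and then take the last one.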
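
An alternative that stays inside pyparsing is to make the lookahead explicit. The sketch below is not the approach described above but a variation on it: it consumes the prefix one character at a time and uses the `~` operator (NotAny) to stop as soon as a word-final _foo or _bar starts. The names `word_chars`, `suffix` and `prefix` are made up for this example, and leading digits are not rejected.

```python
from pyparsing import (Combine, OneOrMore, ParseException, Suppress, Word,
                       WordEnd, alphanums, oneOf)

word_chars = alphanums + "_"

# "_foo" or "_bar", but only when nothing word-like follows it.
# leaveWhitespace() keeps "foo _bar" from being accepted.
suffix = (Suppress("_") + oneOf("foo bar") + WordEnd(word_chars)).leaveWhitespace()

# Take one character at a time, as long as the final suffix does not start here.
prefix = Combine(OneOrMore(~suffix + Word(word_chars, exact=1)))

string = prefix + suffix

for text in ("a_keyword_foo", "foo_bar_foo", "bar_bar", "keyword_foo_foobar", "_foo"):
    try:
        print(text, "->", string.parseString(text).asList())
    except ParseException:
        print(text, "-> no match")
```

Checking the suffix before every character is slower than a plain Word, but for identifier-sized input that should rarely matter.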

If you can (or have to) switch to the re module, you can use non-greedy matching there:

    import re

    # re.VERBOSE allows whitespace and comments inside the pattern.
    p = re.compile(r"""([a-z_]+?)        # lazy matching identifier
                       _ (bar|foo)       # _ with foo or bar
       """, re.VERBOSE)
    subject_string = 'a_hello_foo'
    m = p.match(subject_string)
    print("groups:", m.groups())
    print("group 1:", m.group(1))
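
Note that match() does not anchor the end of the string, so the pattern above would split foo_bar_foo at the first _bar. A possible tightening, assuming the whole string must be one keyword and must start with a letter, is to use fullmatch(). The helper name `split_keyword` is invented for this sketch.

```python
import re

pattern = re.compile(r"""
    ([A-Za-z][A-Za-z0-9_]*?)   # lazy identifier, starting with a letter
    _(foo|bar)                 # underscore followed by foo or bar
    """, re.VERBOSE)


def split_keyword(text):
    """Return [identifier, suffix], or None if text is not a valid keyword."""
    m = pattern.fullmatch(text)
    return list(m.groups()) if m else None


for text in ("a_keyword_foo", "foo_bar_foo", "bar_bar",
             "keyword_foo_foobar", "2keywords_bar", "foo _bar ", "_foo"):
    print(repr(text), "->", split_keyword(text))
```

Because fullmatch() forces the suffix to sit at the very end, the lazy group grows until only the last _foo or _bar is left.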

pyparsing also provides a Regex class, so the same kind of pattern can be used inside a pyparsing grammar.
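
As a rough sketch of that idea, the expression below wraps a lazy pattern in pyparsing's Regex class. The group names `name` and `suffix` and the trailing negative lookahead are choices made for this example, not something from the question; pyparsing exposes named groups as results names.

```python
from pyparsing import ParseException, Regex

# Lazy identifier, then "_foo" or "_bar" that is not followed by another
# word character (so "keyword_foo_foobar" is rejected).
string = Regex(
    r"(?P<name>[A-Za-z][A-Za-z0-9_]*?)_(?P<suffix>foo|bar)(?![A-Za-z0-9_])"
)

for text in ("a_keyword_foo", "foo_bar_foo", "bar_bar", "keyword_foo_foobar"):
    try:
        result = string.parseString(text)
        print(text, "->", [result["name"], result["suffix"]])
    except ParseException:
        print(text, "-> no match")
```

The Regex expression can then be combined with other pyparsing elements like any other token.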
