I'm working my way through the Pyparsing Quick Reference, Chapter 3: "Small Example".
The example parser is supposed to match valid Python identifiers, so 'a_#' should be invalid, as the author's comment says, right? However, the output at the bottom of the page shows:
---Test for 'a_#'
Matches: ['a', '_']
Here's the parser:
first = pp.Word(pp.alphas+"_", exact=1)
rest = pp.Word(pp.alphanums+"_")
identifier = first+pp.Optional(rest)
I'm not sure, so I'd like some feedback before contacting the author.
I'm also trying to fix it by constructing a parser that accepts the input only if the whole string consists of valid identifier characters, rather than matching just a prefix of it. I can't get it right. Any advice?
Yikes! Building up an identifier using two Words is wasteful, inefficient, and just bad pyparsing practice. I think the author was doing this as a buildup to showing how Combine could be used here, but afterward, he should show the better alternative using just a single Word expression.
Word has a two-argument format (clearly described in the online docs) for just this situation:
from pyparsing import Word, Regex, alphas, alphanums
valid_ident_leading_chars = alphas + '_'
valid_ident_body_chars = alphanums + '_'
identifier = Word(valid_ident_leading_chars, valid_ident_body_chars)
(BTW, this is equivalent to:
identifier = Regex('['+valid_ident_leading_chars+']['+valid_ident_body_chars+']*')
And if you look in the pyparsing code, you'll see that Word implements its matching by building that very regular expression.)
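To make the regular-expression equivalence concrete, here is a minimal sketch using only the standard-library re module. The character sets are built from string.ascii_letters and string.digits as stand-ins for pyparsing's alphas and alphanums (an assumption, made so the snippet needs no third-party import), and the names ident_re, prefix and whole are just illustrative.

```python
import re
import string

# Stand-ins for pyparsing's alphas + '_' and alphanums + '_' (assumed equivalent sets).
leading = string.ascii_letters + "_"
body = string.ascii_letters + string.digits + "_"

# One leading character, then zero or more body characters.
ident_re = re.compile("[" + leading + "][" + body + "]*")

# match() only anchors at the start, so it accepts a valid prefix of 'a_#'.
prefix = ident_re.match("a_#")
# fullmatch() requires the entire string to match, so '#' makes it fail.
whole = ident_re.fullmatch("a_#")

print(prefix.group())
print(whole)
```

The difference between match() and fullmatch() here is the same distinction as parsing with and without parseAll=True in pyparsing.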
The single-Word identifier will still parse the leading part of 'a_#' (here, 'a_'), just as a regular expression match would. If you want your test to fail because the full string was not parsed, use:
identifier.parseString('a_#', parseAll=True)
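As a rough sketch of how the two calls differ in practice, the snippet below parses 'a_#' once without parseAll and once with it. The variable names prefix_result and full_parse_ok are illustrative only, and it assumes a pyparsing version that still accepts the camelCase parseString/parseAll spelling used in this answer.

```python
import pyparsing as pp

# Single Word: first argument is the leading-character set, second is the body set.
identifier = pp.Word(pp.alphas + "_", pp.alphanums + "_")

# Without parseAll, parsing stops at the first character that does not fit,
# so the valid prefix is returned and the trailing '#' is silently ignored.
prefix_result = identifier.parseString("a_#")
print(prefix_result.asList())

# With parseAll=True, leftover input raises ParseException.
try:
    identifier.parseString("a_#", parseAll=True)
    full_parse_ok = True
except pp.ParseException as err:
    full_parse_ok = False
    print("rejected:", err)
```

Catching ParseException like this is one way to turn "did the whole string parse?" into a plain boolean for a test.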
For simplicity in writing tests, you can also use '==': when a pyparsing expression is compared with a string, the expression runs expr.parseString(comparison_string, parseAll=True) and returns True or False depending on whether a ParseException was raised.
assert 'a_' == identifier # <-- will pass
assert 'a_#' == identifier # <-- will fail