
Pyparsing: How to collect all named results from groups?

I'm using pyparsing and I need to collect all of the variable names from an expression. It seems like this should be possible with setResultsName, but for expressions that contain parentheses or are otherwise grouped, the variable names end up nested inside each group.

For example,

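from pyparsing import (
    CaselessKeyword, Forward, Group, Literal, ParserElement, Suppress,
    infixNotation, oneOf, opAssoc, pyparsing_common,
)

ParserElement.enablePackrat()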
LPAREN, RPAREN, COMMA = map(Suppress, "(),")
expr = Forward()

number = pyparsing_common.number
fn_call = Group(CaselessKeyword('safe_divide') + LPAREN + expr + COMMA + expr + RPAREN)
reserved_words = CaselessKeyword('safe_divide')
variable = ~reserved_words + pyparsing_common.identifier

operand = number | fn_call | variable.setResultsName('var', listAllMatches=True)

unary_op = oneOf("! -")
power_op = Literal("^")
multiplicative_op = oneOf("* / %")
additive_op = oneOf("+ -")
logical_op = oneOf("&& ||")

expr <<= infixNotation(
    operand,
    [
        (unary_op, 1, opAssoc.RIGHT),
        (power_op, 2, opAssoc.RIGHT),
        (multiplicative_op, 2, opAssoc.LEFT),
        (additive_op, 2, opAssoc.LEFT),
        (logical_op, 2, opAssoc.LEFT),
    ],
)

parsed = expr.parseString('(a + b) + c', parse_all=True)
print(parsed.dump())

This gives

[[['a', '+', 'b'], '+', 'c']]
[0]:
  [['a', '+', 'b'], '+', 'c']
  - var: [['c']]
    [0]:
      ['c']
  [0]:
    ['a', '+', 'b']
    - var: [['a'], ['b']]
      [0]:
        ['a']
      [1]:
        ['b']
  [1]:
    +
  [2]:
    c

The variables are returned, but each one is attached to the group it was parsed in, which makes them hard to get at, especially for more complex expressions. Is there a way to collect all of the nested variables into a single list?

There's a similar question here, but the workaround there would incorrectly label keywords as variables.

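You could add a parse action to variable that appends each matched name to a list. Be sure to insert this code before calling setResultsName: setResultsName returns a copy of the expression, so a parse action added to the original afterwards would not run on the copy used in the grammar.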

found_variables = []
def found_var(s, l, t):
    found_variables.append(t[0])
variable.add_parse_action(found_var)

Be sure to clear the list before calling parse_string a second time.
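
To make the clearing step harder to forget, the parse action and the reset can be wrapped together. The following is a minimal sketch, assuming a reduced version of the grammar from the question (only two operator levels); collect_variables is a hypothetical helper name, not part of pyparsing. A variable that appears twice is appended twice, and depending on the pyparsing version and grammar, backtracking may run a parse action more than once for the same token, so the sketch de-duplicates the list before returning it.

```python
from pyparsing import (
    CaselessKeyword, Forward, Group, Suppress,
    infixNotation, oneOf, opAssoc, pyparsing_common,
)

found_variables = []

def found_var(s, l, t):
    # Runs whenever `variable` matches; t[0] is the identifier text.
    found_variables.append(t[0])

LPAREN, RPAREN, COMMA = map(Suppress, "(),")
expr = Forward()
reserved_words = CaselessKeyword("safe_divide")
variable = ~reserved_words + pyparsing_common.identifier
variable.addParseAction(found_var)  # attach before any setResultsName copy

fn_call = Group(reserved_words + LPAREN + expr + COMMA + expr + RPAREN)
operand = pyparsing_common.number | fn_call | variable
expr <<= infixNotation(
    operand,
    [
        (oneOf("* /"), 2, opAssoc.LEFT),
        (oneOf("+ -"), 2, opAssoc.LEFT),
    ],
)

def collect_variables(text):
    # Hypothetical helper: reset the shared list, parse, return unique names.
    found_variables.clear()
    expr.parseString(text, parseAll=True)
    # dict.fromkeys drops repeats while keeping first-seen order.
    return list(dict.fromkeys(found_variables))

print(collect_variables("(a + b) + c"))
print(collect_variables("safe_divide(x, 2) * y"))
```

Because the keyword is matched by fn_call and excluded from variable by the negative lookahead, safe_divide itself should never land in the list.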

As I understand it, you want the output to be the list of variables found throughout the tree as a single list.

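from pyparsing import ParseResults

def gather_named_elements(tree, name):
    named = set()
    # .get() avoids a KeyError for groups with no results under `name`.
    for match in tree.get(name, []):
        # With listAllMatches=True each match is itself a ParseResults, e.g. ['a'].
        named.add(match[0] if isinstance(match, ParseResults) else match)
    # Recurse into nested groups; plain tokens such as '+' are skipped.
    for item in tree:
        if isinstance(item, ParseResults):
            named.update(gather_named_elements(item, name))
    return list(named)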

print(gather_named_elements(parsed, 'var'))
# OUTPUT: ['a', 'b', 'c']

The order is not deterministic because the names pass through a set to remove duplicates; sort the list if you need a stable order.
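
Two cases are worth checking with any tree walk like this: variable names longer than one character, and groups that contain no variables at all, such as (1 + 2). The sketch below is one way to cover both, assuming a reduced version of the grammar from the question; iter_named and variables_in are hypothetical helper names. It uses a generator variant of the same recursive idea and sorts the result so the output is stable.

```python
from pyparsing import (
    CaselessKeyword, Forward, Group, ParseResults, Suppress,
    infixNotation, oneOf, opAssoc, pyparsing_common,
)

LPAREN, RPAREN, COMMA = map(Suppress, "(),")
expr = Forward()
reserved_words = CaselessKeyword("safe_divide")
variable = ~reserved_words + pyparsing_common.identifier
fn_call = Group(reserved_words + LPAREN + expr + COMMA + expr + RPAREN)
operand = (
    pyparsing_common.number
    | fn_call
    | variable.setResultsName("var", listAllMatches=True)
)
expr <<= infixNotation(
    operand,
    [(oneOf("* /"), 2, opAssoc.LEFT), (oneOf("+ -"), 2, opAssoc.LEFT)],
)

def iter_named(tree, name):
    # Hypothetical helper: yield every value stored under `name`, at any depth.
    # .get() returns the default for groups that have no such results name.
    for match in tree.get(name, []):
        yield match[0] if isinstance(match, ParseResults) else match
    for item in tree:
        if isinstance(item, ParseResults):
            yield from iter_named(item, name)

def variables_in(text):
    parsed = expr.parseString(text, parseAll=True)
    # set() removes repeats; sorted() gives a deterministic order.
    return sorted(set(iter_named(parsed, "var")))

print(variables_in("(1 + 2) * total"))                  # ['total']
print(variables_in("safe_divide(price, qty) + price"))  # ['price', 'qty']
```

The walk inspects the results names on the tree it is given before descending, so a bare expression such as a single variable, where nothing is wrapped in a group, is also handled.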
