PyParsing - grammar elements split around other elements

Question

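I'm porting a tool (not written by me) to use PyParsing. I'm updating the grammar to make more sense, but I would also like to stay backward compatible. The syntax includes elements that are split by another element, and I need both the "verb" (the wrapping element) and the "value" (the wrapped element). (No, these are not globs, though they look like them. That is confusing, and part of why I'm changing it.)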

*thing*  # verb: contains      value: thing
*thing   # verb: starts_with   value: thing
thing*   # verb: ends_with     value: thing
!*thing* # verb: not_contains  value: thing
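
For reference, the table above can be written down outside PyParsing as a lookup from (prefix, suffix) to verb. This is only an illustrative sketch: the names PATTERN, VERBS and classify are made up here, and `\w+` is assumed to be a close enough stand-in for the allowed value characters.

```python
import re

# Optional "!" and "*" in front, the value, then an optional trailing "*".
PATTERN = re.compile(r"^(?P<prefix>!?\*?)(?P<value>\w+)(?P<suffix>\*?)$")

# Only the four combinations listed in the table are accepted.
VERBS = {
    ("*", "*"): "contains",
    ("*", ""): "starts_with",
    ("", "*"): "ends_with",
    ("!*", "*"): "not_contains",
}


def classify(text):
    """Return (verb, value) for a single pattern, or raise ValueError."""
    match = PATTERN.match(text)
    if match is None:
        raise ValueError("not a valid pattern: {!r}".format(text))
    key = (match.group("prefix"), match.group("suffix"))
    if key not in VERBS:
        raise ValueError("unsupported wrapper in {!r}".format(text))
    return VERBS[key], match.group("value")


for pattern in ["*thing*", "*thing", "thing*", "!*thing*"]:
    print(pattern, classify(pattern))
```

The point of the lookup is that the verb is determined entirely by what surrounds the value, which is the same information a parser has to recover.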

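I'm having trouble understanding how to parse something like *thing*, where the "verb" is wrapped around the "value". These elements also sit in a delimited list, though I'm fine with that part.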

A full example of what I'll be parsing:

command *thing*, !*other_thing*, *third_thing

What I've tried:

import pyparsing as pp

command = pp.Keyword("command").setResultsName("command")
value = pp.Word(pp.alphanums + "_").setResultsName("value", listAllMatches=True)

contains = ("*" + value + "*").setResultsName("verb", listAllMatches=True)
not_contains = ("!*" + value + "*").setResultsName("verb", listAllMatches=True)
starts_with = ("*" + value).setResultsName("verb", listAllMatches=True)

verbs_and_values = (
    contains
    | not_contains
    | starts_with
)

directive = pp.Group(command + pp.delimitedList(verbs_and_values, delim=","))

example = "command *thing*, !*other_thing*, *third_thing"

result = directive.parseString(example)
print(result.dump())

This gets me all the values, but the verb is the whole thing (i.e. ['*', 'thing', '*']). I tried adjusting the verb with a parse action similar to the following:

def process_verb(tokens):
    if tokens[0] == '*' and tokens[-1] == '*':
        return "contains"
    # handle other verbs...

That works well for the verb, but it destroys the values...
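
The behaviour described here can be reproduced in a few lines. A parse action that returns a value replaces the matched tokens, so the named value is lost along with them. The sketch below is a cut-down version of the attempt above (a single expression, no delimited list), not the full grammar.

```python
import pyparsing as pp


def make_contains():
    # Same shape as the "contains" expression in the attempt above.
    value = pp.Word(pp.alphanums + "_").setResultsName("value")
    return "*" + value + "*"


def process_verb(tokens):
    if tokens[0] == "*" and tokens[-1] == "*":
        return "contains"  # this return value replaces all matched tokens


# Without the parse action the value is available by name.
before = make_contains().parseString("*thing*")
print(before.asList(), before.get("value"))

# With the parse action only the returned string survives.
after = make_contains().addParseAction(process_verb).parseString("*thing*")
print(after.asList(), after.get("value"))
```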

Answer 1

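I see that you are using results names with listAllMatches=True to capture multiple parsed values in a delimitedList. This is fine for simple data structures, but once you want to store more than one piece of information for each value, you will need to start using Group or parse action classes.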

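As a general practice, I avoid using results names on low-level expressions, and instead add them when composing higher-level expressions with the '+' and '|' operators. I also mostly use the expr("name") form rather than the expr.setResultsName("name") form to set results names.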
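
As a small illustration of the two spellings (the expression and the names "first" and "second" are made up for this sketch), both forms return a copy of the expression carrying the results name, so one low-level expression can be reused under different names:

```python
import pyparsing as pp

word = pp.Word(pp.alphas)  # low-level expression, deliberately left unnamed

# Names are attached where the higher-level expression is composed.
short_form = word("first") + word("second")
long_form = word.setResultsName("first") + word.setResultsName("second")

a = short_form.parseString("hello world")
b = long_form.parseString("hello world")
print(a.asDict())
print(b.asDict())
```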

Here is a modified version of your code that uses Groups:

command = pp.Keyword("command")
value = pp.Word(pp.alphanums + "_")

contains = pp.Group("*" + value("value") + "*")
not_contains = pp.Group("!*" + value("value") + "*")
starts_with = pp.Group("*" + value("value"))

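I have also added results names for the command and for the list of verbs in directive (verbs_and_values is built as before, but from the new Group expressions):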

directive = pp.Group(command("command")
                     + pp.Group(pp.delimitedList(verbs_and_values, 
                                        delim=","))("verbs"))

Now that these expressions are wrapped in Groups, there is no need for listAllMatches=True, because each value is kept in its own separate group.

The parse results now look like this:

[['command', ['*', 'thing', '*'], ['!*', 'other_thing', '*'], ['*', 'third_thing']]]
[0]:
  ['command', ['*', 'thing', '*'], ['!*', 'other_thing', '*'], ['*', 'third_thing']]
  - command: 'command'
  - verbs: [['*', 'thing', '*'], ['!*', 'other_thing', '*'], ['*', 'third_thing']]
    [0]:
      ['*', 'thing', '*']
      - value: 'thing'
    [1]:
      ['!*', 'other_thing', '*']
      - value: 'other_thing'
    [2]:
      ['*', 'third_thing']
      - value: 'third_thing'
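
To see what the Groups change, here is a minimal comparison on a cut-down grammar (only a "*value" item, with the made-up input "*a, *b"). It is a sketch for illustration, not part of the grammar above.

```python
import pyparsing as pp

value = pp.Word(pp.alphanums + "_")

# Flat: every item writes to the same "value" name, so the last one wins.
flat = pp.delimitedList("*" + value("value"))

# Flat with listAllMatches=True: the name collects every match.
listed = pp.delimitedList(
    "*" + value.setResultsName("value", listAllMatches=True)
)

# Grouped: each item keeps its own "value" inside its own group.
grouped = pp.delimitedList(pp.Group("*" + value("value")))

flat_result = flat.parseString("*a, *b")
listed_result = listed.parseString("*a, *b")
grouped_result = grouped.parseString("*a, *b")

print(flat_result["value"])
print(list(listed_result["value"]))
print([item["value"] for item in grouped_result])
```

With groups, further names (such as a verb type) can be attached to each item without interfering with its neighbours.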

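You were on the right track in using a parse action to add information about the type of verb, but instead of returning that value, you want to add the verb type as another named result: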

def add_type_parse_action(verb_type):
    def pa(s, l, t):
        t[0]["type"] = verb_type
    return pa

contains.addParseAction(add_type_parse_action("contains"))
not_contains.addParseAction(add_type_parse_action("not_contains"))
starts_with.addParseAction(add_type_parse_action("starts_with"))

After adding the parse actions, you get the following results:

[['command', ['*', 'thing', '*'], ['!*', 'other_thing', '*'], ['*', 'third_thing']]]
[0]:
  ['command', ['*', 'thing', '*'], ['!*', 'other_thing', '*'], ['*', 'third_thing']]
  - command: 'command'
  - verbs: [['*', 'thing', '*'], ['!*', 'other_thing', '*'], ['*', 'third_thing']]
    [0]:
      ['*', 'thing', '*']
      - type: 'contains'
      - value: 'thing'
    [1]:
      ['!*', 'other_thing', '*']
      - type: 'not_contains'
      - value: 'other_thing'
    [2]:
      ['*', 'third_thing']
      - type: 'starts_with'
      - value: 'third_thing'
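
Assembling the fragments above into one runnable script may help. This is a sketch rather than the exact code from the answer: the verb helper and the ends_with expression are additions made here (ends_with comes from the table in the question, which the fragments above do not cover), and verbs_and_values is assumed to be rebuilt from the new Group expressions.

```python
import pyparsing as pp

command = pp.Keyword("command")
value = pp.Word(pp.alphanums + "_")


def add_type_parse_action(verb_type):
    def pa(s, l, t):
        # t[0] is the group's own results; add "type" as a named result.
        t[0]["type"] = verb_type
    return pa


def verb(expr, verb_type):
    # Hypothetical helper: group the expression and tag it with its type.
    return pp.Group(expr).addParseAction(add_type_parse_action(verb_type))


contains = verb("*" + value("value") + "*", "contains")
not_contains = verb("!*" + value("value") + "*", "not_contains")
starts_with = verb("*" + value("value"), "starts_with")
ends_with = verb(value("value") + "*", "ends_with")  # assumed, see above

# Order matters with "|": contains must be tried before starts_with.
verbs_and_values = contains | not_contains | starts_with | ends_with

directive = pp.Group(
    command("command")
    + pp.Group(pp.delimitedList(verbs_and_values, delim=","))("verbs")
)

example = "command *thing*, !*other_thing*, *third_thing, last_thing*"
result = directive.parseString(example)
pairs = [(v["type"], v["value"]) for v in result[0]["verbs"]]
print(pairs)
```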

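You can also define classes to give structure to your results. Since a class is "called" just like a parse action, Python will construct a class instance from the parsed tokens: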

class VerbBase:
    def __init__(self, tokens):
        self.tokens = tokens[0]

    @property
    def value(self):
        return self.tokens.value
    
    def __repr__(self):
        return "{}(value={!r})".format(type(self).__name__, self.value)

class Contains(VerbBase): pass
class NotContains(VerbBase): pass
class StartsWith(VerbBase): pass

contains.addParseAction(Contains)
not_contains.addParseAction(NotContains)
starts_with.addParseAction(StartsWith)

result = directive.parseString(example)
print(result.dump())

Now the results are object instances, whose types indicate which kind of verb was used:

[['command', [Contains(value='thing'), NotContains(value='other_thing'), StartsWith(value='third_thing')]]]
[0]:
  ['command', [Contains(value='thing'), NotContains(value='other_thing'), StartsWith(value='third_thing')]]
  - command: 'command'
  - verbs: [Contains(value='thing'), NotContains(value='other_thing'), StartsWith(value='third_thing')]
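
One thing the class approach makes possible is attaching behaviour to each kind of result. The sketch below assumes a hypothetical matches method (not part of the answer) and covers only two of the verbs to stay short.

```python
import pyparsing as pp


class VerbBase:
    def __init__(self, tokens):
        # tokens[0] is the Group's results, which carry the "value" name.
        self.value = tokens[0]["value"]

    def matches(self, text):  # hypothetical method, for illustration only
        raise NotImplementedError


class Contains(VerbBase):
    def matches(self, text):
        return self.value in text


class NotContains(VerbBase):
    def matches(self, text):
        return self.value not in text


value = pp.Word(pp.alphanums + "_")
contains = pp.Group("*" + value("value") + "*").addParseAction(Contains)
not_contains = pp.Group("!*" + value("value") + "*").addParseAction(NotContains)

parsed = pp.delimitedList(contains | not_contains).parseString("*thing*, !*other*")

# Every parsed item is an instance that can test a candidate string.
print([type(item).__name__ for item in parsed])
print([item.matches("something") for item in parsed])
print([item.matches("another_thing") for item in parsed])
```

The caller no longer needs to branch on a type string; each instance already knows how to apply itself.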

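Note: Throughout your question you refer to the items after the command as "verbs", and I have kept that name in this answer to make it easier to compare with your initial attempt. But a "verb" usually refers to some action, like the "command" in your directive, while the items that follow are more like "qualifiers", "arguments" or "subjects". Names matter when coding, not only when communicating with others, but even in forming your own mental picture of what your code is doing.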

Solution 1
1 accepted 2021-02-22 02:58:46
