Python Lark parser: no versions I've installed seem to have the .pretty() print method

Problem:

# From example at https://github.com/lark-parser/lark/blob/master/examples/json_parser.py
import sys

from lark import Lark, Transformer, v_args

json_grammar = r""" ... """

# TreeToJson is the Transformer subclass defined in the linked example (omitted here)

### Create the JSON parser with Lark, using the LALR algorithm
json_parser = Lark(json_grammar, parser='lalr',
                   # Using the standard lexer isn't required, and isn't usually recommended.
                   # But, it's good enough for JSON, and it's slightly faster.
                   lexer='standard',
                   # Disabling propagate_positions and placeholders slightly improves speed
                   propagate_positions=False,
                   maybe_placeholders=False,
                   # Using an internal transformer is faster and more memory efficient
                   transformer=TreeToJson())
# Bind parse only after json_parser exists
parse = json_parser.parse

with open(sys.argv[1]) as f:
    tree = parse(f.read())
    print( tree )
    # Errors next 2 lines:
    # No: tree.pretty( indent_str="  " )
    # No: Lark.pretty( indent_str="  " )

Specific Error:

  • AttributeError: type object 'Lark' has no attribute 'pretty'

Setup:

Python version = 3.8.1

In Miniconda 3 on macOS Big Sur

conda install lark-parser

Installed 0.11.2-pyh44b312d_0

conda upgrade lark-parser

Installed 0.11.3-pyhd8ed1ab_0

Edit: Note about my Goal:

The goal here is NOT just to parse JSON; I just happen to be using a JSON example to try and learn. I want to write my own grammar for some data that I'm dealing with at work.

Edit: Why I Believe Pretty Print Should Exist:

Here's an example that uses the .pretty() function, and even includes output. But I can't seem to find anything (via conda at least) that includes .pretty(): http://github.com/lark-parser/lark/blob/master/docs/json_tutorial.md

I am not sure what I can put in this answer that is not already in the other answer. I will just try to create corresponding examples:

json_parser = Lark(json_grammar, parser='lalr',
                   # Using the standard lexer isn't required, and isn't usually recommended.
                   # But, it's good enough for JSON, and it's slightly faster.
                   lexer='standard',
                   # Disabling propagate_positions and placeholders slightly improves speed
                   propagate_positions=False,
                   maybe_placeholders=False,
                   # Using an internal transformer is faster and more memory efficient
                   transformer=TreeToJson()
)

The important line here is transformer=TreeToJson(). It tells Lark to apply the Transformer class TreeToJson before returning the result to you, so what you get back is not a Tree. If you remove that line:

json_parser = Lark(json_grammar, parser='lalr',
                   # Using the standard lexer isn't required, and isn't usually recommended.
                   # But, it's good enough for JSON, and it's slightly faster.
                   lexer='standard',
                   # Disabling propagate_positions and placeholders slightly improves speed
                   propagate_positions=False,
                   maybe_placeholders=False,
)

Then you get the Tree instance with the .pretty method:

tree = json_parser.parse(test_json)
print(tree.pretty())

You can then apply the Transformer manually:

res = TreeToJson().transform(tree)

This is now a 'normal' Python object, like you would get from the stdlib json module, so probably a dictionary.

The transformer= option of the Lark constructor applies the transformer during parsing, so no Tree is ever built, which saves time and memory.

The JSON parser in the Lark examples directory uses a tree transformer to turn the parsed tree into an ordinary JSON object. That makes it possible to verify that the parse is correct by comparing it with the JSON parser in Python's standard library:

    j = parse(test_json)
    print(j)
    import json
    assert j == json.loads(test_json)

The assert at the end could only pass if the value returned by parse had the same type as the object returned by json.loads, which is an ordinary unadorned Python builtin type, typically dict or list.

You might find the pretty printer in the Python standard library (the pprint module) useful for this particular application. Or you could use the json.dumps function with a non-zero indent keyword argument (e.g. print(json.dumps(json_value, indent=2))).
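For instance, a short sketch of both options using only the standard library. The sample value is made up for illustration and stands in for whatever the transformed parse returns:

```python
import json
from pprint import pformat

# Hypothetical sample value, standing in for the result of parse(...)
# when the TreeToJson transformer is applied.
json_value = {"name": "lark", "tags": ["parser", "lalr"], "version": None}

# Option 1: pprint formats it as a Python literal (None, single quotes).
pretty_py = pformat(json_value, indent=2, width=40)
print(pretty_py)

# Option 2: json.dumps formats it as JSON text (null, double quotes).
pretty_json = json.dumps(json_value, indent=2)
print(pretty_json)
```

The two differ in syntax: pprint shows Python's repr of the object, while json.dumps emits valid JSON that can be read back with json.loads.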
