
Python parsing nested ordereddicts

What if the file was like this:

OrderedDict
([
 ('activateable', False),
 ('Thisfield', 
    [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_1’),  (‘amount’, ‘10’)]),
    [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_2’),  (‘amount’, ‘10’)])
 ),
('Thisfield2', 
    [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_1’),  (‘amount’, ‘10’)]),
    [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_2’),  (‘amount’, ‘10’)])
 ),
('Thisfield3', 
    [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_1’),  (‘amount’, ‘10’)]),
    [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_2’),  (‘amount’, ‘10’)])
 )
 ('pin', False)
])

...and I only wanted to return 'Thisfield1, Thisfield2, Thisfield3'?

At first I thought your input was Python, but it is not:

  • it has Unicode left and right quotation marks (U+2018/U+2019)
  • it has unbalanced square brackets
  • it would at least need a comma before ('pin', False)
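The first two points can be checked mechanically. The following is a minimal sketch using only the standard library; the helper names `is_valid_python` and `bracket_balance` are made up for illustration, and the sample line is copied from the input above:

```python
import ast


def is_valid_python(source):
    """Return True if `source` parses as Python, False on a SyntaxError."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


def bracket_balance(source):
    """Return (number of '[') - (number of ']'); 0 means balanced."""
    return source.count("[") - source.count("]")


# One line from the input, including the U+2018/U+2019 quotation marks.
line = "[OrderedDict ([ ('autoNumber', False),  ('name', \u2018col_1\u2019),  (\u2018amount\u2019, \u201810\u2019)]),"

print(is_valid_python(line))   # False: curly quotes are not string delimiters
print(bracket_balance(line))   # 1: one '[' is never closed
```

Replacing the curly quotes with ASCII quotes would still not be enough, since the brackets stay unbalanced and the comma is still missing.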

So given the tags on your question, this has to be a YAML document, which means it has a single multi-line plain scalar as its content. When you load that with a YAML parser, you get the whole scalar as a single string, with the line breaks folded into spaces:

OrderedDict ([ ('activateable', False), ('Thisfield', [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_1’),  (‘amount’, ‘10’)]), [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_2’),  (‘amount’, ‘10’)]) ), ('Thisfield2', [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_1’),  (‘amount’, ‘10’)]), [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_2’),  (‘amount’, ‘10’)]) ), ('Thisfield3', [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_1’),  (‘amount’, ‘10’)]), [OrderedDict ([ ('autoNumber', False),  ('name', ‘col_2’),  (‘amount’, ‘10’)]) ) ('pin', False) ])

which is not as easy to parse as the original input file.

So you will probably have an easier time just "parsing" the input lines:

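def get_thisfields(fp):
    """Return the top-level 'This...' keys as a comma-separated string."""
    vals = []
    for line in fp:
        line = line.strip()
        # Only lines such as ('Thisfield', open one of the wanted entries.
        if not line.startswith("('This"):
            continue
        # The key is the text between the first pair of single quotes.
        vals.append(line.split("'")[1])
    return ', '.join(vals)

# Use a context manager so the file is closed after reading.
with open('input.yaml') as fp:
    print(get_thisfields(fp))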

Given your input "YAML" file, get_thisfields() returns:

Thisfield, Thisfield2, Thisfield3

as you requested.
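If the wanted keys do not all start with "This", a variant of the same line-based idea is to match any key that sits alone on its line, which in this input is how every entry with a nested list begins. This is only a sketch under that layout assumption; `NESTED_KEY` and `get_nested_keys` are illustrative names, and the shortened sample is fed in through `io.StringIO` instead of a real file:

```python
import io
import re

# Assumption: a key that opens a nested list is alone on its line, e.g. ('Thisfield',
# Lines such as ('pin', False) carry a value after the comma and do not match.
NESTED_KEY = re.compile(r"^\('([^']+)',$")


def get_nested_keys(fp):
    """Collect keys from lines that contain only ('key', after stripping."""
    keys = []
    for line in fp:
        match = NESTED_KEY.match(line.strip())
        if match:
            keys.append(match.group(1))
    return ', '.join(keys)


# Shortened copy of the input, kept malformed on purpose.
sample = io.StringIO(
    "OrderedDict\n"
    "([\n"
    " ('activateable', False),\n"
    " ('Thisfield', \n"
    "    [OrderedDict ([ ('autoNumber', False),  ('name', \u2018col_1\u2019)]),\n"
    " ),\n"
    "('Thisfield2', \n"
    "    [OrderedDict ([ ('autoNumber', False),  ('name', \u2018col_2\u2019)])\n"
    " ),\n"
    " ('pin', False)\n"
    "])\n"
)

print(get_nested_keys(sample))
```

Like get_thisfields(), this depends entirely on how the lines are laid out, so it breaks if a key and its nested list end up on the same line.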
