
Python, YAML: how to parse a string containing an apostrophe

I am using python to parse YAML files.

One of the YAML documents contains a dictionary like the following:

scrapers:
    results: //article[@class='story ']

This apparently causes a problem because the last apostrophe is preceded by whitespace. Removing the whitespace would solve the problem, but since it is part of an XPath expression, I can't.

Does anyone know how I could escape that sequence? I looked into other SO questions, but solutions like wrapping the string in "", or using

scrapers:
  results: //article[@class='story ']

or

scrapers:>
  results: //article[@class='story ']

or

scrapers:
  results: //article[@class='story '']

did not work.

EDIT: I am trying to open a file containing the above expression with:

import yaml
with open('/home/depot/wintergreen/yaml/scrapers.yml', 'r') as f:
    scrapers = yaml.load(f)

However, I receive the error: ScannerError: mapping values are not allowed here

pointing at the whitespace after story . I have been trying a suggestion offered by an answerer below, i.e. creating the YAML expression from a Python dict. This works. If I save the YAML to a file and load it back again, it also works. However, when I create the YAML by typing the exact same characters, it does not work...

EDIT2: I think the problem stemmed from the fact that I created the YAML file on a Windows machine and uploaded it to a Unix server.
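Since a typed file failed while a generated file with the same visible characters loaded fine, one way to compare them is to look at the raw bytes. The following is a minimal sketch, not a confirmed diagnosis of this case: the helper name find_suspicious_bytes and the list of byte sequences it checks are assumptions chosen for illustration, and a hit is only a hint, not proof of an error.

```python
def find_suspicious_bytes(data: bytes):
    """Return (line_number, description) pairs for bytes that editors
    on another platform may insert without showing them on screen."""
    # Hypothetical list of usual suspects; extend it as needed.
    checks = {
        b"\t": "tab character",
        b"\r": "carriage return (Windows line ending)",
        b"\xc2\xa0": "UTF-8 non-breaking space",
        b"\xef\xbb\xbf": "UTF-8 byte order mark",
    }
    findings = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        for needle, description in checks.items():
            if needle in line:
                findings.append((number, description))
    return findings


# In-memory sample standing in for a file typed on Windows.
sample = b"scrapers:\r\n\tresults: //article[@class='story ']\r\n"
for number, description in find_suspicious_bytes(sample):
    print(number, description)
```

To check a real file, read it in binary mode (open(path, 'rb').read()) and pass the bytes to the helper, so that no decoding or newline translation hides the characters in question.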

It's easy to find the correct YAML format for a structure: create the structure in Python then use yaml.dump to create the YAML-encoded string:

import yaml

d = {'scrapers': {'results': "//article[@class='story ']"}}
print(d)

# default_flow_style=False writes nested mappings in block style
print(yaml.dump(d, default_flow_style=False))

The result of which is:

{'scrapers': {'results': "//article[@class='story ']"}}

scrapers:
    results: //article[@class='story ']

That's the correct YAML representation, so if you're having a problem, it's with the parser, not the input text. If you use the standard yaml library it should parse fine.
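As a quick check of that claim, the sketch below loads the plain form alongside two quoted spellings and compares each result with the original XPath string. It assumes PyYAML is installed and uses yaml.safe_load; the quoted variants are illustrative additions following YAML's quoting rules (inside single quotes, an apostrophe is written as two apostrophes), not something taken from the question.

```python
import yaml

expected = "//article[@class='story ']"

# Three spellings of the same value; the variant names are just labels.
candidates = {
    "plain": "scrapers:\n  results: //article[@class='story ']\n",
    "double-quoted": "scrapers:\n  results: \"//article[@class='story ']\"\n",
    "single-quoted": "scrapers:\n  results: '//article[@class=''story '']'\n",
}

results = {}
for name, text in candidates.items():
    # safe_load builds plain Python objects only
    results[name] = yaml.safe_load(text)["scrapers"]["results"]
    print(name, results[name] == expected)
```

If all three load in your environment but your own file does not, the difference is more likely in the file's bytes than in the YAML syntax shown here.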
