
Python regex: why doesn't Python accept my pattern?

I want to write a Python regex that takes a string of the form:

"u'Johns's Place',"

and returns:

Johns's Place

It should locate the character 'u' and the apostrophe right after it, then the apostrophe that comes just before the comma, and return whatever lies between these two apostrophes.

Therefore, I wrote the following code:

import re

title = "u'Johns's Place',"
print(re.sub(r"u'([^\"']*)',", r"\"\1\"", title))

However, I still get the entire string back:

"u'Johns's Place',"

with no filtering.

Do you know how it can be resolved?

Your pattern doesn't match because of the middle ' in "Johns's". The character class [^\"']* only allows characters that are neither " nor ', so it stops at that apostrophe. At that point the pattern requires ', but the apostrophe is followed by s, not by a comma. The match therefore fails, and re.sub returns the string unchanged.
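The failure described above can be checked directly. This is a minimal sketch using the title string and pattern from the question; the variable names match and unchanged are only illustrative.

```python
import re

title = "u'Johns's Place',"
pattern = r"u'([^\"']*)',"

# [^\"']* consumes "Johns" and stops at the apostrophe inside "Johns's".
# The pattern then needs "'," but finds "'s", so there is no match.
match = re.search(pattern, title)
print(match)  # None

# When nothing matches, re.sub returns the input string unchanged.
unchanged = re.sub(pattern, r"\1", title)
print(unchanged == title)  # True
```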

If you want to parse JSON with Python, use the json package rather than regexes applied to escaped Unicode strings.

I don't use Python much, but this regex should solve your problem:

^u'(.*)',$

From the beginning of the string, it matches the u and the single quote, then captures everything after that up to the single quote and comma at the end.

print(re.sub(r"^u'(.*)',$", r'"\1"', title))

Remove ^ and $ if the string contains more than the part being replaced (in other words, if there is any surrounding context).
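A short sketch of both variants, assuming the same title string as in the question. The second input, listing, is a hypothetical example of a string with surrounding context; there the pattern is unanchored and uses a lazy (.*?) so that each item is captured separately. This still breaks if a value itself contains "',".

```python
import re

title = "u'Johns's Place',"

# Anchored, greedy: (.*) runs to the end, then backtracks to the final "',".
match = re.search(r"^u'(.*)',$", title)
inner = match.group(1)
print(inner)  # Johns's Place

# Same pattern with re.sub, wrapping the captured text in double quotes.
quoted = re.sub(r"^u'(.*)',$", r'"\1"', title)
print(quoted)  # "Johns's Place"

# Hypothetical input with context: drop the anchors and make the group lazy.
listing = "u'Johns's Place', u'Bar',"
items = re.findall(r"u'(.*?)',", listing)
print(items)  # ["Johns's Place", 'Bar']
```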

After doing some more research, I found this package: https://simplejson.readthedocs.io/en/latest/

It lets you read JSON data without getting a u'..' prefix on every string.

import simplejson as json
import requests

# "<url-address>" is a placeholder for the actual URL
response_json = requests.get("<url-address>")
current_json = json.loads(response_json.content)

current_json will not have the character 'u' at the beginning of every string.

This only partially answers my question, because the keys and values it returns are delimited by single quotes (') rather than the double quotes (") that the JSON format requires.
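One possible way around the single-quote issue: the single quotes come from printing the Python object itself, and serializing it with json.dumps produces double-quoted JSON text. This is a minimal sketch using the standard json module; the dictionary below is a hypothetical stand-in for the parsed response, not real data from the request.

```python
import json

# Hypothetical stand-in for the object returned by json.loads(...).
current_json = {u"title": u"Johns's Place"}

# Printing the dict directly shows Python's own repr (single quotes).
# json.dumps serializes it back to JSON text with double quotes.
as_json = json.dumps(current_json)
print(as_json)  # {"title": "Johns's Place"}
```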


 