
Extracting a string between specified characters in Python

I'm a newbie to regular expressions and I have the following string:

sequence = '["{\"First\":\"Belyuen,NT,0801\",\"Second\":\"Belyuen,NT,0801\"}","{\"First\":\"Larrakeyah,NT,0801\",\"Second\":\"Larrakeyah,NT,0801\"}"]'

I am trying to extract the text Belyuen,NT,0801 and Larrakeyah,NT,0801 in Python. I have the following code, which is not working:

re.search('\:\\"...\\', ''.join(sequence))

I.e., I want to get the string between the characters :\\ and \\ .

Don't use regex for this. It appears to be a rather strangely split set of JSON strings. Join them back together and use the json module to decode it.

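import json

# Assumes sequence is a list of JSON object strings.
# Join the pieces into one JSON array and decode it in a single call.
sequence = '[%s]' % ','.join(sequence)
data = json.loads(sequence)
print(data[0]['First'], data[0]['Second'])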

(Note that the json module was added in Python 2.6. If you have an older version, download and install simplejson.)
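
If sequence is actually the single string shown in the question, with the backslashes literally present in the data, then it is a JSON array whose elements are themselves JSON-encoded strings, and it can be decoded in two passes. A minimal sketch under that assumption (the raw-string literal and the names raw, items and records are illustrative, not from the question):

```python
import json

# Assumption: the backslashes are really part of the data, so a raw
# string literal is used here to keep them intact.
raw = r'["{\"First\":\"Belyuen,NT,0801\",\"Second\":\"Belyuen,NT,0801\"}","{\"First\":\"Larrakeyah,NT,0801\",\"Second\":\"Larrakeyah,NT,0801\"}"]'

# First pass: the outer JSON array decodes to a list of strings.
items = json.loads(raw)

# Second pass: each of those strings is itself a JSON object.
records = [json.loads(item) for item in items]

for record in records:
    print(record['First'])
```

If the backslashes are only an artifact of how the string was printed, the second pass is unnecessary and a single json.loads per item is enough.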

Each item looks like a proper JSON serialization of a Python dict, so you could just do:

>>> sequence = ["{\"First\":\"Belyuen,NT,0801\",\"Second\":\"Belyuen,NT,0801\"}","{\"First\":\"Larrakeyah,NT,0801\",\"Second\":\"Larrakeyah,NT,0801\"}"]
>>> import json
>>> for i in sequence:
...     d = json.loads(i)
...     print(d['First'])
...
Belyuen,NT,0801
Larrakeyah,NT,0801
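
The same loop can be wrapped in a small helper that collects the values for one key and skips items that fail to parse. This is only a sketch: the function name values_for_key and the choice to silently skip bad items are assumptions, not part of the answer above.

```python
import json

def values_for_key(items, key):
    """Return the value stored under key in each JSON object string."""
    found = []
    for item in items:
        try:
            record = json.loads(item)
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            continue  # skip items that are not valid JSON
        if isinstance(record, dict) and key in record:
            found.append(record[key])
    return found

sequence = [
    '{"First":"Belyuen,NT,0801","Second":"Belyuen,NT,0801"}',
    '{"First":"Larrakeyah,NT,0801","Second":"Larrakeyah,NT,0801"}',
]
print(values_for_key(sequence, 'First'))
```

Raising an error instead of skipping may be the better choice when malformed input should not go unnoticed.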

You don't need regex. Each item is also a valid Python dict literal, so it can be evaluated directly:

>>> sequence = ["{\"First\":\"Belyuen,NT,0801\",\"Second\":\"Belyuen,NT,0801\"}","{\"First\":\"Larrakeyah,NT,0801\",\"Second\":\"Larrakeyah,NT,0801\"}"]
>>> for item in sequence:
...  print(list(eval(item).values()))
...
['Belyuen,NT,0801', 'Belyuen,NT,0801']
['Larrakeyah,NT,0801', 'Larrakeyah,NT,0801']
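
Because eval executes whatever expression it is given, it is only appropriate for trusted input. As a hedged alternative (a swapped-in technique, not what the answer above uses), ast.literal_eval from the standard library accepts only Python literals and refuses anything else:

```python
import ast

sequence = [
    '{"First":"Belyuen,NT,0801","Second":"Belyuen,NT,0801"}',
    '{"First":"Larrakeyah,NT,0801","Second":"Larrakeyah,NT,0801"}',
]

results = []
for item in sequence:
    # literal_eval parses literals (dicts, strings, numbers, ...) only;
    # it raises an error rather than running arbitrary code.
    record = ast.literal_eval(item)
    results.append(list(record.values()))
    print(results[-1])
```

This works here only because the items happen to be valid Python literals. JSON values such as true, false and null are not, so json.loads remains the more robust choice for JSON data.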
