Extracting the string between specified characters in Python

Question

I'm a newbie to regular expressions and I have the following string:

sequence = '["{\"First\":\"Belyuen,NT,0801\",\"Second\":\"Belyuen,NT,0801\"}","{\"First\":\"Larrakeyah,NT,0801\",\"Second\":\"Larrakeyah,NT,0801\"}"]'

I am trying to extract the text Belyuen,NT,0801 and Larrakeyah,NT,0801 in Python. I have the following code, which is not working:

re.search('\:\\"...\\', ''.join(sequence))

I.e., I want to get the string between the characters :\\ and \\.

Answer 1

Don't use regex for this. It appears to be a rather strangely split set of JSON strings. Join them back together and use the json module to decode the result.

import json
# sequence is assumed to be a list of JSON strings
sequence = '[%s]' % ','.join(sequence)
data = json.loads(sequence)
print(data[0]['First'], data[0]['Second'])

(Note that the json module is new in Python 2.6; if you have an older version, download and install simplejson.)
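
A self-contained version of the approach above might look like the following sketch. It assumes that sequence is a list of JSON strings (as the ','.join(sequence) call implies) and that Python 3 is in use; the names joined and firsts are illustrative and do not come from the original answer.

```python
import json

# Assumption: sequence is a list of JSON-encoded strings, one object per element.
sequence = [
    '{"First":"Belyuen,NT,0801","Second":"Belyuen,NT,0801"}',
    '{"First":"Larrakeyah,NT,0801","Second":"Larrakeyah,NT,0801"}',
]

# Join the elements into one JSON array literal and decode it in a single call.
joined = '[%s]' % ','.join(sequence)
data = json.loads(joined)

# Each decoded element is a dict; collect the "First" value of each one.
firsts = [entry['First'] for entry in data]
print(firsts)
```

Keeping the joined text in a separate name leaves the original list untouched, which can help if it is needed again later.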

Answer 2

It looks like a proper serialization of a Python dict, so you could just do:

>>> sequence = ["{\"First\":\"Belyuen,NT,0801\",\"Second\":\"Belyuen,NT,0801\"}","{\"First\":\"Larrakeyah,NT,0801\",\"Second\":\"Larrakeyah,NT,0801\"}"]
>>> import json
>>> for i in sequence:
...     d = json.loads(i)
...     print(d['First'])
...
Belyuen,NT,0801
Larrakeyah,NT,0801

Answer 3

You don't need regex.

>>> sequence = ["{\"First\":\"Belyuen,NT,0801\",\"Second\":\"Belyuen,NT,0801\"}","{\"First\":\"Larrakeyah,NT,0801\",\"Second\":\"Larrakeyah,NT,0801\"}"]
>>> for item in sequence:
...     print(list(eval(item).values()))
...
['Belyuen,NT,0801', 'Belyuen,NT,0801']
['Larrakeyah,NT,0801', 'Larrakeyah,NT,0801']
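
Since eval executes whatever code the string contains, a more cautious variant is sketched below. It relies on ast.literal_eval from the standard library, which only accepts Python literals, and it assumes each element happens to be a valid Python dict literal as well as valid JSON. The name all_values is illustrative.

```python
import ast

# Assumption: each element is a dict literal that Python can parse safely.
sequence = [
    '{"First":"Belyuen,NT,0801","Second":"Belyuen,NT,0801"}',
    '{"First":"Larrakeyah,NT,0801","Second":"Larrakeyah,NT,0801"}',
]

all_values = []
for item in sequence:
    # literal_eval parses the literal without running arbitrary code.
    values = list(ast.literal_eval(item).values())
    all_values.append(values)
    print(values)
```

This only works while the data contains no JSON-specific tokens such as true, false, or null; for those, json.loads is the safer choice.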

Answer 1
3, accepted, 2010-04-03 13:46:11

Answer 2
3, 2010-04-03 13:46:43

Answer 3
2, 2010-04-03 13:47:47