
How to use findall or search to extract data in Python?

This is my string:

str = 'A:[{type:"mb",id:9,name:"John",url:"/mb9/",cur:0,num:83498},
{type:"mb",id:92,name:"Mary",url:"/mb92/",cur:0,num:404},
{type:"mb",id:97,name:"Dan",url:"/mb97/",cur:0,num:139},
{type:"mb",id:268,name:"Jennifer",url:"/mb268/",cur:0,num:0},
{type:"mb",id:289,name:"Mike",url:"/mb289/",cur:0,num:0}],B:
[{type:"mb",id:157,name:"Sue",url:"/mb157/",cur:0,num:35200},
{type:"mb",id:3,name:"Rob",url:"/mb3/",cur:0,num:103047},
{type:"mb",id:2,name:"Tracy",url:"/mb2/",cur:0,num:87946},
{type:"mb",id:26,name:"Jenny",url:"/mb26/",cur:0,num:74870},
{type:"mb",id:5,name:"Florence",url:"/mb5/",cur:0,num:37261},
{type:"mb",id:127,name:"Peter",url:"/mb127/",cur:0,num:63711},
{type:"mb",id:15,name:"Grace",url:"/mb15/",cur:0,num:63243},
{type:"mb",id:82,name:"Tony",url:"/mb82/",cur:0,num:6471},
{type:"mb",id:236,name:"Lisa",url:"/mb236/",cur:0,num:4883}]'

I want to use findall or search to extract all the data under "name" and "url" from str. This is what I did:

pattern = re.compile(r'type:(.*),id:(.*),name:(.*),url:(.*),cur:(.*),num:(.*)')

for (v1, v2, v3, v4, v5, v6) in re.findall(pattern, str):
    print v3
    print v4

Unfortunately, this does not do what I want. What is wrong? Thanks for your input.

You can try this:

import re
data = """
A:[{type:"mb",id:9,name:"John",url:"/mb9/",cur:0,num:83498},
{type:"mb",id:92,name:"Mary",url:"/mb92/",cur:0,num:404},
{type:"mb",id:97,name:"Dan",url:"/mb97/",cur:0,num:139},
{type:"mb",id:268,name:"Jennifer",url:"/mb268/",cur:0,num:0},
{type:"mb",id:289,name:"Mike",url:"/mb289/",cur:0,num:0}],B:
[{type:"mb",id:157,name:"Sue",url:"/mb157/",cur:0,num:35200},
{type:"mb",id:3,name:"Rob",url:"/mb3/",cur:0,num:103047},
{type:"mb",id:2,name:"Tracy",url:"/mb2/",cur:0,num:87946},
{type:"mb",id:26,name:"Jenny",url:"/mb26/",cur:0,num:74870},
{type:"mb",id:5,name:"Florence",url:"/mb5/",cur:0,num:37261},
{type:"mb",id:127,name:"Peter",url:"/mb127/",cur:0,num:63711},
{type:"mb",id:15,name:"Grace",url:"/mb15/",cur:0,num:63243},
{type:"mb",id:82,name:"Tony",url:"/mb82/",cur:0,num:6471},
{type:"mb",id:236,name:"Lisa",url:"/mb236/",cur:0,num:4883}]
"""
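# The lookbehinds keep only the quoted values that follow name: or url:; [1:-1] strips the quotes
full_data = [i[1:-1] for i in re.findall('(?<=name:)".*?"(?=,)|(?<=url:)".*?"(?=,)', data)]
# Join each name with the url that follows it
final_data = [full_data[i]+":"+full_data[i+1] for i in range(0, len(full_data)-1, 2)]
print(final_data)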

Output:

['John:/mb9/', 'Mary:/mb92/', 'Dan:/mb97/', 'Jennifer:/mb268/', 'Mike:/mb289/', 'Sue:/mb157/', 'Rob:/mb3/', 'Tracy:/mb2/', 'Jenny:/mb26/', 'Florence:/mb5/', 'Peter:/mb127/', 'Grace:/mb15/', 'Tony:/mb82/', 'Lisa:/mb236/']
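
As for what goes wrong with the pattern in the question: assuming the real string sits on a single line, each greedy (.*) runs as far as it can, so findall returns one match that spans the whole string and only the last record's name and url come out. The following is a minimal sketch of one way around that, not the only one. It uses capture groups that cannot cross a closing quote. The shortened sample string and the variable names are illustrative assumptions.

```python
import re

# Shortened sample in the same format as the question's string (assumed to be one line).
data = ('A:[{type:"mb",id:9,name:"John",url:"/mb9/",cur:0,num:83498},'
        '{type:"mb",id:92,name:"Mary",url:"/mb92/",cur:0,num:404}],'
        'B:[{type:"mb",id:157,name:"Sue",url:"/mb157/",cur:0,num:35200}]')

# Greedy pattern from the question (typo fixed): a single match spanning the whole string.
greedy = re.compile(r'type:(.*),id:(.*),name:(.*),url:(.*),cur:(.*),num:(.*)')
greedy_matches = greedy.findall(data)

# [^"]* cannot cross a closing quote, so each match stays inside one record.
pattern = re.compile(r'name:"([^"]*)",url:"([^"]*)"')
pairs = pattern.findall(data)  # list of (name, url) tuples

for name, url in pairs:
    print(name, url)

# search stops at the first match instead of collecting all of them.
first = pattern.search(data)
```

With two groups in the pattern, findall returns tuples, so the loop can unpack name and url directly. search is the better fit only when the first record is all that is needed.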

You should not name your string "str", because that shadows Python's built-in str type. But here is one option:

import re

# s is the string from the question, renamed from str
# Find all of the entries
x = re.findall('(?<![AB]:)(?<=:).*?(?=[,}])', s)

['"mb"', '9', '"John"', '"/mb9/"', '0', '83498', '"mb"', '92', '"Mary"', 
'"/mb92/"', '0', '404', '"mb"', '97', '"Dan"', '"/mb97/"', '0', '139', 
'"mb"', '268', '"Jennifer"', '"/mb268/"', '0', '0', '"mb"', '289', '"Mike"', 
'"/mb289/"', '0', '0', '"mb"', '157', '"Sue"', '"/mb157/"', '0', '35200', 
'"mb"', '3', '"Rob"', '"/mb3/"', '0', '103047', '"mb"', '2', '"Tracy"', 
'"/mb2/"', '0', '87946', '"mb"', '26', '"Jenny"', '"/mb26/"', '0', '74870', 
'"mb"', '5', '"Florence"', '"/mb5/"', '0', '37261', '"mb"', '127', '"Peter"', 
'"/mb127/"', '0', '63711', '"mb"', '15', '"Grace"', '"/mb15/"', '0', '63243', 
'"mb"', '82', '"Tony"', '"/mb82/"', '0', '6471', '"mb"', '236', '"Lisa"', 
'"/mb236/"', '0', '4883']

# Break up into each section
y = []
for i in range(0, len(x), 6):
    y.append(x[i:i+6])

[['"mb"', '9', '"John"', '"/mb9/"', '0', '83498'],
['"mb"', '92', '"Mary"', '"/mb92/"', '0', '404'],
['"mb"', '97', '"Dan"', '"/mb97/"', '0', '139'],
['"mb"', '268', '"Jennifer"', '"/mb268/"', '0', '0'],
['"mb"', '289', '"Mike"', '"/mb289/"', '0', '0'],
['"mb"', '157', '"Sue"', '"/mb157/"', '0', '35200'],
['"mb"', '3', '"Rob"', '"/mb3/"', '0', '103047'],
['"mb"', '2', '"Tracy"', '"/mb2/"', '0', '87946'],
['"mb"', '26', '"Jenny"', '"/mb26/"', '0', '74870'],
['"mb"', '5', '"Florence"', '"/mb5/"', '0', '37261'],
['"mb"', '127', '"Peter"', '"/mb127/"', '0', '63711'],
['"mb"', '15', '"Grace"', '"/mb15/"', '0', '63243'],
['"mb"', '82', '"Tony"', '"/mb82/"', '0', '6471'],
['"mb"', '236', '"Lisa"', '"/mb236/"', '0', '4883']]

# Name is the 3rd value in each list and url is the 4th
for i in y:
    name = i[2].strip('"')  # drop the surrounding double quotes
    url = i[3].strip('"')
    print(name, url)
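
Slicing the flat list in steps of six relies on every record having exactly six fields in the same order. If that is not guaranteed, a possible alternative is to parse each {...} record into a dict keyed by field name. This is only a sketch: the helper name parse_records and the shortened sample string are made up for illustration, and it assumes values never contain escaped quotes or nested braces.

```python
import re

# Shortened sample in the question's format; `s` and parse_records are illustrative names.
s = ('A:[{type:"mb",id:9,name:"John",url:"/mb9/",cur:0,num:83498},'
     '{type:"mb",id:92,name:"Mary",url:"/mb92/",cur:0,num:404}],'
     'B:[{type:"mb",id:157,name:"Sue",url:"/mb157/",cur:0,num:35200}]')

def parse_records(text):
    """Return one dict per {...} record, keyed by field name."""
    records = []
    # Each record is the text between a pair of braces.
    for rec in re.finditer(r'\{([^{}]*)\}', text):
        # A field is key:value, where the value is quoted or runs to the next comma.
        fields = re.findall(r'(\w+):("[^"]*"|[^,]+)', rec.group(1))
        records.append({key: value.strip('"') for key, value in fields})
    return records

records = parse_records(s)
for r in records:
    print(r['name'], r['url'])
```

Looking fields up by key means a reordered or missing field raises a KeyError instead of silently returning the wrong value.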

