
Python findall, regex

I have this text:

  u'times_viewed': 12268,
  u'url': u'/photo/79169307/30-seconds-light',
  u'user': {u'affection': 63962,

How can I extract just this string: "/photo/79169307/30-seconds-light"?

I am trying with regex and findall:

list = re.findall('u'url': u'/photo/"([^"]*)"', text)

but it won't go.

I assume that by "it won't go," you mean that you get a syntax error, which you should. Here:

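list = re.findall('u'url': u'/photo/"([^"]*)"', text)

the pattern is delimited by ' but also contains ' characters, so the first inner ' ends the string literal early and Python raises a SyntaxError. The pattern also looks for " around the path, while the text uses ' there. Delimit the pattern with double quotes and match single quotes inside it: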

list_ = re.findall("u'url': u'/photo/([^']*)'", text)

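Additionally, the group captures only the part after /photo/ , so to get the whole path, move the opening paren in front of the prefix:

list_ = re.findall("u'url': u'(/photo/[^']*)'", text)

re.findall returns a list of the captured strings rather than a match object, so there is no .group() to call; list_[0] should hold your string.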
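As a minimal runnable sketch of the regex approach: the sample text is copied from the question, and the names pattern and urls are only illustrative. Note that re.findall returns a list, so the result is indexed rather than queried with .group().

```python
import re

# Sample text copied from the question (a fragment of a printed dict).
text = """
  u'times_viewed': 12268,
  u'url': u'/photo/79169307/30-seconds-light',
  u'user': {u'affection': 63962,
"""

# Double quotes delimit the pattern, so the single quotes inside are literal.
# The single capture group wraps the whole path, including the /photo/ prefix.
pattern = r"u'url': u'(/photo/[^']*)'"

urls = re.findall(pattern, text)  # list with one captured string per match
print(urls[0])
```

With exactly one capture group, findall yields plain strings; with two or more groups it would yield tuples instead, which is why the sketch keeps a single group.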

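On top of that, it looks like you're dealing with data that was decoded from JSON. The text you show is the printed form of a Python dict (note the u'...' prefixes and single quotes), which is not valid JSON, so json.loads would fail on it. If you still have the original JSON string, a better approach might be:

import json
data = json.loads(raw_json)  # raw_json: placeholder name for the original JSON string
url = data['url']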
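If only the printed dict is available, one possible alternative (not part of the original answer) is ast.literal_eval from the standard library, since the u'...' prefixes and single quotes are Python literal syntax rather than JSON. This sketch assumes Python 3.3 or later, where the u prefix is accepted, and assumes the text is a complete literal. The sample below is a hypothetical completed version of the fragment from the question; the closing braces were added for illustration.

```python
import ast

# Hypothetical complete dict literal built from the fragment in the question;
# the closing braces are added here so that the literal parses.
text = """{u'times_viewed': 12268,
 u'url': u'/photo/79169307/30-seconds-light',
 u'user': {u'affection': 63962}}"""

data = ast.literal_eval(text)  # evaluates literals only, never arbitrary code
url = data['url']
print(url)
```

This would not work on the truncated three-line fragment as posted, because an incomplete literal raises a SyntaxError.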


 