Python findall, regex

Question

I have this text:

  u'times_viewed': 12268,
  u'url': u'/photo/79169307/30-seconds-light',
  u'user': {u'affection': 63962,

How can I extract just this string: "/photo/79169307/30-seconds-light"?

I am trying with a regex and re.findall:

list = re.findall('u'url': u'/photo/"([^"]*)"', text)

but it won't go.

Answer 1

I assume that by "it won't go," you mean that you get a syntax error, which you should. Here:

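list = re.findall('u'url': u'/photo/"([^"]*)"', text)

the pattern is delimited with ' but also contains unescaped ' characters, so the string literal ends early and Python reports a syntax error. The pattern also looks for " around the URL, while your text uses ' there. Delimit the pattern with " and match ' inside it. Try: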

list_ = re.findall("u'url': u'/photo/([^']*)'", text)

Additionally, this captures only the part after /photo/, not the prefix itself, so move the opening paren to include it:

list_ = re.findall("u'url': u'(/photo/[^']*)'", text)

re.findall returns a list of the captured strings rather than a match object, so list_[0] should now hold your string.
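
To check the pattern, here is a minimal sketch that runs a single-group version of it against the snippet from the question. The sample text is copied from the question; the names matches and url are only for illustration.

```python
import re

# Sample text copied from the question (a fragment of a printed Python dict).
text = """
  u'times_viewed': 12268,
  u'url': u'/photo/79169307/30-seconds-light',
  u'user': {u'affection': 63962,
"""

# One capture group, so findall returns a list of plain strings.
matches = re.findall(r"u'url': u'(/photo/[^']*)'", text)

# findall returns an empty list when nothing matches, so guard the index.
url = matches[0] if matches else None
print(url)
```

With more than one capture group in the pattern, findall would return a list of tuples instead, which is why the sketch keeps a single group.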

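On top of that, it looks like you're dealing with data that came from JSON. If text is the raw JSON string, a better approach might be to parse it and look the value up by key:

import json
data = json.loads(text)  # text must be valid JSON here
url = data['url']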
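
The u'...' prefixes in the question suggest the text may be the printed form of a Python dict rather than raw JSON, in which case json.loads would reject it. A minimal sketch for that case, assuming you have the complete dict text and not just a fragment, uses ast.literal_eval. The sample below is hypothetical: only the three keys come from the question, and the closing braces are an assumption.

```python
import ast

# Hypothetical complete dict text; the question shows only a fragment,
# so the closing braces here are an assumption.
text = """{u'times_viewed': 12268,
 u'url': u'/photo/79169307/30-seconds-light',
 u'user': {u'affection': 63962}}"""

# literal_eval evaluates Python literals (dicts, strings, numbers) only;
# it does not execute arbitrary code.
data = ast.literal_eval(text)
url = data['url']
print(url)
```

If you already hold the dict in a variable and are only printing it, you can skip parsing entirely and index it directly with ['url'].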