How to search for a question mark and / with a regular expression in Python?

Question

I want to search a file for a number that matches this pattern:

<a  href="test/?n=451484"   >

and then extract the number 451484.

I use this pattern:

'(test/?n=)\d+'

but that doesn't work.

Answer 1

Three changes are needed:

escape the ? (write \?)
wrap the \d+ in parentheses, so the number is captured
drop the parentheses around test/\?n=

Example usage

>>> import re
>>> text = '<a  href="test/?n=451484"   >'  # avoid shadowing the built-in str
>>> re.findall(r'test/\?n=(\d+)', text)
['451484']
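
If only the first match is needed, a minimal sketch along the same lines could use re.search and read the captured group. The helper name extract_n is made up for illustration and is not part of the answer above; it assumes the number should come back as an int and that a missing link should yield None.

```python
import re

def extract_n(text):
    """Return the number after 'test/?n=' as an int, or None if it is absent."""
    match = re.search(r'test/\?n=(\d+)', text)
    # group(1) is the part captured by the parentheses around \d+
    return int(match.group(1)) if match else None

print(extract_n('<a  href="test/?n=451484"   >'))  # 451484
print(extract_n('<a href="other/">'))              # None
```

Checking for None before calling group() avoids an AttributeError when the pattern is not found.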

Answer 2

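To search for a literal ? character, you need to escape it with a backslash: \? . Unescaped, ? is a quantifier that makes the preceding item optional, so it cannot (usually) be used on its own to match a question mark. In the question's pattern, /? means "an optional slash", so the literal ? in the text is never matched.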

pattern = r"test/\?n=(\d+)"
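
The difference can be checked directly. The sketch below runs the question's unescaped pattern and the escaped one against the sample tag, and also shows re.escape as one way to build the literal prefix without escaping by hand. The variable names are illustrative only.

```python
import re

text = '<a  href="test/?n=451484"   >'

# Unescaped: "/?" means "an optional slash", so the regex expects
# "test" or "test/" followed directly by "n=", which is not in the text.
unescaped = re.search(r'(test/?n=)\d+', text)
print(unescaped)  # None

# Escaped: "\?" matches a literal question mark.
escaped = re.search(r'test/\?n=(\d+)', text).group(1)
print(escaped)  # 451484

# re.escape() escapes any special characters in a literal string.
prefix = re.escape('test/?n=')
built = re.search(prefix + r'(\d+)', text).group(1)
print(built)  # 451484
```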

Answer 3

Alternatively, you can use specialized tools:

an HTML parser to parse the HTML data (for example, BeautifulSoup)
urlparse and parse_qs (in the urlparse module on Python 2, urllib.parse on Python 3) to extract the URL parameter value

Example:

import re
# Python 3 location; on Python 2 use: from urlparse import urlparse, parse_qs
from urllib.parse import urlparse, parse_qs
from bs4 import BeautifulSoup

data = """
<div>
    <a href="test/?n=451484">link</a>
</div>
"""

# naming the parser explicitly avoids bs4's "no parser specified" warning
soup = BeautifulSoup(data, "html.parser")

# filtering links with a specific "href" attribute value
link = soup.find('a', href=re.compile(r'test/\?n=\d+'))

url = link['href']
query = urlparse(url).query
print(parse_qs(query)['n'][0])  # prints 451484
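
If installing BeautifulSoup is not an option, roughly the same idea can be sketched with the Python 3 standard library alone. This is an assumption-laden variant rather than the answer's method: LinkCollector and n_values are hypothetical names, and the sketch simply collects every n parameter from every a tag.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == 'a':
            href = dict(attrs).get('href')
            if href is not None:
                self.hrefs.append(href)

def n_values(html):
    """Return the 'n' query parameter values found in the links of html."""
    collector = LinkCollector()
    collector.feed(html)
    values = []
    for href in collector.hrefs:
        params = parse_qs(urlparse(href).query)
        values.extend(params.get('n', []))
    return values

print(n_values('<div><a href="test/?n=451484">link</a></div>'))  # ['451484']
```

No regular expression is needed here, because the URL parser already treats ? as the start of the query string.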

3 answers

solution1
1 ACCEPTED 2014-11-25 17:52:09

solution2
0 2014-11-25 17:51:18

solution3
0 2014-11-25 17:59:12
