
Regex to match a word and everything after it?

I need to extract some HTTP data from an HTTP packet that I have as a string. I am trying to use the regular expression below to match 'data:' and everything after it, but it is not working. I am new to regex and Python.

>>> import re
>>> pat = re.compile(r'(?:/bdata:/b)?\w$')
>>> string = " dnfhndkn data: ndknfdjoj pop"
>>> res = re.match(pat, string)
>>> print res
None

re.match matches only at the beginning of the string. Use re.search to match at any position. (See "search() vs. match()" in the re module documentation.)

>>> import re
>>> pat = re.compile(r'(?:/bdata:/b)?\w$')
>>> string = " dnfhndkn data: ndknfdjoj pop"
>>> res = re.search(pat,string)
>>> res
<_sre.SRE_Match object at 0x0000000002838100>
>>> res.group()
'p'
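
To see the difference in isolation, here is a minimal sketch that runs one simple pattern through both functions. The names text, at_start and anywhere are only illustrative, and the pattern is simplified to a plain 'data:' for the comparison.

```python
import re

text = " dnfhndkn data: ndknfdjoj pop"
pattern = re.compile(r"data:")

# match() only tries position 0, where the text starts with a space.
at_start = pattern.match(text)

# search() scans forward until the pattern matches somewhere.
anywhere = pattern.search(text)

print(at_start)          # None
print(anywhere.start())  # 10
```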

To match everything after data:, replace \w with .* and remove the /b sequences. (A word boundary is written \b; /b matches a literal slash followed by the letter b.)

>>> import re
>>> pat = re.compile(r'(?:data:).*$')
>>> string = " dnfhndkn data: ndknfdjoj pop"
>>> res = re.search(pat,string)
>>> print res.group()
data: ndknfdjoj pop
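
If only the text after the marker is wanted, a capturing group can pick it out. The following is a sketch under a few assumptions that are not in the question: the helper name text_after_data is made up, the payload may span several lines (hence re.DOTALL), and 'data:' should not match inside a longer word (hence \b).

```python
import re

# \b is the word-boundary escape; "/b" would match a literal slash and "b".
# re.DOTALL lets "." cross newlines, in case the payload spans several lines.
DATA_RE = re.compile(r"\bdata:(.*)", re.DOTALL)

def text_after_data(packet):
    """Return the text following the first 'data:' marker, or None."""
    found = DATA_RE.search(packet)
    return found.group(1) if found else None

print(repr(text_after_data(" dnfhndkn data: ndknfdjoj pop")))
```

Returning None when the marker is missing avoids calling .group() on a failed search, which would raise AttributeError.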

No need for a regular expression here. You can just slice the string:

>>> string
' dnfhndkn data: ndknfdjoj pop'
>>> string.index('data')
10
>>> string[string.index('data'):]
'data: ndknfdjoj pop'

str.index('data') returns the position in the string where the substring data is first found. The slice from that position to the end, string[10:], gives you the part of the string you are interested in.
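
One caveat: str.index raises ValueError when the substring is not present. If some packets may lack the marker, str.find returns -1 instead, which is easy to check. A minimal sketch follows; the function name from_marker and the choice of an empty string as the fallback are assumptions made for illustration.

```python
def from_marker(text, marker="data:"):
    """Return text from the first marker to the end, or '' if absent."""
    start = text.find(marker)  # find() returns -1 instead of raising
    if start == -1:
        return ""
    return text[start:]

print(from_marker(" dnfhndkn data: ndknfdjoj pop"))
```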

By the way, string is a potentially problematic variable name if you are planning on using the string module at any point...

You can just do:

string.split("data:")[1]

This assumes "data:" appears exactly once in each string. Note that it returns only the text after "data:", not the marker itself.
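
When that assumption may not hold, str.partition is a possible alternative: it splits only at the first occurrence and does not raise if the marker is missing, whereas split(...)[1] raises IndexError in that case. This is a sketch, and the function name after_marker is hypothetical.

```python
def after_marker(text, marker="data:"):
    """Return everything after the first marker, or None if it is absent."""
    before, sep, after = text.partition(marker)
    # sep is the empty string when the marker was not found.
    return after if sep else None

print(repr(after_marker(" dnfhndkn data: ndknfdjoj pop")))
```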
