In a Python regex matchobj, how do I do a substring match and assign the match to a variable?

Given this re.sub call and 'replace' function (thanks, Ignacio, for the pointer!), I am able to replace all matches in my very long text blob with the string '* NONSENSE *'. So far, so good!

Along the way, I'd like to find the substring within the matchobj, calling it 'findkey', so I can do additional work with it...

How do I do this?

data = re.sub('(:::[A-Z,a-z,:]+:::)', replace, data)

def replace(matchobj):
 if matchobj.group(0) != '':

  # this seems to work:
  tag = matchobj.group(1)

  # but this doesn't:
  findkey = re.search(':::([A-Z,a-z]+):::', tag)

  return '********************  NONSENSE  ********************'

 else:
  return ''

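Are you looking for this?

findkey = re.search(':::([A-Z,a-z]+):::', tag).group()

Note the group() call: re.search returns a match object rather than a string, and group() with no argument returns the whole matched text, while group(1) returns only the part inside the parentheses. This document can also help.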
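To make the difference concrete, here is a minimal sketch in Python 3. The sample value of tag is an assumption for illustration (it stands in for whatever matchobj.group(1) holds in the question), and the pattern is the one from the question, unchanged.

```python
import re

# Hypothetical sample value; in the question this comes from matchobj.group(1)
tag = ':::BLAH:::'

# re.search returns a match object, or None when nothing matches
match = re.search(r':::([A-Z,a-z]+):::', tag)

if match is not None:
    whole = match.group()      # entire matched text, delimiters included
    findkey = match.group(1)   # only the text captured by the parentheses
else:
    whole = findkey = None

print(whole)    # :::BLAH:::
print(findkey)  # BLAH
```

Checking for None first avoids an AttributeError when the pattern does not match.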

Try this. You can match the inner part as part of the initial sub call.

import re

data = ":::::::::::BLAH:::::::::, ::::::::MORE:::::::"

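def replace(matchobj):
  # group(0) is the whole match, including the ':::' delimiters
  tag = matchobj.group(0)
  # group(1) is the captured inner text, also available as group('inner')
  findkey = matchobj.group(1)

  print(findkey)

  return '********************  NONSENSE  ********************'


# note: the comma inside [...] is a literal, so this class also matches ','
data = re.sub(r':::(?P<inner>[A-Z,a-z]+):::', replace, data)

print(data)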

This prints the following:

BLAH
MORE
::::::::********************  NONSENSE  ********************::::::, :::::********************  NONSENSE  ********************::::
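If the goal is to do additional work with each key after the substitution, one option is to collect the keys inside the callback. The sketch below is only an illustration in Python 3: the found_keys list and the shortened replacement string are hypothetical names and values, not part of the original answer.

```python
import re

data = ":::::::::::BLAH:::::::::, ::::::::MORE:::::::"

# Hypothetical container for the keys seen during substitution
found_keys = []

def replace(matchobj):
    # Read the captured text through the named group 'inner'
    findkey = matchobj.group('inner')
    found_keys.append(findkey)
    return '*** NONSENSE ***'

result = re.sub(r':::(?P<inner>[A-Z,a-z]+):::', replace, data)

print(found_keys)
print(result)
```

After the call, found_keys holds the keys in the order they were matched, so they can be processed separately from the rewritten text.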
