Get Substring from the String based on the matching

I have strings like the following:

string1 = "1/1/0/A1,A2:admin-status=up,id=admin-up"
string2 = "1/1/0/A1,A2:id:admin-up,admin-status=up"
string3 = "1/1/0/A1,A2:id=admin-down:admin-status=up"

The value I want to extract from each string is:

string1 -> admin-up
string2 -> admin-up
string3 -> admin-down

My keyword in the string is "id". Using this keyword, I need to retrieve the value that belongs to it, for example "admin-up". The keyword is always followed by a special character, either ":" or "=", and the value I want is whatever comes after that character.

You can do this with the built-in re library for regular expressions:

>>> import re
>>> strings = [ 
...     '1/1/0/A1,A2:admin-status=up,id=admin-up', 
...     '1/1/0/A1,A2:id:admin-up,admin-status=up', 
...     '1/1/0/A1,A2:id=admin-down:admin-status=up' 
... ]
>>> [re.search(r'id[:=]([^,:]+)', s).group(1) for s in strings if re.search(r'id[:=].+', s)]
['admin-up', 'admin-up', 'admin-down']

This will include IDs that don't have a hyphen. The Regex breaks down like this:

id        look for the literal string "id"
[:=]      followed by either an = or a :
(         open a capture group for what follows
[^,:]+    capture as many characters that are not , or : as you can
)         close the capture group
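The same breakdown can be written into the pattern itself with the re.VERBOSE flag, which lets a regex carry its own comments. This is only a sketch; the name ID_PATTERN is made up for the illustration and is not part of the original answer.

```python
import re

# ID_PATTERN is a hypothetical name chosen for this example.
ID_PATTERN = re.compile(r"""
    id          # the literal text id
    [:=]        # followed by either ':' or '='
    (           # open the capture group
      [^,:]+    # one or more characters that are not ',' or ':'
    )           # close the capture group
""", re.VERBOSE)

value = ID_PATTERN.search("1/1/0/A1,A2:id:admin-up,admin-status=up").group(1)
print(value)  # admin-up
```

In verbose mode, whitespace and # comments outside character classes are ignored, so the pattern behaves the same as the one-line version.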

A simpler version of the regex is used to filter out strings that don't match at all. In function form:

>>> def get_id(log):
...     match = re.search(r'id[:=]([^,:]+)', log)
...     if not match:
...         return None
...     return match.group(1)
... 
>>> get_id('1/1/0/A1,A2:admin-status=up,id=admin-up')
'admin-up'
>>> get_id('1/1/0/A1,A2:id:admin-up,admin-status=up')
'admin-up'
>>> get_id('1/1/0/A1,A2:id=admin-down:admin-status=up')
'admin-down'
>>> get_id('no id found here')
>>> 
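One caveat worth checking against your real data: the pattern also matches "id" at the end of a longer key such as "valid". If that can happen in your logs (an assumption, since the question's examples do not show it), a word boundary in front of "id" avoids it. The name get_id_strict and the sample string are hypothetical.

```python
import re

def get_id_strict(log):
    # \b requires "id" to start at a word boundary, so "valid=" is skipped.
    match = re.search(r'\bid[:=]([^,:]+)', log)
    return match.group(1) if match else None

# Hypothetical input where "id" also appears inside another key.
sample = 'valid=yes,id=admin-up'

loose = re.search(r'id[:=]([^,:]+)', sample).group(1)
strict = get_id_strict(sample)
print(loose)   # yes
print(strict)  # admin-up
```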

You can use the following pattern and function to get the group

import re

string1 = "1/1/0/A1,A2:admin-status=up,id=admin-up"
string2 = "1/1/0/A1,A2:id:admin-up,admin-status=up"
string3 = "1/1/0/A1,A2:id=admin-down:admin-status=up"

regex = r'id[:=](\w+-\w+)'

string1_id = re.search(regex, string1).group(1)
string2_id = re.search(regex, string2).group(1)
string3_id = re.search(regex, string3).group(1)

EDIT: Let me explain this solution. The OP wants to extract the word or phrase after the keyword id, which is followed by either : or = . The regex chosen is r'id[:=](\w+-\w+)' . It finds a substring that starts with the literal id, followed by one of the characters in [:=] , and captures what comes next with (\w+-\w+) . The parentheses mark a group, which is the part of interest here. \w+ matches one or more word characters (letters, digits or underscore), so the group is a word, a dash - , and another word. Because the dash is required, a value without a hyphen will not match this pattern.

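The call re.search(regex, string1).group(1) searches string1 for the regex and returns the first capture group, group(1) . Note that group(2) would refer to a second pair of parentheses in the pattern, not to a second match; this pattern has only one group, and re.search only ever returns the first match. If a string can contain several matches and you want all of them, use re.findall or re.finditer instead.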
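To make the difference between groups and matches concrete, here is a small sketch. The input with two id fields is made up for the example; the question's strings only contain one each.

```python
import re

# Hypothetical input containing two id fields.
text = "1/1/0/A1,A2:id=admin-up,id:admin-down"

# findall returns the captured group of every match, in order.
all_ids = re.findall(r"id[:=]([^,:]+)", text)
print(all_ids)  # ['admin-up', 'admin-down']

# group(2) only exists when the pattern has a second pair of parentheses.
m = re.search(r"(id)[:=]([^,:]+)", text)
print(m.group(1), m.group(2))  # id admin-up
```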

print(string1_id)
print(string2_id)
print(string3_id)

Output:

admin-up
admin-up
admin-down
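Keep in mind that re.search returns None when nothing matches, so calling .group(1) directly raises AttributeError on such input. A minimal guard might look like the following; extract_id and its default parameter are names invented for this sketch.

```python
import re

regex = r'id[:=](\w+-\w+)'

def extract_id(text, default=None):
    # Return the captured value, or `default` when the pattern is absent.
    match = re.search(regex, text)
    return match.group(1) if match else default

found = extract_id("1/1/0/A1,A2:id=admin-down:admin-status=up")
missing = extract_id("1/1/0/A1,A2:id=up")  # no hyphen, so no match
print(found)    # admin-down
print(missing)  # None
```

The second call also shows the limitation of \w+-\w+ : a value without a hyphen is not matched at all.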
