
How can I extract the words before a colon from a string, excluding \n, in Python using regex?

I would like to extract from a string the words that come before a colon (:), without the \n characters.

I have the following string:

enx002: connected to Wired connection 2
docker0: connected to docker0
virbr0: connected to virbr0
tun0: connected to tun0

I would like to extract the word before the colon on each line.

I used a regular expression:

pattern = '[^:-].*?:\s*'
xx = re.findall(pattern, stringfromabove)

I do get a list of strings as expected, but each one also includes the trailing ': ' and, after the first, a leading \n. For example, xx[3] is '\ntun0: '.

What I want is just 'tun0'.

Thank you for the help.

You can use

re.findall(r'^[^:-][^:]*', stringfromabove, re.M)
re.findall(r'^([^:\n-][^:\n]*):', stringfromabove, re.M) # Requires a `:` on the line; excluding `\n` keeps the match from spanning multiple lines

Details:

  • ^ - start of a line
  • [^:-] - any char but : and -
  • [^:]* - any 0 or more chars other than :
  • ^([^:\n-][^:\n]*): - at the start of a line, captures one char other than :, - and a newline, followed by any 0 or more chars other than : and a newline, and then matches a : right after (it is outside the group and is thus not returned by re.findall).
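
The difference between the two patterns only shows up when a line has no colon. Here is a minimal sketch, assuming a made-up input (the `sample` string and the `loose`/`strict` names are illustrative, not from the question):

```python
import re

# Hypothetical sample: the second line contains no colon.
sample = "tun0: connected to tun0\nlo unmanaged\neth0: connected"

# [^:]* also matches newlines, so the match starting at "lo" runs on
# into the next line until it reaches a colon.
loose = re.findall(r'^[^:-][^:]*', sample, re.M)

# Excluding \n and requiring a trailing ':' skips lines without a colon.
strict = re.findall(r'^([^:\n-][^:\n]*):', sample, re.M)

print(loose)   # ['tun0', 'lo unmanaged\neth0']
print(strict)  # ['tun0', 'eth0']
```

If every line is guaranteed to contain a colon, as in the question, both patterns return the same list.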

Python demo:

import re
rx = r"^[^:-][^:]*"
text = "enx002: connected to Wired connection 2\ndocker0: connected to docker0\nvirbr0: connected to virbr0\ntun0: connected to tun0"
print(re.findall(rx, text, re.M))
# => ['enx002', 'docker0', 'virbr0', 'tun0']
print(re.findall(r'^([^:\n-][^:\n]*):', text, re.M))
# => ['enx002', 'docker0', 'virbr0', 'tun0']
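
If a regex is not strictly required, the same result can probably be had with plain string methods. This is an alternative technique, not the approach given above, and it does not reproduce the patterns' exclusion of a leading "-"; the `names` variable is illustrative only:

```python
text = (
    "enx002: connected to Wired connection 2\n"
    "docker0: connected to docker0\n"
    "virbr0: connected to virbr0\n"
    "tun0: connected to tun0"
)

names = []
for line in text.splitlines():
    # partition splits at the first ':'; sep is '' when the line has none.
    head, sep, _ = line.partition(":")
    if sep:
        names.append(head.strip())

print(names)
# => ['enx002', 'docker0', 'virbr0', 'tun0']
```

Because splitlines() removes the line breaks before any splitting happens, no \n can end up in the results.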


 