
Why does this regex stop capturing at the first space?

import re

text = """group1{
    element : 1
    element : 2
}
"""
for space, char in re.findall(r'^((?:\s{4})*)([^\s]+)', text, re.MULTILINE):
    print(space, char)

This code produces this result:

 group1{
     element
     element
 }

I want to get this result:

 group1{
     element : 1
     element : 2
 }

Please help me: why does this regex stop capturing at the space?

Your second group pattern, ([^\s]+), is equivalent to (\S+) and matches one or more non-whitespace chars, so the match stops at the first space. Replace it with (\S.*) to capture a non-whitespace char followed by any zero or more chars other than line break chars, i.e. the rest of the line.

Also, \s matches line breaks, so to get the proper indentation you need to exclude at least \n from \s: replace \s{4} with [^\S\n]{4}.
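To illustrate the line-break point, here is a minimal sketch. The input string sample is a made-up example, not from the question: four empty lines followed by an unindented word. With \s{4}, the indentation group can swallow the newlines; with [^\S\n]{4} it cannot.

```python
import re

# Hypothetical input: four empty lines, then an unindented word.
sample = "\n\n\n\nitem"

# \s also matches "\n", so the four newlines are taken as one indentation unit.
loose = re.findall(r'^((?:\s{4})*)(\S+)', sample, re.MULTILINE)

# [^\S\n] is "whitespace except line feed", so the indentation stays on one line.
strict = re.findall(r'^((?:[^\S\n]{4})*)(\S+)', sample, re.MULTILINE)

print(loose)   # the first group holds the four newlines
print(strict)  # the first group is empty
```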

You need to use

^((?:[^\S\n]{4})*)(\S.*)

See the regex demo and the Python demo.

Details :

  • ^ - start of a line
  • ((?:[^\S\n]{4})*) - Group 1: any zero or more occurrences of four whitespace chars excluding a newline (line feed) char
  • (\S.*) - Group 2: a non-whitespace and then any zero or more chars other than line break chars as many as possible.
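Putting the pieces above together, the corrected pattern can be tried on the text from the question. This is a small sketch; the names pattern, rows, indent and content are chosen here for illustration.

```python
import re

text = """group1{
    element : 1
    element : 2
}
"""

# Group 1: indentation in units of four non-newline whitespace chars.
# Group 2: the rest of the line, starting at the first non-whitespace char.
pattern = re.compile(r'^((?:[^\S\n]{4})*)(\S.*)', re.MULTILINE)

rows = pattern.findall(text)
for indent, content in rows:
    # len(indent) // 4 gives the nesting depth of the line.
    print(len(indent) // 4, indent + content)
```

Each tuple in rows now carries the full line content, so "element : 1" is kept whole instead of being cut at the first space.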

