
Regex to highlight newline characters at the beginning and end

I am trying to figure out how to write a simple regex that would highlight newline characters only if they appear at the beginning or end of some data while preserving the newline.

In the below example, line 1 and line 14 both are new lines. Those are the only two lines I am trying to highlight as they appear at the beginning and end of the data.


import regex as re
from colorama import Fore, Back

def red(s):
    return Back.RED + s + Back.RESET

with open('/tmp/1.py', 'r') as f:
    data = f.read()

print(
    re.sub(r'(^\n|\n$)', red(r'\1'), data)
)

In the open expression, data is the same content as the example posted above.

In the above example, this is the result I am getting:

[screenshot]

As one can see, the red highlight is missing on line 1 and is spanning all the way in line 14. What I would like is for the color to appear only once per new line character.

You can actually use your regex, but without the "multiline" flag. Then ^ and $ anchor to the start and end of the whole string rather than to each line, and you get exactly the matches you want.

^\n|\n$

Here you can see that there are two matches. If you delete the newlines at the start or at the end, the matches disappear. On regex101, the multiline flag is switched on or off in the flags field at the end of the regex line. You can do the same in your language.

https://regex101.com/r/pSRHPU/2
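To illustrate the effect of the flag in Python, here is a minimal sketch. It assumes the standard re module (the question imports the third-party regex module instead) and a made-up sample string with a blank line in the middle; neither comes from the original post.

```python
import re

# Hypothetical sample: leading newline, a blank line in the middle, trailing newline.
data = "\nfirst\n\nsecond\n"
pattern = r"^\n|\n$"

# Without re.MULTILINE, ^ matches only at the very start of the string.
single = [m.span() for m in re.finditer(pattern, data)]

# With re.MULTILINE, ^ and $ also match around every line break,
# so the blank line in the middle is picked up as well.
multi = [m.span() for m in re.finditer(pattern, data, flags=re.MULTILINE)]

print(single)  # [(0, 1), (14, 15)]
print(multi)   # [(0, 1), (6, 7), (7, 8), (14, 15)]
```

Only the first and last newline are matched when the flag is off, which is the behaviour the question asks for.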

After reading all the comments, and suggestions, and combining a subset of them all, I finally have a working version. For anyone that is interested:

One issue I cannot overcome without writing an OS-specific check is that an extra newline is added on Windows.

A couple of highlights that were picked up:

  • A \n cannot be colored on its own, so replace it with a colored space plus the newline.
  • I have not tested this, but dropping the group replacement may make it possible to apply this to bytes as well.
  • Windows support can be enabled by calling init from colorama.

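import regex as re  # third-party "regex" module; the standard "re" also works here
from colorama import Back, init

init()  # enable ANSI colors on Windows

def red(s):
    # Wrap s in a red background, then reset the background color.
    return Back.RED + s + Back.RESET

with open('/tmp/1.py', 'r') as f:
    data = f.read()

# A newline cannot be colored, so mark it with a red space next to it.
first_line = re.sub(r'\A\n', red(' ') + '\n', data)  # newline at the very start
last_line = re.sub(r'\n\Z', '\n' + red(' '), first_line)  # newline at the very end
print(last_line)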

OSX/Linux

[screenshot]

Windows

[screenshot]
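The bytes idea mentioned in the list above was not tested in the original answer. A minimal sketch of how it might look follows. It assumes the standard re module and hard-codes the ANSI sequences that colorama's Back.RED and Back.RESET expand to; the function name highlight_edges is hypothetical.

```python
import re

# ANSI escape sequences for "red background" and "default background",
# written out as bytes (assumed equivalents of Back.RED / Back.RESET).
RED_BG = b"\x1b[41m"
RESET_BG = b"\x1b[49m"

def highlight_edges(data: bytes) -> bytes:
    """Mark a newline at the very start or very end of data with a red space."""
    marker = RED_BG + b" " + RESET_BG
    # Callables are used as replacements so the marker is inserted literally,
    # without any backslash-template processing.
    data = re.sub(rb"\A\n", lambda m: marker + b"\n", data)
    data = re.sub(rb"\n\Z", lambda m: b"\n" + marker, data)
    return data

highlighted = highlight_edges(b"\nabc\n")
untouched = highlight_edges(b"abc")
```

Since no capture group is involved, the same two substitutions work on bytes once the patterns and replacements are bytes objects.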

I found a way that seems to allow you to match the start/end of the whole string. See the "Permanent Start of String and End of String Anchors" part from https://www.regular-expressions.info/anchors.html

\A only ever matches at the start of the string. Likewise, \Z only ever matches at the end of the string.

I created a demo here https://regex101.com/r/n2DAWh/1

Regex is: (\A\n|\n\Z)
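One practical difference between the two spellings shows up in Python when the data ends with more than one newline. The sketch below assumes the standard re module and a made-up sample string; it compares ^/$ (without the multiline flag) against \A/\Z.

```python
import re

# Hypothetical sample that starts and ends with two newlines.
data = "\n\nbody\n\n"

# In Python, $ matches at the end of the string and also just before
# a trailing newline, so both final newlines are matched.
dollar = [m.span() for m in re.finditer(r"^\n|\n$", data)]

# \A and \Z match only at the absolute start and end of the string.
anchored = [m.span() for m in re.finditer(r"\A\n|\n\Z", data)]

print(dollar)    # [(0, 1), (6, 7), (7, 8)]
print(anchored)  # [(0, 1), (7, 8)]
```

With \A and \Z, at most one newline is matched at each end, regardless of flags.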

