
regex to highlight new line characters in the beginning and end

I am trying to figure out how to write a simple regex that highlights newline characters only if they appear at the beginning or end of some data, while preserving the newline.

In the example below, line 1 and line 14 are both newlines. Those are the only two lines I am trying to highlight, as they appear at the beginning and end of the data.


import regex as re
from colorama import Fore, Back

def red(s):
    return Back.RED + s + Back.RESET

with open('/tmp/1.py', 'r') as f:
    data = f.read()

print(
    re.sub(r'(^\n|\n$)', red(r'\1'), data)
)

In the open call, data has the same content as the example posted above.

With the code above, this is the result I am getting:

[screenshot of the output]

As one can see, the red highlight is missing on line 1 and spans the full width of line 14. What I would like is for the color to appear only once per newline character.

You can actually use your regex, but without the "multiline" flag. Then it will treat the whole string as one unit, and you will match exactly what you want.

^\n|\n$

Here you can see that there are two matches. If you delete the newlines at the front or at the end, the matches disappear. The multiline flag is set or disabled at the end of the regex line. You could do that in your language too.

https://regex101.com/r/pSRHPU/2
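
For Python specifically, the effect of the flag can be sketched as follows. This is only an illustration under assumptions: it uses the standard-library re module rather than the regex package from the question, and the sample string is made up.

```python
import re

# Hypothetical sample data: a newline at the start, a blank line in the
# middle, and a trailing newline at the end.
data = "\nfoo\n\nbar\n"

pattern = r"^\n|\n$"

# Default mode: ^ anchors at the start of the string, $ at the end
# (or just before a final newline).
default_spans = [m.span() for m in re.finditer(pattern, data)]

# Multiline mode: ^ and $ also anchor at every line boundary, so the
# newlines around the blank line in the middle match as well.
multiline_spans = [m.span() for m in re.finditer(pattern, data, re.MULTILINE)]

print(default_spans)
print(multiline_spans)
```

Without the flag only the first and last newline are reported; with re.MULTILINE the newlines in the middle of the data are reported too.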

After reading all the comments and suggestions, and combining a subset of them, I finally have a working version. For anyone who is interested:

One issue I cannot overcome without writing an OS-specific check is that an extra newline is added on Windows.

A couple of highlights that were picked up:

  • cannot color a \n, so replace it with a space and a newline.
  • have not tested this, but by getting rid of the group replacement, it may be possible to apply this to bytes as well.
  • Windows support can be attained with init in colorama.

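import regex as re  # third-party package; the standard re module accepts the same patterns
from colorama import Back, init

init()  # enables ANSI colors on Windows

def red(s):
    # Wrap s in a red background, then reset the background color.
    return Back.RED + s + Back.RESET

with open('/tmp/1.py', 'r') as f:
    data = f.read()

# A bare newline cannot be colored, so put a red space next to it instead.
# Raw strings keep \A and \Z from being treated as string escapes.
first_line = re.sub(r'\A\n', red(' ') + '\n', data)
last_line = re.sub(r'\n\Z', '\n' + red(' '), first_line)
print(last_line)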

OSX/Linux

[screenshot of the output]

Windows

[screenshot of the output]
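
The untested idea about bytes from the list of highlights could look roughly like the sketch below. It rests on several assumptions: it uses the standard re module, it writes the ANSI escape sequences out by hand instead of importing colorama (they are the red-background and reset-background codes), and highlight_edges is a hypothetical helper name.

```python
import re

# ANSI sequences for a red background and for resetting the background,
# written out as bytes (assumed equivalents of Back.RED / Back.RESET).
RED_BG = b"\x1b[41m"
RESET_BG = b"\x1b[49m"


def highlight_edges(data: bytes) -> bytes:
    """Mark a leading and a trailing newline with a red space (hypothetical helper)."""
    marker = RED_BG + b" " + RESET_BG
    # A bytes pattern needs bytes input and a bytes replacement.
    data = re.sub(rb"\A\n", marker + b"\n", data)
    data = re.sub(rb"\n\Z", b"\n" + marker, data)
    return data


sample = b"\nfoo\nbar\n"
result = highlight_edges(sample)
print(result)
```

Because no group reference is used in the replacement, nothing has to be decoded to str along the way.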

I found a way that seems to allow you to match the start/end of the whole string. See the "Permanent Start of String and End of String Anchors" section of https://www.regular-expressions.info/anchors.html

\A only ever matches at the start of the string. Likewise, \Z only ever matches at the end of the string.

I created a demo here: https://regex101.com/r/n2DAWh/1

Regex is: (\A\n|\n\Z)
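
A short Python sketch of why the permanent anchors can matter even without the multiline flag: in Python's re module, $ also matches just before a newline at the very end of the string, while \Z matches only at the true end. The sample strings are invented for illustration.

```python
import re

# Hypothetical data ending in two newlines.
data = "foo\n\n"

# $ matches at the end and just before the final newline,
# so both trailing newlines are found.
dollar_spans = [m.span() for m in re.finditer(r"\n$", data)]

# \Z matches only at the very end, so only the last newline is found.
z_spans = [m.span() for m in re.finditer(r"\n\Z", data)]

# \A and \Z ignore re.MULTILINE, unlike ^ and $.
anchored_spans = [
    m.span()
    for m in re.finditer(r"\A\n|\n\Z", "\nfoo\n\nbar\n", re.MULTILINE)
]

print(dollar_spans, z_spans, anchored_spans)
```

With \A and \Z the pattern keeps matching only the first and last newline, whatever flags are in effect.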
