How can I use regex to get the string between two characters and remove certain characters from within that string

Question

I have a long string that I want to filter using regex:

<@961483653468439706> Text to remove, this text is useless, that's why i want it gone!
i want this: `keep the letters and spaces`

I want to keep the text in between the ` characters.

The only issue is that between every character in the part of the string I want, there is an invisible character. You can see the invisible characters on regex101: https://regex101.com/r/rAYrMT/1

`([\'^\w]*)`

So in short: keep everything between the ` characters except for the invisible characters, info on which can be found here: https://apps.timwhitlock.info/unicode/inspect?s=%EF%BB%BF
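For reference, the invisible characters can also be revealed directly in Python rather than on an external site. The following is only a minimal sketch: the helper name `invisible_chars` and the shortened sample string are hypothetical, and `str.isprintable()` is used as a rough test, so it also reports ordinary control characters such as newlines.

```python
import unicodedata

# Hypothetical shortened sample: U+FEFF sits between the letters of "keep".
s = "`k\ufeffe\ufeffe\ufeffp`"

def invisible_chars(text):
    """Return (index, code point, Unicode name) for each non-printable character."""
    return [
        (i, f"U+{ord(c):04X}", unicodedata.name(c, "<unnamed>"))
        for i, c in enumerate(text)
        if not c.isprintable()
    ]

for entry in invisible_chars(s):
    print(entry)
```

For the sample above, every reported entry is U+FEFF (ZERO WIDTH NO-BREAK SPACE), which matches the character shown at the inspection link.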

Answer 1

You can filter the non-printable characters out:

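import re
from string import printable

# the invisible characters (U+FEFF) are written as explicit escapes here
# so that they survive copy and paste

s = '''<@961483653468439706> Text to remove, this text is useless, that's why i want it gone!
Type `k\ufeffe\ufeffe\ufeffp\ufeff \ufefft\ufeffh\ufeffe\ufeff \ufeffl\ufeffe\ufefft\ufefft\ufeffe\ufeffr\ufeffs a\ufeffn\ufeffd s\ufeffp\ufeffa\ufeffc\ufeffe\ufeffs` and `this too`'''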

for m in re.findall(r'`([^`]*)`', s):
    print(repr(m))
    print(''.join([c for c in m if c in printable]))
    print()

Prints:

'k\ufeffe\ufeffe\ufeffp\ufeff \ufefft\ufeffh\ufeffe\ufeff \ufeffl\ufeffe\ufefft\ufefft\ufeffe\ufeffr\ufeffs a\ufeffn\ufeffd s\ufeffp\ufeffa\ufeffc\ufeffe\ufeffs'
keep the letters and spaces

'this too'
this too
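One caveat with filtering on `string.printable`: that constant contains only ASCII characters, so accented letters and other non-ASCII text between the backticks are dropped as well. If that matters, a possible variation is to remove only characters in the Unicode "Cf" (format) category, which includes U+FEFF. This is a sketch under that assumption; the function name `extract_backticked` and the sample string are made up for illustration.

```python
import re
import unicodedata

def extract_backticked(text):
    """Return the text between each pair of backticks, minus format characters."""
    results = []
    for m in re.findall(r'`([^`]*)`', text):
        # Drop only characters in Unicode category "Cf" (format), e.g. U+FEFF;
        # letters such as "é" are kept.
        results.append(''.join(c for c in m if unicodedata.category(c) != 'Cf'))
    return results

# Hypothetical sample with U+FEFF between letters and one non-ASCII letter.
sample = 'junk `k\ufeffe\ufeffe\ufeffp caf\u00e9` more junk `this too`'
print(extract_backticked(sample))
```

With the `printable` filter, the same sample would lose the "é" in the first match.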

Answer 2

You don't need to use regex for this:

text = "<@961483653468439706> Text to remove, this text is useless, that's " \
       "why i want it gone!Type `keep the letters and spaces`"

# the invisible character (U+FEFF) is written as an escape here because the
# literal character does not show up in this post.
filtered = text.replace('\ufeff', '')
# because the passage you want is always between ``, you can split on ` and know that
# every second item in the list that split returns, starting from the second one,
# must be what you are looking for.
passage = filtered.split('`')[1::2]

print(passage)
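If more than one kind of invisible character may be present, a variation on the split approach is to delete them all in one pass with `str.translate`. This is only a sketch: the set of code points in `INVISIBLE` and the function name `backticked_parts` are assumptions for illustration, not something taken from the question.

```python
# Hypothetical set of invisible code points to delete; extend as needed.
# Mapping a code point to None tells str.translate to remove it.
INVISIBLE = dict.fromkeys(map(ord, '\ufeff\u200b\u200c\u200d'))

def backticked_parts(text):
    """Strip the invisible characters, then return the parts between backticks."""
    cleaned = text.translate(INVISIBLE)
    # Items at odd indices of the split are the ones inside a pair of backticks.
    return cleaned.split('`')[1::2]

print(backticked_parts('a `k\ufeffe\u200be\ufeffp` b `this too`'))
```

Note that this relies on the backticks being balanced: with an unclosed backtick, everything after it is returned as if it were a quoted passage.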

2 solutions

Solution 1
1 Accepted 2020-12-25 23:23:08

Solution 2
0 2020-12-26 10:08:23
