Split a string and replace Discord emojis with [name]

Question

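I receive incoming messages such as <a:GG:123456789> <:1Copy:12345678><:14:1256678>:eyes:Hello friend!:eyes: and I want the output to be [GG] [1Copy][14][eyes]Hello friend![eyes]

The code below is what I currently have, and it only partly works. For the example message above it outputs [GG] [1Copy] [14] [eyes]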

import re

def shorten_emojis(content):
    # Prefixes that open an animated (<a:) or a static (<:) custom emoji.
    separators = ("<a:", "<:")

    output = []

    for chunk in content.split():
        if any(match in chunk for match in separators):
            parsed_chunk = []

            # Put ";" around every <...> tag so adjacent emojis can be split apart.
            new_chunk = chunk.replace("<", ";<").replace(">", ">;")

            for emo in new_chunk.split(";"):
                if emo.startswith(separators):
                    # "<a:name:id>" splits into three parts; keep only the name.
                    emo = f"[{splits[1]}]" if len(splits := emo.split(":")) == 3 else emo

                parsed_chunk.append(emo)

            chunk = "".join(parsed_chunk)

        output.append(chunk)

    output = " ".join(output)

    # Replace the remaining :name: shortcodes, e.g. :eyes: -> [eyes].
    for e in re.findall(":.+?:", content):
        output = output.replace(e, f"[{e.replace(':', '')}]")

    return output

Test #1

Input: <a:GG:123456789> <:1Copy:12345678><:14:1256678>:eyes:Hello friend!:eyes:

Output: [GG] [1Copy] [14]:eyes:Hello friend!:eyes:

Desired: [GG] [1Copy][14][eyes]Hello friend![eyes]

Test #2

Input: <a:cryLaptop:738450655395446814><:1Copy:817543814481707030><:14:817543815401439232> <:thoonk:621279654711656448><:coolbutdepressed:621279653675532290><:KL1Heart:585547199480332318>Nice<:dogwonder:621251869058269185> OK:eyes:

Output: [cryLaptop] [1Copy] [14] [thoonk] [coolbutdepressed] [KL1Heart] Nice [dogwonder] OK:eyes:

Desired: [cryLaptop] [GG] [1Copy] [14] [thoonk] [coolbutdepressed] [KL1Heart] Nice [dogwonder] OK[eyes]

Edit

I have edited my code block, and it now works as needed.

Answer 1

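You can do this with regular expressions. The re module ships with Python itself, so there is nothing to install.

I modified the code a bit to make it more compact, but I think it is just as easy to follow.

The key step is detecting three kinds of pieces: (<.*?>) selects the <word> tags, (:.*?:) selects the :word: shortcodes, and (.*?) selects the rest of the text.

Then each piece is formatted with the expected value and appended to the output.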
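As a rough illustration of the three kinds of pieces described above, here is a minimal sketch of a tokenizer. It is not the code from this answer: the names TOKEN and tokenize are hypothetical, and the catch-all (.*?) is swapped for [^<:]+ plus a single-character fallback so that plain text comes back in whole runs.

```python
import re

# Hypothetical tokenizer pattern (an assumption for illustration only):
#   <.*?>    a custom emoji tag such as <a:GG:123456789>
#   :.*?:    a shortcode such as :eyes:
#   [^<:]+   a run of ordinary text
#   .        fallback for a stray "<" or ":" that starts no token
TOKEN = re.compile(r"<.*?>|:.*?:|[^<:]+|.")

def tokenize(content):
    """Split a message into emoji tags, shortcodes and plain-text runs."""
    return TOKEN.findall(content)

print(tokenize("<a:GG:123456789> <:14:1256678>:eyes:Hi!:eyes:"))
# ['<a:GG:123456789>', ' ', '<:14:1256678>', ':eyes:', 'Hi!', ':eyes:']
```

Joining the pieces back together reproduces the original message, so formatting each piece in turn loses no text.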

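import re

def shorten_emojis(content):
    # Each result is a tuple; tag[0] holds the whole piece: a <...> tag,
    # a :name: shortcode, or a bit of ordinary text.
    tags = re.findall(r'((<.*?>)|(:.*?:)|(.*?))', content)
    output = ""
    for tag in tags:
        if re.findall(r"<.*?>", tag[0]):
            # Custom emoji: the name sits between the first two colons.
            valor = re.search(r':.*?:', tag[0])
            output += f"[{valor.group()[1:-1]}]"
        elif re.match(r":.*?:", tag[0]):
            # Shortcode: strip the surrounding colons.
            output += f"[{tag[0][1:-1]}]"
        else:
            # Ordinary text passes through unchanged.
            output += f"{tag[0]}"

    return output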


print(shorten_emojis("<a:GG:123456789> <:1Copy:12345678><:14:1256678>:eyes:Hello friend!:eyes:"))
print(shorten_emojis("<a:cryLaptop:738450655395446814><:1Copy:817543814481707030><:14:817543815401439232> <:thoonk:621279654711656448><:coolbutdepressed:621279653675532290><:KL1Heart:585547199480332318>Nice<:dogwonder:621251869058269185> OK:eyes:"))

Result:

[GG] [1Copy][14][eyes]Hello friend![eyes]
[cryLaptop][1Copy][14] [thoonk][coolbutdepressed][KL1Heart]Nice[dogwonder] OK[eyes]

Answer 2

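You could use a single pattern with an alternation | to match both variants. Then, in the callback passed to re.sub, you can check whether group 1 is present.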

<a?:([^:<>]+)[^<>]*>|:([^:]+):

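The pattern matches:

<a?: matches <, an optional a, and :
([^:<>]+) captures in group 1 one or more characters other than :, < and >
[^<>]*> optionally matches any characters other than < and >, then matches >
| or
:([^:]+): captures in group 2 everything between the two colons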
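
To make the group logic concrete, the sketch below lists what each match captures. The helper name describe_matches is made up for this illustration; the pattern is the one shown above.

```python
import re

pattern = re.compile(r"<a?:([^:<>]+)[^<>]*>|:([^:]+):")

def describe_matches(content):
    """Return (matched text, group 1, group 2) for every match in content."""
    return [(m.group(0), m.group(1), m.group(2)) for m in pattern.finditer(content)]

for row in describe_matches("<a:GG:123456789> <:14:1256678>:eyes:"):
    print(row)
# ('<a:GG:123456789>', 'GG', None)
# ('<:14:1256678>', '14', None)
# (':eyes:', None, 'eyes')
```

Only one of the two groups is filled for any given match, which is why the callback can decide between them by testing group 1.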

See the regex demo and the Python demo.

For example

import re

pattern = r"<a?:([^:<>]+)[^<>]*>|:([^:]+):"
def shorten_emojis(content):
    return re.sub(
        pattern, lambda x: f"[{x.group(1)}]" if x.group(1) else f"[{x.group(2)}]"
        ,content
    )

print(shorten_emojis("<a:GG:123456789> <:1Copy:12345678><:14:1256678>:eyes:Hello friend!:eyes:"))
print(shorten_emojis("<a:cryLaptop:738450655395446814><:1Copy:817543814481707030><:14:817543815401439232> <:thoonk:621279654711656448><:coolbutdepressed:621279653675532290><:KL1Heart:585547199480332318>Nice<:dogwonder:621251869058269185> OK:eyes:"))

Output

[GG] [1Copy][14][eyes]Hello friend![eyes]
[cryLaptop][1Copy][14] [thoonk][coolbutdepressed][KL1Heart]Nice[dogwonder] OK[eyes]
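
One caveat: :([^:]+): also matches ordinary text that happens to sit between two colons, spaces included, so "Note: see this: ok" would become "Note[ see this] ok". If that matters, a stricter variant might look like the sketch below. The character classes are my assumptions, not something stated in the question: names made of letters, digits and underscores (plus + and - for shortcodes), and an all-digit ID. The name shorten_emojis_strict is hypothetical.

```python
import re

# Assumed shapes (not confirmed by the question):
#   <a?:name:id>  name = word characters, id = digits
#   :name:        name = word characters, "+" or "-", no spaces
strict_pattern = re.compile(r"<a?:(\w+):\d+>|:([\w+-]+):")

def shorten_emojis_strict(content):
    # Exactly one group is set per match, so "or" picks the captured name.
    return strict_pattern.sub(lambda m: f"[{m.group(1) or m.group(2)}]", content)

print(shorten_emojis_strict("<a:GG:123456789> <:1Copy:12345678><:14:1256678>:eyes:Hello friend!:eyes:"))
# [GG] [1Copy][14][eyes]Hello friend![eyes]
print(shorten_emojis_strict("Note: see this: ok"))
# Note: see this: ok
```

This still rewrites colon-delimited text without spaces, so treat it as a starting point and adjust the classes to the messages you actually receive.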

2 answers

Answer 1
1 2021-03-11 14:23:33

Answer 2
1 accepted 2021-03-11 17:05:04
