Replace special characters from list in python

How do I replace special characters (emoticons) with a given feature?

For example:

emoticons = \
    [   ('__EMOT_SMILEY',   [':-)', ':)', '(:', '(-:', ] )  ,\
        ('__EMOT_LAUGH',        [':-D', ':D', 'X-D', 'XD', 'xD', ] )    ,\
        ('__EMOT_LOVE',     ['<3', ':\*', ] )   ,\
        ('__EMOT_WINK',     [';-)', ';)', ';-D', ';D', '(;', '(-;', ] ) ,\
        ('__EMOT_FROWN',        [':-(', ':(', ] )   ,\
        ('__EMOT_CRY',      [':,(', ':\'(', ':"(', ':(('] ) ,\
    ]

msg = 'I had a beautiful day :)'

Desired output:

>> I had a beautiful day __EMOT_SMILEY

I know how to do it with a dict, but here I have multiple values associated with each feature.

The following code will not work in this case:

for emote, replacement in emoticons.items():
  msg = msg.replace(emote, replacement)
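For reference, a short sketch of why that loop fails, using a trimmed copy of the list: `emoticons` is a list of tuples, so it has no `.items()` method, and even after `dict(emoticons)` each value is a list, which `str.replace` does not accept. The `errors` list is only there for illustration.

```python
emoticons = [('__EMOT_SMILEY', [':-)', ':)', '(:', '(-:'])]
msg = 'I had a beautiful day :)'
errors = []  # names of the exceptions raised, collected for illustration

try:
    emoticons.items()  # a list has no .items() method
except AttributeError as exc:
    errors.append(type(exc).__name__)

try:
    for name, symbols in dict(emoticons).items():
        msg.replace(name, symbols)  # second argument is a list, not a str
except TypeError as exc:
    errors.append(type(exc).__name__)

print(errors)
```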

You could use a dictionary and a regex:

import re

def replace(msg, emoticons):
    d = {r: emote for emote, replacement in emoticons for r in replacement}
    pattern = "|".join(map(re.escape, d))
    msg = re.sub(pattern, lambda match: d[match.group()], msg)
    return msg

print(replace(msg, emoticons))  # I had a beautiful day __EMOT_SMILEY
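One caveat with an alternation built this way: the regex engine tries alternatives left to right, so `:(` can match before `:((` and leave a stray `(` behind. A minimal sketch that sorts the emoticons longest-first to avoid this is below. The name `replace_longest_first` is hypothetical (not from the answer), and the list is trimmed to the entries needed for the demo.

```python
import re

emoticons = [
    ('__EMOT_SMILEY', [':-)', ':)', '(:', '(-:']),
    ('__EMOT_FROWN', [':-(', ':(']),
    ('__EMOT_CRY', [':,(', ":'(", ':"(', ':((']),
]

def replace_longest_first(msg, emoticons):
    # Map every emoticon string to its feature name.
    lookup = {symbol: name for name, symbols in emoticons for symbol in symbols}
    # Put longer emoticons first so ':((' is tried before ':('.
    ordered = sorted(lookup, key=len, reverse=True)
    pattern = re.compile("|".join(map(re.escape, ordered)))
    return pattern.sub(lambda match: lookup[match.group()], msg)

print(replace_longest_first('What a day :((', emoticons))
print(replace_longest_first('I had a beautiful day :).', emoticons))
```

Because the match is not tied to whitespace-separated words, this variant also copes with punctuation directly after an emoticon.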

This oughta do it:

emoticons = [   ('__EMOT_SMILEY',   [':-)', ':)', '(:', '(-:', ] ),
        ('__EMOT_LAUGH',    [':-D', ':D', 'X-D', 'XD', 'xD', ] ),
        ('__EMOT_LOVE',     ['<3', ':\*', ] ),
        ('__EMOT_WINK',     [';-)', ';)', ';-D', ';D', '(;', '(-;', ] ),
        ('__EMOT_FROWN',        [':-(', ':(', '(:', '(-:', ] ),
        ('__EMOT_CRY',      [':,(', ':\'(', ':"(', ':(('] )
    ]

emoticons = dict(emoticons)    
emoticons = {v: k for k in emoticons for v in emoticons[k]}

msg = 'I had a beautiful day :)'

for item in emoticons:
    if item in msg:
        msg = msg.replace(item, emoticons[item])

So, you create a dict, invert it, and replace all the emoticons that exist in the sentence.

Try this instead:

emoticons = [
    ('__EMOT_SMILEY', [':-)', ':)', '(:', '(-:',]),
    ('__EMOT_LAUGH',  [':-D', ':D', 'X-D', 'XD', 'xD',]),
    ('__EMOT_LOVE',   ['<3', ':\*',]),
    ('__EMOT_WINK',   [';-)', ';)', ';-D', ';D', '(;', '(-;',]),
    ('__EMOT_FROWN',  [':-(', ':(', '(:', '(-:',]),
    ('__EMOT_CRY',    [':,(', ':\'(', ':"(', ':((',]),
]

msg = 'I had a beautiful day :)'

for key, replaceables in dict(emoticons).items():
  for replaceable in replaceables:
    msg = msg.replace(replaceable, key)

print(msg)  # I had a beautiful day __EMOT_SMILEY

emoticons = [   ('__EMOT_SMILEY',   [':-)', ':)', '(:', '(-:', ] )  ,
    ('__EMOT_LAUGH',        [':-D', ':D', 'X-D', 'XD', 'xD', ] )    ,
    ('__EMOT_LOVE',     ['<3', ':\*', ] )   ,
    ('__EMOT_WINK',     [';-)', ';)', ';-D', ';D', '(;', '(-;', ] ) ,
    ('__EMOT_FROWN',        [':-(', ':(', '(:', '(-:', ] )  ,
    ('__EMOT_CRY',      [':,(', ':\'(', ':"(', ':(('] ) ,
]


msg = 'I had a beautiful day :)'

for emote, replacement in emoticons:
    for symbol in replacement:
        msg = msg.replace(symbol, emote)

print(msg)

How about this:

emoticons = [('__EMOT_SMILEY',   [':-)', ':)', '(:', '(-:']),
             ('__EMOT_LAUGH',    [':-D', ':D', 'X-D', 'XD', 'xD']),
             ('__EMOT_LOVE',     ['<3', ':\*']),
             ('__EMOT_WINK',     [';-)', ';)', ';-D', ';D', '(;', '(-;']),
             ('__EMOT_FROWN',    [':-(', ':(', '(:', '(-:']),
             ('__EMOT_CRY',      [':,(', ':\'(', ':"(', ':(('])]

msg = 'I had a beautiful day :)'

# Every character that appears in any emoticon
grabs = {ch for _, patterns in emoticons for pattern in patterns for ch in pattern}

for word in [x for x in msg.split() if all(y in grabs for y in x)]:
    for emot_code, search_patterns in emoticons:
        if word in search_patterns:
            msg = msg.replace(word, emot_code)
print(msg)  # I had a beautiful day __EMOT_SMILEY

Instead of trying to find any of the emoticons in the msg to replace them, it first searches for substrings that might be emoticons and tries to replace only those.

That said, it does fail for cases with punctuation right after or before the emoticons; e.g., "I had a beautiful day :)."

So all in all... "__EMOT_FROWN"

There are plenty of answers giving you exactly what you asked for, but sometimes I think exactly what you asked for isn't the best solution. Like tobias_k said, the cleanest solution is to map many keys to the same value, essentially "reversing" your dictionary:

emoticons = \
    [   ('__EMOT_SMILEY',   [':-)', ':)', '(:', '(-:', ] )  ,\
        ('__EMOT_LAUGH',        [':-D', ':D', 'X-D', 'XD', 'xD', ] )    ,\
        ('__EMOT_LOVE',     ['<3', ':\*', ] )   ,\
        ('__EMOT_WINK',     [';-)', ';)', ';-D', ';D', '(;', '(-;', ] ) ,\
        ('__EMOT_FROWN',        [':-(', ':(', '(:', '(-:', ] )  ,\
        ('__EMOT_CRY',      [':,(', ':\'(', ':"(', ':(('] ) ,\
    ]

emote_dict = {emote: name for name, vals in emoticons for emote in vals}

The above code builds the reversed dictionary, so now it can be used like this:

>>> print(emote_dict[':)'])
__EMOT_SMILEY
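To apply that lookup to a whole message, one option is to replace each whitespace-separated token that is a known emoticon. This is only a sketch: `replace_tokens` is a hypothetical helper, the list is trimmed for brevity, and joining on a single space collapses any runs of whitespace in the original message.

```python
emoticons = [
    ('__EMOT_SMILEY', [':-)', ':)']),
    ('__EMOT_LOVE', ['<3']),
]
emote_dict = {emote: name for name, vals in emoticons for emote in vals}

def replace_tokens(msg, emote_dict):
    # Swap each whitespace-separated token that is a known emoticon;
    # any other token is kept unchanged.
    return ' '.join(emote_dict.get(token, token) for token in msg.split())

print(replace_tokens('I had a beautiful day :) <3', emote_dict))
```

Like the other word-based approaches here, this misses an emoticon that has punctuation attached to it, such as `:).`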

You can try using a dict. This should work as long as you only have 2 or 3 chars in your emoticons and the person uses a space. I'm sure you can make it more robust, but this will work for now.

emoticons = {
    '__EMOT_SMILEY': {':-)', ':)', '(:', '(-:'},
    '__EMOT_LAUGH' : {':-D', ':D', 'X-D', 'XD', 'xD'},
    '__EMOT_LOVE' : {'<3', ':\*'},
    '__EMOT_WINK' :{';-)', ';)', ';-D', ';D', '(;', '(-;'},
    '__EMOT_FROWN' : {':-(', ':(', '(:', '(-:'},
    '__EMOT_CRY' : {':,(', ':\'(', ':"(', ':(('}
        }

msg = 'I had a beautiful day :,('
# Assumes the emoticon is the last 2 or 3 characters and is preceded by a space.
if msg[-3] == ' ':
    img = msg[-2:]
else:
    img = msg[-3:]

for k, v in emoticons.items():
    if img in v:
        print(msg[:-len(img)].rstrip(), k)
