
Replace special characters with "N/A" in Python

I would like to change all rows that contain only emojis, such as df['Comments'][2], to N/A.

df['Comments'][:6]
0                                                          nice
1                                                       Insane3
2                                                          😻😻❤️
3                                                @bertelsen1986
4                       20 or 30 mm rise on the Renthal Fatbar?
5                                     Luckily I have one to 🔥💪🏻

The following code doesn't return the output I expect:

df['Comments'].replace(';', ':', '!', '*', np.NaN)

Expected output:

df['Comments'][:6]
0                                                          nice
1                                                       Insane3
2                                                          nan
3                                                @bertelsen1986
4                       20 or 30 mm rise on the Renthal Fatbar?
5                                     Luckily I have one to 🔥💪🏻

You can detect lines containing only emojis by iterating over the Unicode characters in each line (using the emoji and unicodedata packages):

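import unicodedata
import numpy as np
# Note: the flat UNICODE_EMOJI dict comes from older releases of the
# emoji package; newer releases may not provide it in this form.
from emoji import UNICODE_EMOJI

# A plain dict stands in for the DataFrame here.
df = {}
df['Comments'] = ["Test", "Hello 😉", "😉😉😉"]

for i in range(len(df['Comments'])):
    pure_emoji = True
    # Check every code point of the NFC-normalized comment.
    for unicode_char in unicodedata.normalize('NFC', df['Comments'][i]):
        if unicode_char not in UNICODE_EMOJI:
            pure_emoji = False
            break
    if pure_emoji:
        df['Comments'][i] = np.nan
print(df['Comments'])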
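
If the emoji package is not available, or if multi-code-point emojis such as ❤️ (a heart plus a variation selector) or 💪🏻 (with a skin-tone modifier) cause trouble for a per-character lookup, a rough standard-library-only sketch might look like the following. The helper name is_emoji_only and the category set are my own assumptions, not part of any library, and the check is a heuristic: the Unicode category "So" also covers non-emoji symbols such as ©, and keycap emojis built on digits are not recognized.

```python
import unicodedata

# Assumed heuristic, not an official emoji definition:
#   So = "Symbol, other" (most emoji), Sk = skin-tone modifiers,
#   Mn = variation selectors, Cf = zero-width joiner.
EMOJI_LIKE_CATEGORIES = {"So", "Sk", "Mn", "Cf"}


def is_emoji_only(text):
    """Return True if text has at least one symbol and nothing but emoji-like code points."""
    chars = [c for c in text if not c.isspace()]
    # Require at least one symbol so that empty strings are not flagged.
    if not any(unicodedata.category(c) == "So" for c in chars):
        return False
    return all(unicodedata.category(c) in EMOJI_LIKE_CATEGORIES for c in chars)


comments = ["nice", "Insane3", "😻😻❤️", "@bertelsen1986", "Luckily I have one to 🔥💪🏻"]
# Replace emoji-only comments with NaN and keep everything else untouched.
cleaned = [float("nan") if is_emoji_only(c) else c for c in comments]
```

With a pandas Series, something like df['Comments'].mask(df['Comments'].map(is_emoji_only)) should give the same effect, assuming the column contains only strings.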

Function (remove_emoji) reference: https://stackoverflow.com/a/61839832/6075699

Try the following. First install the emoji library with pip install emoji.

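import re
import emoji
import numpy as np

# demojize() turns each emoji into a ':shortcode:'; if nothing is left after
# stripping those shortcodes, the row contained only emojis.
df.Comments.apply(lambda x: x if (re.sub(r'(:[!_\-\w]+:)', '', emoji.demojize(x)) != "") else np.nan)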
0                         nice
1                      Insane3
2                          NaN
3               @bertelsen1986
4    Luckily I have one to 🔥💪🏻
Name: a, dtype: object
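
To make the regex step above easier to follow, here is a small sketch that applies the same pattern to strings that are already in shortcode form. The sample strings are hand-written stand-ins for what emoji.demojize might return, and is_only_shortcodes is a hypothetical helper name. It also adds a strip() call, which the one-liner above does not have, so that emojis separated by spaces are still treated as emoji-only.

```python
import re

# Same pattern as in the answer above: a ':shortcode:' made of word
# characters, '!', '_' or '-'.
SHORTCODE = re.compile(r':[!_\-\w]+:')


def is_only_shortcodes(demojized):
    """Return True if nothing but whitespace remains once shortcodes are removed."""
    return SHORTCODE.sub('', demojized).strip() == ''


# Hand-written examples standing in for demojized comments (assumed, not real output).
samples = [
    ":winking_face::winking_face:",    # emojis only
    ":winking_face: :winking_face:",   # emojis separated by a space
    "Hello :winking_face:",            # text plus an emoji
]
flags = [is_only_shortcodes(s) for s in samples]
```

One caveat of this approach is that ordinary text which happens to look like a shortcode, for example a word written between two colons, is removed by the pattern as well.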
