Replace special characters with "N/A" in Python
I want to change every row that contains only emoji (for example df['Comments'][2]) to N/A.
df['Comments'][:6]
0 nice
1 Insane3
2 😻😻❤️
3 @bertelsen1986
4 20 or 30 mm rise on the Renthal Fatbar?
5 Luckily I have one to 🔥💪🏻
The following code does not return the output I expect:
df['Comments'].replace(';', ':', '!', '*', np.NaN)
Expected output:
df['Comments'][:6]
0 nice
1 Insane3
2 nan
3 @bertelsen1986
4 20 or 30 mm rise on the Renthal Fatbar?
5 Luckily I have one to 🔥💪🏻
You can detect rows that contain only emoji by iterating over the Unicode characters in each row (using the emoji and unicodedata packages):
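import unicodedata
import numpy as np
# UNICODE_EMOJI exists only in older emoji releases (it was removed in emoji 2.0)
from emoji import UNICODE_EMOJI

# A plain dict stands in for the DataFrame here
df = {}
df['Comments'] = ["Test", "Hello 😉", "😉😉😉"]

for i in range(len(df['Comments'])):
    pure_emoji = True
    # Any character that is not a known emoji disqualifies the row
    for unicode_char in unicodedata.normalize('NFC', df['Comments'][i]):
        if unicode_char not in UNICODE_EMOJI:
            pure_emoji = False
            break
    if pure_emoji:
        # np.nan is spelled in lowercase because the np.NaN alias was removed in NumPy 2.0
        df['Comments'][i] = np.nan
print(df['Comments'])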
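One caveat with a per-character lookup like the loop above: some emoji are sequences that include helper code points (the variation selector U+FE0F in ❤️, the zero-width joiner, skin-tone modifiers), and those helpers may not be emoji entries on their own, so a row such as 😻😻❤️ could be rejected. Below is a minimal sketch that uses only the standard library and swaps the emoji-table lookup for a Unicode-category heuristic. The helper name is_emoji_only is hypothetical, and the heuristic is approximate: it also accepts non-emoji symbols such as © and does not cover keycap sequences.

```python
import unicodedata

# Code points that glue emoji sequences together but are not emoji themselves
ZERO_WIDTH_JOINER = "\u200d"
VARIATION_SELECTOR_16 = "\ufe0f"

def is_emoji_only(text):
    """Heuristic (hypothetical helper): True if the text holds only symbol-like characters."""
    chars = unicodedata.normalize("NFC", text).strip()
    if not chars:
        return False  # empty or whitespace-only text is not treated as emoji
    for ch in chars:
        if ch.isspace() or ch in (ZERO_WIDTH_JOINER, VARIATION_SELECTOR_16):
            continue
        if "\U0001F3FB" <= ch <= "\U0001F3FF":
            continue  # skin-tone modifiers
        # "So" (Symbol, other) is the category of most emoji code points
        if unicodedata.category(ch) != "So":
            return False
    return True

comments = ["nice", "Insane3", "😻😻❤️", "@bertelsen1986", "Luckily I have one to 🔥💪🏻"]
# Replace emoji-only entries with NaN and keep everything else unchanged
cleaned = [float("nan") if is_emoji_only(c) else c for c in comments]
print(cleaned)
```

With a real DataFrame the same function could be passed to df['Comments'].apply, assuming every value in the column is a string.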
For the function (remove_emoji), see https://stackoverflow.com/a/61839832/6075699
Try this.
First install the emoji library: pip install emoji
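import re
import numpy as np
import emoji
# demojize turns each emoji into a :name: token; if nothing is left after stripping those tokens, the row was emoji-only
df.Comments.apply(lambda x: x if (re.sub(r'(:[!_\-\w]+:)', '', emoji.demojize(x)) != "") else np.nan)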
0 nice
1 Insane3
2 NaN
3 @bertelsen1986
4 Luckily I have one to 🔥💪🏻
Name: a, dtype: object