
Remove all the characters and numbers except comma

I am trying to remove all the characters from the strings in a DataFrame column except the commas, but my code still removes everything, including the commas.

I know the question has been asked before, but I tried many of the answers and they all remove the comma as well.

df[new_text_field_name] = df[new_text_field_name].apply(lambda elem: re.sub(r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)|^rt|http.+?", "", str(elem)))

Sample text:

'100 % polyester, Paperboard (min. 30% recycled), 100% polypropylene'

The required output:

' polyester, Paperboard, polypropylene'

A possible solution is the following:

# pip install pandas

import re

import pandas as pd
pd.set_option('display.max_colwidth', 200)

# set test data and create dataframe
data = {"text": ['100 % polyester, Paperboard (min. 30% recycled), 100% polypropylene','Polypropylene plastic', '100 % polyester, Paperboard (min. 30% recycled), 100% polypropylene', 'Bamboo, Clear nitrocellulose lacquer', 'Willow, Stain, Solid wood, Polypropylene plastic, Stainless steel, Steel, Galvanized, Steel, 100% polypropylene', 'Banana fibres, Clear lacquer', 'Polypropylene plastic (min. 20% recycled)']}
df = pd.DataFrame(data)

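def cleanup(txt):
    # remove everything except letters, commas, spaces and parentheses (case-insensitive)
    re_pattern = re.compile(r"[^a-z, ()]", re.I)
    # collapse the double spaces left behind and trim both ends
    return re.sub(re_pattern, "", txt).replace("  ", " ").strip()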

df['text_cleaned'] = df['text'].apply(cleanup)
df

Returns (the output was posted as an image in the original answer and is not reproduced here).
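The pattern in the question contains the alternative `[^0-9A-Za-z \t]`, which matches every character that is not a letter, digit, space or tab, so commas are deleted along with everything else. The `cleanup` function above keeps commas but also keeps parentheses, so `(min. 30% recycled)` would survive as `(min recycled)`. The following is a minimal sketch of a variant that also drops the parenthesised notes, assuming that is what the required output implies; the names `PARENS`, `NOT_ALLOWED`, `SPACES` and `keep_letters_and_commas` are hypothetical and not part of either answer.

```python
import re

# Hypothetical helper names, introduced only for this illustration.
PARENS = re.compile(r"\s*\([^)]*\)")      # a parenthesised note plus the whitespace before it
NOT_ALLOWED = re.compile(r"[^A-Za-z, ]")  # anything that is not a letter, comma or space
SPACES = re.compile(r" {2,}")             # runs of two or more spaces


def keep_letters_and_commas(text):
    """Drop parenthesised notes, then every character except letters, commas and spaces."""
    text = PARENS.sub("", str(text))
    text = NOT_ALLOWED.sub("", text)
    # Collapse leftover space runs and trim the ends
    return SPACES.sub(" ", text).strip()


sample = "100 % polyester, Paperboard (min. 30% recycled), 100% polypropylene"
print(keep_letters_and_commas(sample))
# polyester, Paperboard, polypropylene
```

Unlike the required output in the question, this sketch strips the leading space. On a DataFrame it could be applied the same way as `cleanup`, for example `df['text'].apply(keep_letters_and_commas)`.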

The Character.isDigit() and Character.isLetter() functions can be used to identify whether a character is a digit or a letter.
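`Character.isDigit()` and `Character.isLetter()` are Java methods; since the question uses Python, the closest built-ins are `str.isdigit()` and `str.isalpha()`. Below is a rough sketch of the same character-by-character idea in Python. The function name `filter_chars` is hypothetical. Note that `str.isalpha()` is also true for non-ASCII letters, and that this approach keeps the words inside parentheses, so it does not reproduce the required output exactly.

```python
def filter_chars(text):
    """Keep only letters, commas and spaces, checking one character at a time."""
    kept = "".join(ch for ch in str(text) if ch.isalpha() or ch in ", ")
    # split() with no arguments drops leading/trailing whitespace and collapses runs
    return " ".join(kept.split())


sample = "100 % polyester, Paperboard (min. 30% recycled), 100% polypropylene"
print(filter_chars(sample))
# polyester, Paperboard min recycled, polypropylene
```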
