Re.sub() in Python does not accept a list as an input parameter

I am trying to clean a dataset, and while doing so I came across a column named "production_companies" with about 1,000 values. The column contains unnecessary symbols; for example, its values look like [{name: 'Pixar', id:"3}] . I want to remove the unnecessary symbols such as " {} [] , the text values "name" and "id", and the integers.

list1=[]
list1= data.production_companies
for i in list1:
    re.sub('\d+','',list1)

The problem is that re.sub does not accept a list as a parameter; it only accepts a string. I need to store the production_companies values in a list and iterate through it with a for loop, because the column has many values and I need to remove the symbols and unnecessary text from all of them at once.

Can anyone please tell me what I should do?

Thanks a lot.

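You can use a list comprehension to create a new list from the existing one, calling re.sub on each item rather than on the list itself.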

list2 = [re.sub(r'\d+', '', item) for item in list1]
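Note that the loop in the question passes the whole list (list1) to re.sub instead of the loop variable i, and it also discards the return value; re.sub returns a new string and never modifies its input. The following is a minimal sketch of how the same per-item idea could be extended to strip the symbols and the "name"/"id" keys in one pass. The sample values, the pattern, and the helper name clean are assumptions made for illustration, not taken from the actual dataset.

```python
import re

# Hypothetical sample values shaped like the one shown in the question.
list1 = ["[{name: 'Pixar', id:\"3}]", "[{name: 'Walt Disney Pictures', id:\"2}]"]

# Matches the whole words "name" and "id", runs of digits,
# and the unwanted symbols [ ] { } : , ' "
pattern = re.compile(r"\b(?:name|id)\b|\d+|[\[\]{}:,'\"]")

def clean(value):
    # str() is a guard for non-string entries (an assumption about the data).
    stripped = pattern.sub("", str(value))
    # Collapse the whitespace left behind by the removed tokens.
    return " ".join(stripped.split())

# Apply the cleaning to each string, not to the list as a whole.
list2 = [clean(item) for item in list1]
```

Because the pattern removes every run of digits, a company name that legitimately contains a number would lose it as well; if that matters, the digit rule would need to be narrowed to the id field only.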
