re.sub() in Python does not accept a list as an input parameter

Question

I am cleaning a dataset and came across a column named "production_companies" with about 1,000 values. The values contain unnecessary symbols; for example, they look like this: [{name: 'Pixar', id:"3}]. I want to remove the symbols " {} [] , the text "name" and "id", and the integers.

list1 = []
list1 = data.production_companies
for i in list1:
    re.sub('\d+', '', list1)

The problem is that re.sub does not accept a list as a parameter; it only accepts a string. I need to store the production_companies values in a list and iterate through it with a for loop, because the column has many values and I need to remove the symbols and unnecessary text from all of them at once.

Can anyone please tell me what I should do?

Thanks a lot

Answer 1

You can use a list comprehension to create a new list from the existing one.

import re
list2 = [re.sub(r'\d+', '', item) for item in list1]
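The same comprehension can also strip the symbols and the "name"/"id" text in one pass. This is only a minimal sketch: the sample strings, the helper name clean_company and the pattern itself are assumptions for illustration, not something taken from the real dataset, so the pattern will likely need adjusting.

```python
import re

# Hypothetical sample values; the real ones come from data.production_companies.
list1 = [
    "[{'name': 'Pixar', 'id': 3}]",
    "[{'name': 'Walt Disney Pictures', 'id': 2}]",
]

# Assumed pattern: the key words "name" and "id", any run of digits,
# and the bracket, brace, quote, comma and colon characters.
UNWANTED = re.compile(r"\b(?:name|id)\b|\d+|[\[\]{}\"',:]")

def clean_company(value):
    """Remove the unwanted tokens from one string and collapse leftover spaces."""
    stripped = UNWANTED.sub("", value)
    return " ".join(stripped.split())

# re.sub works on one string at a time, so apply it to each item in turn.
list2 = [clean_company(item) for item in list1]
print(list2)
```

Note that a pattern like this also removes digits and the words "name" or "id" when they appear inside a company name, so it is worth checking a few cleaned values by hand.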


solution1
1 2018-01-16 22:57:03
