
re.sub() in Python does not accept a list as an input parameter

I am trying to clean a dataset and came across a column named "production_companies" with about 1,000 values. The column contains unnecessary symbols; its values look like this: [{name: 'Pixar', id:"3}]. I want to remove the symbols " {} [] , along with the text values "name" and "id" and the integers.

import re

list1 = data.production_companies

for i in list1:
    # Raises TypeError: list1 itself is passed instead of the string i
    re.sub(r'\d+', '', list1)

The problem is that re.sub does not accept a list as a parameter; it only accepts a string. I need to store the production_companies values in a list and iterate through it with a for loop, because the column has many values and I need to remove the symbols and unnecessary text from all of them at once.

Can anyone please tell me what I should do?

Thanks a lot

You can use a list comprehension to create a new list from the existing one.

list2 = [re.sub(r'\d+', '', item) for item in list1]
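The same comprehension can be extended to cover the rest of the cleanup the question asks for. The following is a minimal sketch, not a definitive solution: the sample values in list1, the UNWANTED pattern, and the clean helper are all assumptions made for illustration, modeled on the single example value given in the question.

```python
import re

# Hypothetical sample values shaped like the one shown in the question.
list1 = ["[{name: 'Pixar', id:\"3}]", "[{name: 'Walt Disney Pictures', id:\"2}]"]

# Matches the keys "name" and "id" as whole words, any run of digits,
# or any single one of the unwanted symbols.
UNWANTED = re.compile(r"\b(?:name|id)\b|\d+|[\[\]{}:,'\"]")

def clean(value):
    """Strip the unwanted tokens, then collapse leftover whitespace."""
    stripped = UNWANTED.sub("", value)
    return " ".join(stripped.split())

# re.sub works on one string at a time, so apply it per item.
list2 = [clean(item) for item in list1]
print(list2)
```

Note that the digit rule also strips numbers that are part of a company name, so it may need narrowing for real data. If data is a pandas DataFrame (the question does not say so explicitly), data.production_companies.str.replace(r'\d+', '', regex=True) should apply a substitution to the whole column without an explicit loop.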
