
Removing text from each element of a list in python

I have a list of strings like ['\t30/30', '\t10/10'] because I ran a regular expression over some raw input data in a string called grades.

number = re.findall(r'\t\d{1,3}\/\d{1,3}',grades)
number[:] = [item.replace('\t', '') for item in number]

How do I remove the \t from each element of my list "number"? The second line gives me an error:

AttributeError: 'list' object has no attribute 'replace'

I cannot actually reproduce your problem. Make sure number is what you think it is; the error suggests it is a list of lists rather than a list of strings.

>>> grades = "bla bla \t23/40 foo bar bla \t12/20"
>>> number = re.findall(r'\t\d{1,3}\/\d{1,3}',grades)
>>> [item.replace("\t", "") for item in number]
['23/40', '12/20']
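
To illustrate the list-of-lists case, here is a minimal sketch. The nested list below is hypothetical, made up only to reproduce the error message; the asker's actual data may look different.

```python
# Hypothetical data: a list of lists, which is one way to trigger the error.
nested = [['\t30/30'], ['\t10/10']]

try:
    [item.replace('\t', '') for item in nested]
except AttributeError as exc:
    # Each item is itself a list, and lists have no .replace method.
    message = str(exc)

# Inspecting the element types shows what the outer list really contains.
element_types = {type(item).__name__ for item in nested}

# Flattening one level first lets the original comprehension work.
flat = [s.replace('\t', '') for sub in nested for s in sub]

print(message)
print(element_types)
print(flat)
```

Checking type(number[0]) in the interpreter is usually the quickest way to confirm whether this is what happened.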

Alternatively, you could use str.strip to make it a bit shorter:

>>> [item.strip() for item in number]
['23/40', '12/20']
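
One caveat about str.strip, sketched here with made-up strings: called without arguments it removes all leading and trailing whitespace, not only tabs. If only leading tabs should go, str.lstrip('\t') is a narrower alternative.

```python
# Made-up inputs with extra whitespace around the grades.
items = ['\t23/40 ', ' \t12/20\n']

# strip() with no argument removes every kind of surrounding whitespace.
stripped = [s.strip() for s in items]

# lstrip('\t') removes only tab characters at the very start of the string,
# so the second item is unchanged because it starts with a space.
tabs_only = [s.lstrip('\t') for s in items]

print(stripped)
print(tabs_only)
```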

However, I would instead suggest using a capturing group in your regex so the \t is not even part of the result:

>>> re.findall(r'\t(\d{1,3}\/\d{1,3})',grades)
['23/40', '12/20']

Or use one group for each of the numbers, so you don't have to split them afterwards:

>>> re.findall(r'\t(\d{1,3})\/(\d{1,3})',grades)
[('23', '40'), ('12', '20')]
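
As a possible follow-up to the two-group version, the tuples can be converted to integers directly. This is only a sketch; the names pairs, scores and percentages are illustrative and not part of the original question.

```python
import re

grades = "bla bla \t23/40 foo bar bla \t12/20"

# Two capturing groups: findall returns one (score, total) tuple per match.
pairs = re.findall(r'\t(\d{1,3})/(\d{1,3})', grades)

# Convert the captured digit strings to integers.
scores = [(int(got), int(total)) for got, total in pairs]

# Example use: percentage for each grade, rounded to one decimal place.
percentages = [round(100 * got / total, 1) for got, total in scores]

print(scores)
print(percentages)
```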

You can try this:

import re
number = re.findall(r'\t\d{1,3}\/\d{1,3}',grades)
final_data = [re.findall('[^\t]+', i)[0] for i in number]

When running this code with number = ['\t30/30', '\t10/10'], the output is:

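['30/30', '10/10']

data = ['\t30/30', '\t10/10']
# Remove the tab inside each string. Note that filter(lambda x: x != '\t', data)
# would only drop whole elements equal to '\t' and leave these strings unchanged.
no_tabs = [x.replace('\t', '') for x in data]

print(*no_tabs)
30/30 10/10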

