
How to delete u'\n\n\n\n\n\n\n\n\n' and u'\xa0' from a Python list

I have been struggling with this for two days and cannot figure it out. Here is my code:

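import requests
from bs4 import BeautifulSoup

# links and last_name are lists defined elsewhere in the script
def find_name():
    for url in links:
        r = requests.get(url)
        html = r.content
        soup = BeautifulSoup(html)
        for n in soup.find_all('tr'):
            td = n.find('td')  # first cell of each table row
            if td:
                last_name.append(td.text)
    del last_name[0:5]  # drop the first five entries
    return last_name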

It generates a list of last names, but the list contains multiple u'\xa0' and u'\n\n\n\n\n' entries, and I want them gone. I have tried everything I know. Removing them by checking each element gives ValueError: list.remove(x): x not in list. I also tried comparing each element to u'\n\n\n\n\n\n\n\n\n' before adding it to the list, but that did not work. There are other questions on Stack Overflow, but they all deal with strings rather than lists.

You could call str.strip() on the text before adding it to the last_name list.

            if td and td.text.strip():
                last_name.append(td.text.strip())
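
The check works because strip() with no arguments removes all leading and trailing whitespace, which covers both newlines and the non-breaking space u'\xa0'. A cell that holds only those characters becomes an empty string, which is falsy. Here is a minimal sketch of that behaviour; the sample values are made up and stand in for td.text:

```python
# Hypothetical sample values standing in for td.text from the scraped rows.
cells = [u'Smith', u'\n\n\n\n\n\n\n\n\n', u'\xa0', u'  Jones\n']

last_name = []
for text in cells:
    stripped = text.strip()  # drops leading/trailing whitespace, including \n and \xa0
    if stripped:             # an empty string is falsy, so whitespace-only cells are skipped
        last_name.append(stripped)

print(last_name)  # ['Smith', 'Jones']
```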

You could use a list comprehension and the strip method:

# Your code
last_name = [name for name in last_name if name.strip()]
return last_name
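
If the names that remain might also carry stray whitespace at the edges, the same idea can strip and filter in one pass. The sketch below uses a hypothetical sample list. It also shows the likely reason list.remove raised ValueError in the question: remove only deletes an element equal to its argument, so a run with a different number of newlines is never matched.

```python
# Hypothetical sample of what find_name() might have collected.
raw = [u'Smith\n', u'\n\n\n', u'\xa0', u'\n\n\n\n\n\n\n\n\n', u'Jones']

# list.remove needs an exact match, so guessing the whitespace string can fail.
try:
    raw.remove(u'\n\n')
    remove_failed = False
except ValueError:
    remove_failed = True

# Strip each name once and keep only the non-empty results.
cleaned = [s for s in (name.strip() for name in raw) if s]

print(remove_failed)  # True
print(cleaned)        # ['Smith', 'Jones']
```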
