
Python Error - AttributeError: 'NoneType' object has no attribute 'split'

Here is the error:

File "f**.py", line 34, in <module>

url_type = url.split('-')[0][-2:]

Here is the whole block:

fit_urls = []
for event_url in event_urls:
  print event_url
  try:
    sock = urllib.urlopen(event_url)
    event_html = sock.read()
    event_soup = BeautifulSoup(event_html)

    tds = event_soup.find_all('td')
    for td in tds:
        for link in td.find_all('a'):
            url = link.get('href')
            url_type = url.split('-')[0][-2:]  # last two letters of the part before the first '-'
            if url_type == 'ht':
                #print url
                fit_urls.append(url)

  except HTTPError:
    pass


That is because at least one of your links has no 'href' attribute, so link.get('href') returns None for it, and None has no split method. You can verify this by adding print link just before url = link.get('href').
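The failure can be reproduced in isolation with a minimal sketch. The plain dict below is only a stand-in (an assumption for illustration) for the attributes of an <a> tag that has no href; like dict.get, the tag's get method returns None when the attribute is missing.

```python
# Stand-in for the attributes of a tag such as <a name="top">...</a>,
# which has no 'href'. This dict is hypothetical, for illustration only.
attrs = {'name': 'top'}

url = attrs.get('href')   # missing key, so .get() returns None
assert url is None

try:
    url.split('-')        # same call that fails in the question
    message = None
except AttributeError as exc:
    message = str(exc)    # records the AttributeError text
```

Here message ends up holding the text of the AttributeError, which names NoneType and split just as in the traceback above.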

To fix this, you can add an additional if check to filter out such links:

for td in tds:
    for link in td.find_all('a'):
        url = link.get('href')
        if url:   # additional check: falsy when url is None (or an empty string)
            url_type = url.split('-')[0][-2:]  # last two letters of the part before the first '-'
            # Your rest of the code

It looks like url = link.get('href') is returning None. You can check for None in your loop:

for td in tds:
    for link in td.find_all('a'):
        url = link.get('href')
        if not url:
            continue  # skip links without an href
        url_type = url.split('-')[0][-2:]  # last two letters of the part before the first '-'
        if url_type == 'ht':
            #print url
            fit_urls.append(url)
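Another option, not mentioned in the answers above, is to let BeautifulSoup do the filtering: passing href=True to find_all only returns tags that actually carry an href attribute. The sketch below is a minimal illustration; the HTML snippet and the file names in it are made up, and the "html.parser" argument is just one possible parser choice.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment: one anchor without href, two links with one.
html = """
<table><tr>
  <td><a name="top">anchor without href</a></td>
  <td><a href="results/ht-2014.fit">heat file</a></td>
  <td><a href="results/fn-2014.fit">final file</a></td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

fit_urls = []
for td in soup.find_all("td"):
    # href=True skips <a> tags that have no href attribute at all
    for link in td.find_all("a", href=True):
        url = link["href"]  # safe: the attribute is guaranteed to exist
        url_type = url.split("-")[0][-2:]
        if url_type == "ht":
            fit_urls.append(url)
```

With this filter the loop never sees a None value, so no separate check is needed before calling split.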
