ValueError: unsupported format character 'a' (0x61) at index 79

Question

I am trying to scrape data from a website using BeautifulSoup4 and Python. Here is my code:

from bs4 import BeautifulSoup
import urllib2
i = 0
for i in xrange(0,38):
    page=urllib2.urlopen("http://www.sfap.org/klsfaprep_search?page={}&type=1&strname=&loc=&op=Lancer%20la%20recherche&form_build_id=form-72a297de309517ed5a2c28af7ed15208&form_id=klsfaprep_search_form" %i) 
    soup = BeautifulSoup(page.read())
    for eachuniversity in soup.findAll('div',{'class':'field-item odd'}):
        print ''.join(eachuniversity.findAll(text=True)).encode('utf-8')
    print ',\n'
i= i+ 1

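I think the problem is in the URL I am passing and in the increment statement. I was able to scrape page by page, but the error only appears when I use xrange.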

Answer 1

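Cause of the `ValueError`

You are mixing `{}` formatting with `%` formatting. The `%` operator ignores the `{}` placeholder and instead tries to interpret every `%` in the URL. In Python 2, `%20la` is read as a conversion specifier (width 20, length modifier `l`, conversion type `a`), and `a` is not a supported conversion type, so the error points at the `a`.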

>>> '{}%20la' % 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: unsupported format character 'a' (0x61) at index 6
>>> '{}%20la'.format(1)
'1%20la'

I recommend using `{}` formatting, because the URL contains several `%` characters.

page=urllib2.urlopen("http://www.sfap.org/klsfaprep_search?page={}&type=1&strname=&loc=&op=Lancer%20la%20recherche&form_build_id=form-72a297de309517ed5a2c28af7ed15208&form_id=klsfaprep_search_form".format(i))
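
If you would rather keep `%` formatting, every literal percent sign in the URL has to be doubled so that it is not read as the start of a conversion specifier. Here is a minimal sketch of both options, using a shortened, hypothetical query string rather than the full URL (`page_number`, `with_format` and `with_percent` are names made up for this illustration):

```python
# Shortened, hypothetical query string; the real URL has more parameters.
page_number = 3

# str.format ignores '%', so the URL-encoded space (%20) is left untouched.
with_format = "page={}&op=Lancer%20la%20recherche".format(page_number)

# With % formatting, each literal '%' must be written as '%%'.
with_percent = "page=%d&op=Lancer%%20la%%20recherche" % page_number

print(with_format)
print(with_percent)
```

Both lines print the same string, so the choice is mostly about readability. With this many escapes in the URL, `{}` is the less error-prone option.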

Complete code

You do not need `i = 0` and `i = i + 1`, because `for i in xrange(0,38)` already takes care of the counter.

import urllib2 # Import standard library module first. (PEP-8)

from bs4 import BeautifulSoup

for i in xrange(0,38):
    page = urllib2.urlopen("http://www.sfap.org/klsfaprep_search?page={}&type=1&strname=&loc=&op=Lancer%20la%20recherche&form_build_id=form-72a297de309517ed5a2c28af7ed15208&form_id=klsfaprep_search_form".format(i))
    soup = BeautifulSoup(page.read())
    for eachuniversity in soup.findAll('div',{'class':'field-item odd'}):
        print ''.join(eachuniversity.findAll(text=True)).encode('utf-8')
    print ',\n'
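
The URL construction can also be pulled out of the loop, so it can be checked without touching the network. This is only a sketch: `URL_TEMPLATE` and `build_page_url` are names made up for this illustration and are not part of urllib2 or BeautifulSoup. Only the URL itself comes from the question.

```python
# Hypothetical helper names; only the URL itself comes from the question.
URL_TEMPLATE = (
    "http://www.sfap.org/klsfaprep_search?page={}&type=1&strname=&loc="
    "&op=Lancer%20la%20recherche"
    "&form_build_id=form-72a297de309517ed5a2c28af7ed15208"
    "&form_id=klsfaprep_search_form"
)


def build_page_url(page_number):
    """Return the search URL for the given page number."""
    return URL_TEMPLATE.format(page_number)


# The same 38 page numbers the loop above walks through (0 to 37).
urls = [build_page_url(n) for n in range(38)]
print(urls[0])
print(urls[-1])
```

Each entry in `urls` can then be passed to `urllib2.urlopen` exactly as in the loop above.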

Answer 1 (3, accepted, 2013-10-19 04:55:56)
