UnicodeEncodeError: 'ascii' codec can't encode character u'\xb0' in position 11: ordinal not in range(128)

Question

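I'm learning how to scrape data using Nathan Yau's book Visualize This. I'm trying to scrape Wunderground for 2009, but I get the error below. It says something is out of range, but I don't know why.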

line 43, in <module>
    f.write(timestamp + ',' + dayTemp + '\n')
UnicodeEncodeError: 'ascii' codec can't encode character u'\xb0' in position 11: ordinal not in range(128)

Here is my code:

import sys
import urllib2
from bs4 import BeautifulSoup as BS

# Create/open a file called wunder-data.txt (which will be a comma-delimited file)
f = open('wunder-data.txt', 'w')

# Iterate through months and days
for m in range(1, 13):
  for d in range(1, 32):

    # Check if already gone through month
    if (m == 2 and d > 28):
      break
    elif (m in [4, 6, 9, 11] and d > 30):
      break

    # Open wunderground.com url
    url = "http://www.wunderground.com/history/airport/KBUF/2009/" + str(m) + "/" + str(d) + "/DailyHistory.html"
    page = urllib2.urlopen(url)

    # Get temperature from page
    soup = BS(page, "html.parser")
    # dayTemp = soup.body.nobr.b.string
    dayTemp = soup.find("span", text="Mean Temperature").parent.find_next_sibling("td").get_text(strip=True)

    # Format month for timestamp
    if len(str(m)) < 2:
      mStamp = '0' + str(m)
    else:
      mStamp = str(m)

    # Format day for timestamp
    if len(str(d)) < 2:
      dStamp = '0' + str(d)
    else:
      dStamp = str(d)

    # Build timestamp
    timestamp = '2009' + mStamp + dStamp

    # Write timestamp and temperature to file
    f.write(timestamp + ',' + dayTemp + '\n')

# Done getting data! Close file.
f.close()

Answer 1

The problem is the degree sign. That is your u'\xb0' character.
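As a minimal sketch of why the traceback points at position 11, the snippet below tries to encode one row as ASCII. The row itself is an assumed sample (an 11° reading for 2009-03-05), not scraped data.

```python
# Illustrative row; the temperature value is an assumed sample, not scraped data.
row = u'20090305,11\xb0\n'

# The degree sign is code point U+00B0 (decimal 176), outside ASCII's 0-127 range.
degree_index = row.index(u'\xb0')
degree_ordinal = ord(row[degree_index])

try:
    row.encode('ascii')
    error = None
except UnicodeEncodeError as exc:
    error = exc  # exc.start is the index of the first unencodable character

print(degree_index, degree_ordinal, error.start if error else None)
```

The eight timestamp digits, the comma and the two temperature digits fill indexes 0 through 10, so the degree sign is the character at index 11.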

juanpa.arrivillaga's comment is right: you should open the file with an explicit encoding. The easiest way to do that in Python 2 is:

from codecs import open

Then you can write:

open('wunder-data.txt', 'w', encoding='utf8')

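I'm afraid this won't be the only Unicode or non-ASCII encoding problem that bites you. The world is Unicode now, and Python 3 handles Unicode much better. It can be done in Python 2, but it takes more care and attention. Still, the codecs module should get you out of your immediate bind.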
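A small sketch of the codecs approach, assuming a sample row and a throwaway temporary file rather than the real wunder-data.txt. It writes one line through codecs.open and then reads the raw bytes back to show what ended up on disk.

```python
import codecs
import os
import tempfile

# Write to a throwaway temp file so the sketch does not touch real data.
path = os.path.join(tempfile.mkdtemp(), 'wunder-data.txt')

# codecs.open returns a file object that encodes text as it is written.
f = codecs.open(path, 'w', encoding='utf8')
f.write(u'20090305,11\xb0\n')  # assumed sample row
f.close()

# Reading the raw bytes shows the degree sign stored as the UTF-8 pair C2 B0.
with open(path, 'rb') as raw:
    data = raw.read()
```

On Python 3 the built-in open accepts the same encoding argument, so the codecs import is only needed on Python 2.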

Answer 2

In this function call:

f.write(timestamp + ',' + dayTemp + '\n')

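timestamp, ',' and '\n' are str objects, while dayTemp is unicode.

The sum of a str and a unicode is a unicode object. Note that this fails if the str object contains anything other than ASCII characters, because Python 2 implicitly decodes the str as ASCII first.

In this case, the code effectively does the following (\xb0 stands for °):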

f.write(u'20090305,11\xb0\n')

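The problem is that unicode characters cannot be written to a file directly. They are just an abstraction, and there is no single format in which to write them*. You have to pick one. The best choice is usually UTF-8.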

s = (timestamp + ',' + dayTemp + '\n').encode('utf-8')
# or, cleaner:
s = u'{},{}\n'.format(timestamp, dayTemp).encode('utf-8')
f.write(s)
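To make the effect of the explicit encode visible, here is a minimal sketch using assumed sample values in place of the scraped ones. It compares the text with its encoded form and checks that decoding gives the text back.

```python
timestamp = '20090305'   # assumed sample values, not scraped data
day_temp = u'11\xb0'

line = u'{},{}\n'.format(timestamp, day_temp)
encoded = line.encode('utf-8')

# The degree sign becomes two bytes in UTF-8, so the byte string is one longer
# than the text it came from.
text_length = len(line)
byte_length = len(encoded)

# Decoding with the same codec recovers the original text.
round_trip = encoded.decode('utf-8')
```

Whoever reads the file later has to decode it with the same codec, which is one reason UTF-8 is a safe default.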

Another option is to use a smarter file object that automatically encodes unicode as UTF-8, as others have suggested:

import codecs
with codecs.open('wunder-data.txt', 'w', encoding='utf-8') as f:
    f.write(timestamp + ',' + dayTemp + '\n')

or

import io
with io.open('wunder-data.txt', 'w', encoding='utf-8') as f:
    f.write(timestamp + ',' + dayTemp + '\n')
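In the question's script the write happens inside a loop, so the file would be opened once around the loop rather than once per row. The sketch below shows that shape under stated assumptions: fake_mean_temp is a hypothetical stand-in for the scraping step, the two dates are arbitrary, and the output goes to a temporary file.

```python
import io
import os
import tempfile


def fake_mean_temp(month, day):
    """Hypothetical stand-in for the scraping step; returns made-up text."""
    return u'{}\xb0'.format(month + day)


path = os.path.join(tempfile.mkdtemp(), 'wunder-data.txt')

# Open once, outside the loop, and let the file object handle the encoding.
with io.open(path, 'w', encoding='utf-8') as f:
    for m, d in [(3, 5), (3, 6)]:
        timestamp = u'2009{:02d}{:02d}'.format(m, d)
        f.write(timestamp + u',' + fake_mean_temp(m, d) + u'\n')

# Read the rows back as text to confirm the degree sign survived.
with io.open(path, 'r', encoding='utf-8') as f:
    rows = f.read().splitlines()
```

Opening in 'w' mode truncates the file, so reopening it for every row would keep only the last one.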

* Strictly speaking, ASCII is such a single format, but it only works when every character can be represented in ASCII.

2 answers

Answer 1
1 2017-03-12 08:32:28

Answer 2
1 2017-03-12 09:12:26
