
Python Unicode error in Linux but not Windows

I followed some guides to piece together this bit of Python:

import requests
import sys
from bs4 import BeautifulSoup

url = requests.get(sys.argv[1])

html = BeautifulSoup(url.content,'html.parser')

for br in html.find_all("br"):
    br.replace_with(" ")

for tr in html.find_all('tr'):
    data = []   

    for td in tr.find_all('td'):
        data.append(td.text.strip())

    if data:
        print("{}".format(','.join(data)))

In Windows it works as I expect it to.

In Linux I get

Traceback (most recent call last):
  File "html2csv.py", line 19, in <module>
    print("{}".format(','.join(data)))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xb0' in position 4: ordinal not in range(128)

What do I need to change in my script to prevent this? I read that you can ignore problem characters, but some say this isn't the proper way to do it. I'm not sure how to apply any of the solutions I found to what I have.

Sorry for wasting your time.

I was using...

python script.py

Which defaults to Python 2.7

What I needed to run is...

python3 script.py
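
To catch this kind of mix-up earlier, a script can check which interpreter it is running under before doing any work. This is only a minimal sketch, assuming the script is meant for Python 3 only; the helper name require_python3 is made up for illustration and is not part of the original script:

```python
import sys


def require_python3(version_info=sys.version_info):
    """Return True on Python 3+, otherwise exit with a readable message."""
    if version_info[0] < 3:
        # Under Python 2, print() to an ASCII stdout is what raised the
        # UnicodeEncodeError shown in the traceback above.
        sys.exit("This script needs Python 3; run it with: python3 script.py")
    return True


# Hypothetical guard: call this before the rest of the script runs.
require_python3()
```

Placed at the top of the file, the guard turns the confusing encoding error into a direct hint about the interpreter.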

I had the same issue. It seems that editing the file in MS Windows leaves some ghost characters in it (I guess you can configure your IDE not to do so).

Try adding # -*- coding: utf-8 -*- at the top of your script file, as here:

#!/usr/bin/env python

# -*- coding: utf-8 -*-

# import ipdb; ipdb.set_trace()

import json
import os, sys

class CSV_LOADER():
    """
    Script that handles batch credentials (in CSV format), both locally and
    to remote machines.

...

Your Python IO encoding is probably set to ASCII, most likely because the system locale settings are misconfigured, so everything printed to standard output (and read from standard input) is treated as ASCII.
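
The failure can be reproduced without any network access by encoding the offending character by hand. A small sketch follows; the sample string is made up, and \xb0 is the degree sign named in the traceback:

```python
text = u"25.0\xb0C"  # hypothetical cell value; index 4 is the degree sign

error = None
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    # Same class of error an ASCII stdout produces when print() writes the text.
    error = exc

# UTF-8 can represent the degree sign, so this succeeds.
utf8_bytes = text.encode("utf-8")
```

The ASCII codec rejects anything above code point 127, while UTF-8 encodes the degree sign as two bytes.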

Set the PYTHONIOENCODING environment variable to utf-8 before running your script (or better yet, ensure your system's locale settings are correct).
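
In a Linux shell the variable is usually set as a prefix to the command, for example PYTHONIOENCODING=utf-8 python script.py. The same effect can be sketched from Python by launching a child interpreter with the variable set. This is an illustration under the assumption of Python 3.5 or later (for subprocess.run); the one-line child program is made up:

```python
import os
import subprocess
import sys

# Copy the current environment and force UTF-8 for the child's stdin/stdout.
env = dict(os.environ, PYTHONIOENCODING="utf-8")

# The child prints a single degree sign, the character from the traceback.
result = subprocess.run(
    [sys.executable, "-c", "print(u'\\xb0')"],
    env=env,
    stdout=subprocess.PIPE,
    check=True,
)

output = result.stdout  # raw bytes written by the child process
```

With the variable set, the child writes the character as UTF-8 bytes instead of failing in the ASCII codec.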

