
Python 3.6.1 - Printing string as human readable text, special characters

I'm building a little Django 1.1 app (though I believe this issue is specific to Python) in which I use commands to control the flow of fetching and categorizing data. I also want to print a summary of sorts using a third command. I am using macOS 10.12.3.

My problem comes from taking text data in and printing it to the console, or redirecting it to a file with > or >> in the shell.

I'm running these scripts using an alias of Python 3.6.1

I'm using the Tweepy API, but that should hopefully not be relevant.

These snippets should illustrate the problem I'm hoping to solve:

print(type(data))
print(type(data.text))
try:
    print(data.text)
except UnicodeEncodeError:
    print("no printing today :(")
print(type(data.text.encode('UTF-8')))
print(data.text.encode('UTF-8'))

this outputs:

<class 'tweepy.models.Status'>
<class 'str'>
no printing today :(
<class 'bytes'>
b'kontroll p\xc3\xa5 ... v\xc3\xa5pen.'

Both of the ugly \xc3\xa5 sequences there should be the character 'å'.

This is the error that would be thrown:

UnicodeEncodeError: 'ascii' codec can't encode character '\xe5' in position 223: ordinal not in range(128)

It says 'ascii' codec, but doing (in my Python 3.6.1 script):

import sys
print(sys.getdefaultencoding())

outputs:

utf-8

Running

print(sys.getdefaultencoding())

again in Python 2.7.10 outputs:

ascii

So the error that is thrown matches what 2.7.10 reports. I am not discounting the possibility that I am wrong about what the default encoding actually controls.
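For what it's worth, sys.getdefaultencoding() is not the encoding print() uses. It is only the default for str.encode() and bytes.decode(). print() writes to sys.stdout, which encodes with sys.stdout.encoding, and Python 3.6 derives that from the locale (or from the PYTHONIOENCODING environment variable). The sketch below is one way to look at all three values side by side and to reproduce the failure without a terminal; describe_encodings and can_encode are hypothetical helper names, not part of any library.

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings that matter when print() writes text."""
    return {
        # Default for str.encode() / bytes.decode(); 'utf-8' on Python 3.
        "default": sys.getdefaultencoding(),
        # What print() actually uses; taken from the locale or PYTHONIOENCODING.
        "stdout": getattr(sys.stdout, "encoding", None),
        # What the locale settings (LANG, LC_ALL, LC_CTYPE) resolve to.
        "locale": locale.getpreferredencoding(False),
    }


def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    for name, value in describe_encodings().items():
        print(name, "=", value)
    # 'å' is outside ASCII, so an ASCII stdout cannot print it.
    print(can_encode("våpen", "ascii"))
    print(can_encode("våpen", "utf-8"))
```

If "stdout" comes back as an ASCII variant while "default" is utf-8, that would explain an 'ascii' codec error coming out of a Python 3 script.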

I have also tried

export LOCALE="no_NB.UTF-8"

in an attempt to see if this could be caused by my system (unless I'm misunderstanding what this does). I did not write this to any file, thinking it would persist through the current session.

Is the wrong encoder being used somehow? Could it be my terminal encoding? How can I write my special characters to the terminal and file? Are strings really this hard to get right?

Any help is greatly appreciated!!

Setting

export LC_ALL=no_NO.UTF-8
export LANG=no_NO.UTF-8

in my .bash_profile now lets me see the characters I want in my terminal, and they are also written to a file correctly when I redirect the output.
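If changing the locale is not an option (a cron job, say), another approach is to stop relying on the locale and name the encoding explicitly when writing. This is only a minimal sketch under that assumption; write_utf8 is a hypothetical helper, and passing sys.stdout.buffer assumes stdout has not been replaced by an object without a binary buffer.

```python
import io
import sys


def write_utf8(text, binary_stream):
    """Write text plus a newline to a binary stream, always as UTF-8."""
    wrapper = io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")
    wrapper.write(text + "\n")
    wrapper.flush()
    # Detach so the underlying stream stays open after the wrapper goes away.
    wrapper.detach()


if __name__ == "__main__":
    # Works for the terminal and for > / >> redirection alike,
    # whatever sys.stdout.encoding happens to be.
    write_utf8("kontroll på våpen", sys.stdout.buffer)
```

For files opened from Python, open(path, "w", encoding="utf-8") has the same effect, and setting the environment variable PYTHONIOENCODING=utf-8 before starting the script changes the encoding of sys.stdout without touching the code.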
