I have tried many ways to convert a string like b'\xef\xbb\xbf\xe5\x9b\xbd\xe9\x99\x85\xe5\x8f\x8b\xe8\xb0\x8a' into Chinese characters, but all of them failed.
Strangely, when I pass the literal directly to print:
print(b'\xef\xbb\xbf\xe5\x9b\xbd\xe9\x99\x85\xe5\x8f\x8b\xe8\xb0\x8a')
it shows the decoded Chinese characters.
But when I get the string by reading it from my CSV file, this doesn't work. No matter how I try to decode the string, it only shows b'\xef\xbb\xbf\xe5\x9b\xbd\xe9\x99\x85\xe5\x8f\x8b\xe8\xb0\x8a'
Here is my script:
import csv

with open('need_convert.csv', 'r+') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        new_row = ''.join(row)
        print('new_row:')
        print(type(new_row))
        print(new_row)
        print('convert:')
        print(new_row.decode('utf-8'))
Here is my data (csv file):
b'\xef\xbb\xbf\xe5\x9b\xbd\xe9\x99\x85\xe5\x8f\x8b\xe8\xb0\x8a'
b'\xef\xbb\xbf\xe9\xba\x92\xe9\xba\x9f\xe6\x9d\xaf'
b'\xef\xbb\xbf\xe5\x9b\xbd\xe9\x99\x85\xe5\x8f\x8b\xe8\xb0\x8a'
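The contents of row and new_row are both strings (str), not bytes objects: each CSV cell holds the text of a bytes literal, and a str has no .decode() method in Python 3. Below, I'm using exec('s=' + row[0]) to evaluate that text as a bytes literal, assuming the input is safe.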
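import csv

with open('need_convert.csv', 'r+') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        # row[0] is a str holding the text of a bytes literal
        print(type(row[0]), row[0])
        # Evaluate that text as Python code so that s becomes real bytes
        exec('s=' + row[0])
        print(type(s), s)
        print(s.decode('utf-8'))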
Output:
<class 'str'> b'\xef\xbb\xbf\xe5\x9b\xbd\xe9\x99\x85\xe5\x8f\x8b\xe8\xb0\x8a'
<class 'bytes'> b'\xef\xbb\xbf\xe5\x9b\xbd\xe9\x99\x85\xe5\x8f\x8b\xe8\xb0\x8a'
国际友谊
<class 'str'> b'\xef\xbb\xbf\xe9\xba\x92\xe9\xba\x9f\xe6\x9d\xaf'
<class 'bytes'> b'\xef\xbb\xbf\xe9\xba\x92\xe9\xba\x9f\xe6\x9d\xaf'
麒麟杯
<class 'str'> b'\xef\xbb\xbf\xe5\x9b\xbd\xe9\x99\x85\xe5\x8f\x8b\xe8\xb0\x8a'
<class 'bytes'> b'\xef\xbb\xbf\xe5\x9b\xbd\xe9\x99\x85\xe5\x8f\x8b\xe8\xb0\x8a'
国际友谊
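If the input cannot be assumed safe, one possible alternative to exec is ast.literal_eval from the standard library, which only evaluates Python literals and refuses arbitrary code. The following is a minimal sketch under some assumptions, not part of the original answer: the helper name decode_bytes_literal is made up for illustration, an in-memory io.StringIO stands in for need_convert.csv, and the 'utf-8-sig' codec is used so that the leading BOM (\xef\xbb\xbf) is dropped from each decoded value.

```python
import ast
import csv
import io


def decode_bytes_literal(text):
    # Hypothetical helper: parse the text of a bytes literal and decode it.
    value = ast.literal_eval(text)  # evaluates literals only, never arbitrary code
    if not isinstance(value, bytes):
        raise ValueError('expected a bytes literal, got %s' % type(value).__name__)
    # 'utf-8-sig' decodes UTF-8 and strips a leading BOM if one is present
    return value.decode('utf-8-sig')


# Stand-in for need_convert.csv so the sketch runs without the file.
sample_rows = [
    r"b'\xef\xbb\xbf\xe5\x9b\xbd\xe9\x99\x85\xe5\x8f\x8b\xe8\xb0\x8a'",
    r"b'\xef\xbb\xbf\xe9\xba\x92\xe9\xba\x9f\xe6\x9d\xaf'",
]
csvfile = io.StringIO('\n'.join(sample_rows))

decoded = [decode_bytes_literal(row[0]) for row in csv.reader(csvfile) if row]
print(decoded)
```

With a real file, the io.StringIO object would be replaced by open('need_convert.csv', newline=''), and the rest should work unchanged as long as each row has the bytes literal in its first column.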