
Python Conversion from int to Windows-1252

I am currently writing a program that reads data from a serial port, adds some header information, and then writes this data to a .jpg file.

I need to write to the file in Windows-1252 encoding, yet the method by which I construct the data and the header works with hexadecimal values.

I realised my problem when comparing the picture that should have been written with what was actually written: the DOUBLE LOW-9 QUOTATION MARK characters were not written as quotes but as zeros.

The decimal code for that symbol is 132 (0x84). If I use chr(0x84) I get the following error:

UnicodeEncodeError: 'charmap' codec can't encode character '\x84' in position 0: character maps to <undefined>

That only makes sense if chr() was trying to map to the Latin-1 code set. I have tried to convert the int to a Unicode character, but from my research chr() is the only function that does this.

I have also tried to use the struct package in Python.

import struct
a = 123;
b = struct.pack("c",a)
print(b)

I get the error:

Traceback (most recent call last):
  File "python", line 3, in <module>
struct.error: char format requires a bytes object of length 1

Reading past questions, answers and documentation gets quite confusing, as Python 2 and Python 3 answers are mixed in with people converting to ASCII (which obviously wouldn't work).

I am using Python 3.4.3 (the latest version) on a Windows 7 machine.

UnicodeEncodeError: 'charmap' codec can't encode character '\x84'

\x84 is the encoding of the lower quotes character in Windows-1252. This suggests your data is already encoded, and you should not try to encode it again. In a text string the quote should show up as "\u201e" (the character „). "\x84" (the result of chr(132)) is actually a control character, which Windows-1252 cannot encode; that is why the 'charmap' codec raises the error.
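To make the distinction concrete, here is a minimal sketch (the variable names are only illustrative) that compares the code point produced by chr(0x84) with the character obtained by decoding the byte 0x84 as Windows-1252:

```python
import unicodedata

ch = chr(0x84)                            # code point U+0084, not the cp1252 byte
quote = b"\x84".decode("windows-1252")    # byte 0x84 interpreted as cp1252

print(unicodedata.category(ch))           # 'Cc' means control character
print(unicodedata.name(quote))            # official Unicode name of the quote
print(hex(ord(quote)))                    # code point of the decoded character

# Encoding the control character as cp1252 fails, as in the question.
try:
    ch.encode("windows-1252")
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True
print(encode_failed)
```

The failure is most likely what happens when the result of chr(0x84) is printed or written to a text-mode file whose encoding is cp1252.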

You should have either bytes, which you can decode to a string:

>>> b"\x84".decode('windows-1252')
'\u201e'

Or you should have a text string, which you can encode to a byte string:

>>> "\u201e".encode('windows-1252')
b'\x84'
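If the starting point is an int rather than a string, as in the question, one possible approach is to build the byte directly and skip chr() altogether. The sketch below assumes the value fits in a single byte (0 to 255). It also shows that the struct format "B" (unsigned char) accepts an int, whereas "c" expects a bytes object of length 1:

```python
import struct

value = 0x84  # 132, the cp1252 code of the double low-9 quotation mark

as_bytes = bytes([value])           # a bytes object holding the single byte 0x84
packed = struct.pack("B", value)    # "B" = unsigned char, takes an int

print(as_bytes == packed)
print(as_bytes.decode("windows-1252") == "\u201e")
```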

If you read data from somewhere, you could use the struct module like this:

import struct

# suppose we download some data:
data = b'*\x00\x00\x00abcde'

# "I5s": one unsigned int (4 bytes) followed by a 5-byte string
a, txt = struct.unpack("I5s", data)
print(txt.decode('windows-1252'))
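For the use case in the question (header plus serial data written to a .jpg file), a sketch along the following lines may help. It assumes the data is kept as bytes throughout and the file is opened in binary mode, so no text encoding is involved. The header, payload and path values are hypothetical placeholders, not taken from the question:

```python
import os
import tempfile

header = bytes([0xFF, 0xD8])            # hypothetical header bytes built from ints
payload = bytes([0x84, 0x00, 0x7B])     # stand-in for bytes read from the serial port

path = os.path.join(tempfile.mkdtemp(), "image.jpg")  # hypothetical output path

with open(path, "wb") as f:             # "wb": bytes are written unchanged
    f.write(header + payload)

with open(path, "rb") as f:             # read back to check what landed on disk
    written = f.read()
print(written)
```

Because the file is opened with "wb", the byte 0x84 reaches the disk as-is; a viewer that interprets the file as Windows-1252 text would display it as the low quotation mark.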
