
Problem reading hex data from image - python automatically converts to a string

I am reading in an image one byte at a time with read(1), and appending each byte to a list. The image data is all hex data. When I print out the list with the print function, it is in the format '\xd7':

['\xd7', '\xd7', '\xd7', '\xd7', '\xd7', '\xd7', '\xd7',...]

The problem is that I now need to perform some calculations on this hex data. However, it is in string format, and this '\xd7' string format isn't supported by any of the int or hex conversion functions in python. They require a '0xd7' or just a 'd7'.

Thanks for the help

It's interpreting them as characters, so use ord to turn them into numbers. I.e. ord('\xd7') gives 215.
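A minimal sketch of that conversion, assuming Python 3 and an in-memory stream standing in for the image file (the sample bytes and the helper name to_ints are made up for illustration). In Python 3, read(1) on a binary stream returns a one-byte bytes object rather than a str, and ord accepts either:

```python
import io

def to_ints(chunks):
    """Convert a list of one-byte values (as returned by read(1)) to integers."""
    return [ord(c) for c in chunks]

# Hypothetical stand-in for the image file: three bytes of 0xd7.
stream = io.BytesIO(b"\xd7\xd7\xd7")

chunks = []
while True:
    c = stream.read(1)   # one byte at a time, as in the question
    if not c:            # an empty result means end of stream
        break
    chunks.append(c)

values = to_ints(chunks)
print(values)
```

Once the values are plain integers, ordinary arithmetic works on them directly.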

Also, if you use Windows, or the program might have to run on Windows, make sure that you've got the file open in binary mode: open("imagefile.png","rb"). It makes no difference on other operating systems.

You could do something like this to get them into a numeric array:

import array

data = array.array('B')  # array of unsigned bytes

with open("test.dat", 'rb') as infile:
    chunk = infile.read(100)  # read up to 100 raw bytes
    data.fromstring(chunk)    # Python 2; on Python 3 use data.frombytes(chunk)

print(data)
# array('B', [215, 215, 215, 215, 215, 215, 215])

read() can take a size value larger than 1: read(1024) will read 1K worth of bytes from the stream. That will be a lot faster than reading a byte at a time and appending it to the previous bytes.
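As a rough sketch of chunked reading, assuming Python 3 (the helper name read_all and the in-memory stream are made up for illustration; a real file opened with 'rb' would behave the same way):

```python
import io

def read_all(stream, chunk_size=1024):
    """Read a binary stream in chunk_size pieces and return the bytes as a bytearray."""
    data = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:          # empty result means end of stream
            break
        data.extend(chunk)
    return data

# Hypothetical stand-in for the image file: 3000 bytes of 0xd7.
stream = io.BytesIO(b"\xd7" * 3000)
data = read_all(stream)
print(len(data), data[0])
```

Indexing a bytearray yields integers, so no ord call is needed afterwards.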

What are you trying to do when printing the data? See the byte values, or display the image?

The data isn't in "string format", it's just bytes, but when you print them the print routine will escape non-printing values into something that will mean more to human eyes and brains. If you want to see the values without the escaping, you can iterate over the bytes and convert them to their hexadecimal values, or decimal, or binary - whatever works for you and your application. The string formatting mini-language will be a good starting place.
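For instance, a small sketch using the format mini-language to show each byte as hexadecimal, decimal and binary (the sample bytes are made up; wrapping them in bytearray makes iteration yield integers):

```python
raw = b"\xd7\x00\x41"   # made-up sample bytes

for b in bytearray(raw):
    # '#04x': hex with 0x prefix, '3d': decimal, '08b': zero-padded binary
    print("{0:#04x} {0:3d} {0:08b}".format(b))

# Bare two-digit hex strings, e.g. for further parsing or display
hex_strings = ["{:02x}".format(b) for b in bytearray(raw)]
print(hex_strings)
```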

If you require 'd7' or '0xd7', rather than simply 0xd7 (viz, 215), hex() or '%x' are your friend.

>>> ord('\xd7')
215
>>> ord('\xd7') == 215 == 0xd7
True
>>> hex(ord('\xd7'))
'0xd7'
>>> '%x' % ord('\xd7')
'd7'

Also, as observed in other answers, do make sure you open with the 'b' in the mode; otherwise it can get messed up, thinking it's UTF-8 or something like that, on certain sequences of bytes.
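A quick sketch of a binary-mode round trip, assuming a temporary file and made-up bytes. On Python 3, text mode also decodes the data to str on every platform, so the 'b' matters there regardless of operating system:

```python
import os
import tempfile

payload = b"\xd7\r\n\x1a\xd7"   # made-up bytes, including ones text mode may alter

fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    with open(path, "rb") as f:   # "rb": no newline translation, no decoding
        data = f.read()
finally:
    os.remove(path)

print(data == payload)
```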

If you are doing image processing, then you probably want to look at numpy.

There are a few packages that will help you read your image into memory too (PIL is mentioned above; others are my own mahotas, or scikits.image).

If the data is in a file as raw data and you know the dimensions, you can do the following

import numpy as np
img = np.empty( (n_rows, n_cols), dtype=np.uint8) # create an empty image
img.data[:] = input_file.read()

to get your data into img.
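An alternative sketch of the same idea using np.frombuffer, assuming Python 3 with numpy installed. The dimensions and the in-memory stand-in for input_file are made up for illustration:

```python
import io
import numpy as np

n_rows, n_cols = 2, 3                      # assumed, known image dimensions
input_file = io.BytesIO(bytes(range(6)))   # stand-in for the raw image file

raw = input_file.read(n_rows * n_cols)
# frombuffer gives a read-only view of the bytes; copy() makes it writable
img = np.frombuffer(raw, dtype=np.uint8).reshape(n_rows, n_cols).copy()
print(img.shape, img[1, 2])
```

The reshape raises an error if the number of bytes read does not match n_rows * n_cols, which is a cheap check that the dimensions are right.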

An introductory website for image processing in python is http://pythonvision.org.
