
File Stream - ValueError: embedded null byte

I'm trying to download a .png image via an HTTP request and upload it via HTTP to another location. My objective is to avoid saving the file to disk, so that it is processed in memory.

I have the code below:

  1. Download the file and convert it into a byte array:
resp = requests.get(
    'http://www.personal.psu.edu/crd5112/photos/PNG%20Example.png',
    stream=True)

img = BytesIO(resp.content)
  2. Upload the file to a remote HTTP repository:
data=open(img.getvalue()).read()

r = requests.post(url=url, data=data, headers=headers, auth=HTTPBasicAuth('user', 'user'))

I'm getting a ValueError exception "embedded null byte" when reading the byte array.

If I save the file to disk and load it as below, there is no error:

with open('file.png', 'wb') as pic:
  pic.write(img.getvalue())

Any advice on how I could achieve this without saving the file to disk?

Yes, you can do this without saving to disk. First, note that the error occurs on this line:

data=open(img.getvalue()).read()

The error occurs because the built-in open() expects a file path, while img.getvalue() returns the raw PNG bytes, which contain null bytes and are therefore rejected as a path. Use the Pillow library to handle image-related tasks:

from io import BytesIO
from PIL import Image    
img = BytesIO(resp.content)
-#data=open(img).read()
+data = Image.open(img)

This gives you an object of the following type:

<class 'PIL.PngImagePlugin.PngImageFile'>

You can use this data variable as the data in the upload POST request.

@AmilaMGunawardana Thanks for the pointer.

I just had to save the image into a separate byte stream to get it uploaded properly:

img = BytesIO(resp.content)
data = Image.open(img, 'r')
buf = BytesIO()
data.save(buf, 'PNG')
r = requests.post(url=url, data=buf.getvalue(), headers=headers, auth=HTTPBasicAuth('user', 'user'))
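If the image does not need to be modified, the Pillow round trip may not be necessary, since resp.content is already a bytes object that can be sent as the request body. A minimal sketch, assuming the remote repository accepts the raw file bytes as the body: relay_image is a hypothetical helper name, and http stands for anything with requests-style get and post methods (the requests module itself or a requests.Session).

```python
def relay_image(http, source_url, target_url, **post_kwargs):
    """Download source_url and POST its bytes to target_url without touching disk.

    `http` is assumed to expose requests-style get() and post(), for example
    the `requests` module or a `requests.Session` instance (hypothetical helper).
    """
    resp = http.get(source_url)
    resp.raise_for_status()          # stop early on a failed download
    # resp.content is already bytes, so it can be used directly as the body.
    return http.post(target_url, data=resp.content, **post_kwargs)


# Possible usage with the names from the question (url, headers, auth):
# import requests
# from requests.auth import HTTPBasicAuth
# r = relay_image(requests, source_url, url,
#                 headers=headers, auth=HTTPBasicAuth('user', 'user'))
```

Passing the HTTP client in as a parameter keeps the sketch independent of any particular session setup.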

I believe the embedded null byte error is caused by the underlying library expecting a filename as input for the operation your code performs. Wrapping the data in a BytesIO object presents it to that library "as if" it were a file.
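The filename requirement can be reproduced without any network access. The following is a small sketch using only the standard library; the png_bytes value is an illustrative stand-in (a PNG signature followed by the start of its first chunk), not data from the original post.

```python
from io import BytesIO

# Illustrative stand-in for downloaded image data: the PNG signature followed
# by the start of the IHDR chunk, which already contains null bytes.
png_bytes = b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"

# open() interprets its first argument as a file path. Paths may not contain
# null bytes, so the raw image data is rejected with a ValueError.
try:
    open(png_bytes)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# BytesIO wraps the same bytes in a file-like object instead of a path.
stream = BytesIO(png_bytes)
header = stream.read(8)
```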

Here is sample code that I used when trying to address this same issue with a tar file. This code should be able to satisfy most file input requirements for various other libraries.

The key I found was wrapping remote_file.content in a BytesIO object and passing it to tarfile.open as a file object. Other techniques I attempted did not work.

from io import BytesIO
import requests
import tarfile

remote_file = requests.get('https://download.site.com/files/file.tar.gz')

# Open the tarball directly from the downloaded bytes held in memory
tar = tarfile.open(fileobj=BytesIO(remote_file.content))
# Optionally print all folders / files within the tarball
print(tar.getnames())
# Extract the contents to the target directory on disk
tar.extractall('/home/users/Documents/target_directory/')

This eliminated the "ValueError: embedded null byte" and "expected str, bytes or os.PathLike object, not _io.BytesIO" errors that I was experiencing with other methods.
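For cases where nothing should be written to disk at all, the members of an archive can also be read straight from memory. The sketch below is an assumption-laden illustration rather than part of the original answer: it builds a tiny tar.gz in memory with a hypothetical build_archive helper so that it runs without a download, then reads a member through the fileobj parameter.

```python
import io
import tarfile


def build_archive() -> bytes:
    """Create a small gzip-compressed tarball in memory (hypothetical helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        payload = b"hello"
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


# Stand-in for remote_file.content from the example above.
archive_bytes = build_archive()

# fileobj= accepts a file-like object, so no path is involved at any point.
with tarfile.open(fileobj=io.BytesIO(archive_bytes)) as tar:
    names = tar.getnames()
    # extractfile() returns a file-like object for the member; nothing is
    # written to disk.
    content = tar.extractfile("hello.txt").read()
```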
