
How to load a zip file with pyscript and save into the virtual file system

I am trying to load a zip file and save it in the virtual file system for further processing with pyscript. In this example, I aim to open it and list its contents.

Here is how far I got:

See the self-contained HTML code below, adapted from tutorials (with thanks to the author, by the way).

It loads PyScript, lets the user select a file, and loads it (although apparently not in the right format). It also creates a dummy zip file, saves it to the virtual file system, and lists its contents. All of this works up front, and if I point the process_file function at that dummy zip file, it does open and list it.

The part that is NOT working is when I select, via the button/file selector, any valid zip file in the local file system. When the file is loaded into data it is text (UTF-8), and I get this error:

File "/lib/python3.10/zipfile.py", line 1353, in _RealGetContents
    raise BadZipFile("Bad magic number for central directory")
zipfile.BadZipFile: Bad magic number for central directory

I have tried saving to a file and loading it instead of using BytesIO, and I have tried variations using ArrayBuffer or Stream from here. I have also tried creating a FileReader and using readAsBinaryString() or readAsText() with various transformations, with the same result: either it fails to recognise the "magic number" or I get "not a zip file". When feeding in some streams or an arrayBuffer, I get variations of:

 TypeError: a bytes-like object is required, not 'pyodide.JsProxy' 

At this point I suspect there is something embarrassingly obvious that I am unable to see, so a fresh pair of eyes and any advice on how best (or most simply) to load a file would be much appreciated :) Many thanks in advance.

<!DOCTYPE html>
<html>

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <link rel="stylesheet" href="https://pyscript.net/alpha/pyscript.css" />
    <script defer src="https://pyscript.net/alpha/pyscript.js"></script>
    <title>Example</title>
</head>

<body>

    <p>Example</p>
    <br />
    <label for="myfile">Select a file:</label>
    <input type="file" id="myfile" name="myfile">
    <br />
    <br />
    <div id="print_output"></div>
    <br />
    <p>File Content:</p>
    <div style="border:2px inset #AAA;cursor:text;height:120px;overflow:auto;width:600px; resize:both">
        <div id="content">
        </div>
    </div>

    <py-script output="print_output">
        import asyncio
        import zipfile
        from js import document, FileReader
        from pyodide import create_proxy
        import io

        async def process_file(event):
            fileList = event.target.files.to_py()
            for f in fileList:
                data= await f.text()
                mf=io.BytesIO(bytes(data,'utf-8'))

            with zipfile.ZipFile(mf,"r") as zf:
                nl=zf.namelist()
                nlf=" _ ".join(nl)
                document.getElementById("content").innerHTML=nlf

        def main():
            # Create a Python proxy for the callback function
            # process_file() is your function to process events from FileReader
            file_event = create_proxy(process_file)
            # Set the listener to the callback
            e = document.getElementById("myfile")
            e.addEventListener("change", file_event, False)

            mf = io.BytesIO()
            with zipfile.ZipFile(mf, mode="w",compression=zipfile.ZIP_DEFLATED) as zf:
                zf.writestr('file1.txt', b"hi")
                zf.writestr('file2.txt', str.encode("hi"))
                zf.writestr('file3.txt', str.encode("hi",'utf-8'))  
            with open("a.txt.zip", "wb") as f: # use `wb` mode
                f.write(mf.getvalue())
            
            with zipfile.ZipFile("a.txt.zip", "r") as zf:
                nl=zf.namelist()
                nlf=" ".join(nl)

            document.getElementById("content").innerHTML = nlf


        main()
    </py-script>

</body>

</html>

You were very close with your code. The problem was in converting the file data to the correct data type. The requirement is to convert the arrayBuffer to a Uint8Array and then to a bytearray.
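To illustrate why reading the file as text damages the archive, here is a minimal sketch in plain Python (no browser needed). It assumes that `f.text()` behaves roughly like a UTF-8 decode that replaces invalid sequences; the helper names `build_sample_zip`, `roundtrip_as_text`, and `can_open_as_zip` are made up for this example.

```python
import io
import zipfile


def build_sample_zip() -> bytes:
    """Create a small in-memory zip archive and return its raw bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("file1.txt", b"hi")
    return buffer.getvalue()


def roundtrip_as_text(raw: bytes) -> bytes:
    """Approximate `bytes(await f.text(), 'utf-8')` from the question.

    Assumption: invalid UTF-8 sequences are replaced with U+FFFD while
    decoding, so re-encoding does not give back the original bytes.
    """
    return raw.decode("utf-8", errors="replace").encode("utf-8")


def can_open_as_zip(data: bytes) -> bool:
    """Return True if zipfile can read the archive's central directory."""
    try:
        with zipfile.ZipFile(io.BytesIO(data), "r") as zf:
            zf.namelist()
        return True
    except zipfile.BadZipFile:
        return False


raw = build_sample_zip()
damaged = roundtrip_as_text(raw)

print("bytes unchanged after text round trip:", raw == damaged)
print("original opens as zip:", can_open_as_zip(raw))
print("round-tripped copy opens as zip:", can_open_as_zip(damaged))
```

A zip archive is binary data and usually contains bytes that are not valid UTF-8, so the text round trip changes both the content and the length of the data, and the offsets recorded in the archive no longer line up.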

Import the required JavaScript type:

from js import Uint8Array

Read the file data into an arrayBuffer and copy it to a new Uint8Array:

data = Uint8Array.new(await f.arrayBuffer())

Convert the Uint8Array to a bytearray, which BytesIO accepts:

mf = io.BytesIO(bytearray(data))
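Once the data is a Python bytearray, the rest is ordinary zipfile code. One possible way to organise it is to keep the zip handling in a plain function that takes any bytes-like object, so it can be tried outside the browser. The function name `list_zip_names` is hypothetical, and the sample archive below only stands in for the bytes that would come from the selected file.

```python
import io
import zipfile


def list_zip_names(raw) -> list:
    """Return the member names of a zip archive given as a bytes-like object."""
    with zipfile.ZipFile(io.BytesIO(raw), "r") as zf:
        return zf.namelist()


# In the PyScript handler the call would look roughly like this
# (browser-only, so it is shown as a comment):
#     data = Uint8Array.new(await f.arrayBuffer())
#     names = list_zip_names(bytearray(data))

# Stand-in for the uploaded file: a small archive built in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w") as zf:
    zf.writestr("file1.txt", b"hi")
    zf.writestr("file2.txt", b"hello")

names = list_zip_names(bytearray(buffer.getvalue()))
print(names)
```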

For reference, based on John Hanley's response (thanks again), here is the working code. It adds a demonstration of saving the data as binary in the virtual file system and loading it back from that file:

<!DOCTYPE html>
<html>

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <link rel="stylesheet" href="https://pyscript.net/alpha/pyscript.css" />
    <script defer src="https://pyscript.net/alpha/pyscript.js"></script>
    <title>File Example</title>
</head>

<body>

    <p>Example</p>
    <br />
    <label for="myfile">Select a file:</label>
    <input type="file" id="myfile" name="myfile">
    <br />
    <br />
    <div id="print_output"></div>
    <br />
    <p>File Content:</p>
    <div style="border:2px inset #AAA;cursor:text;height:120px;overflow:auto;width:600px; resize:both">
        <div id="content">
        </div>
    </div>

    <py-script output="print_output">
        import asyncio
        import zipfile
        from js import document, FileReader, Uint8Array
        from pyodide import create_proxy
        import io

        async def process_file(event):
            fileList = event.target.files.to_py()
            for f in fileList:
                data = Uint8Array.new(await f.arrayBuffer())
                mf = io.BytesIO(bytearray(data))
                with zipfile.ZipFile(mf,"r") as zf:
                    nl=zf.namelist()
                    nlf=" ".join(nl)
                    document.getElementById("content").innerText+= "\n Test 2: reading file from local file system: "+f.name+" content:"+nlf
                with open("b.zip","wb") as outb:
                    outb.write(bytearray(data))
                with zipfile.ZipFile("b.zip", "r") as zf:
                    nl=zf.namelist()
                    nlf=" ".join(nl)
                document.getElementById("content").innerText += "\n Test 3: reading the same file but first save it in virtual fs and read it: " + nlf
    
    

        def main():
            # Create a Python proxy for the callback function
            # process_file() is your function to process events from FileReader
            file_event = create_proxy(process_file)
            # Set the listener to the callback
            e = document.getElementById("myfile")
            e.addEventListener("change", file_event, False)

            mf = io.BytesIO()
            with zipfile.ZipFile(mf, mode="w",compression=zipfile.ZIP_DEFLATED) as zf:
                zf.writestr('file1.txt', b"hi")
                zf.writestr('file2.txt', str.encode("hi"))
                zf.writestr('file3.txt', str.encode("hi",'utf-8'))  
            with open("a.zip", "wb") as f: # use `wb` mode
                f.write(mf.getvalue())
            
            with zipfile.ZipFile("a.zip", "r") as zf:
                nl=zf.namelist()
                nlf=" ".join(nl)

            document.getElementById("content").innerText = "Test 1: reading a dummy zip from virtual file system: " + nlf


        main()
    </py-script>

</body>

</html>
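The save-then-reload step in the listing above does not depend on the browser: it is a binary write followed by opening the archive by path. A small sketch of just that step, runnable with a regular Python interpreter, could look like the following. The function name `save_and_list` is invented here, and a temporary directory stands in for the virtual file system.

```python
import io
import os
import tempfile
import zipfile


def save_and_list(raw, path: str) -> list:
    """Write bytes-like data to `path` in binary mode, then list the archive."""
    with open(path, "wb") as out:  # "wb" keeps the bytes exactly as they are
        out.write(raw)
    with zipfile.ZipFile(path, "r") as zf:
        return zf.namelist()


# Stand-in for the uploaded data: a small archive built in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("file1.txt", b"hi")

# Assumption: a temporary directory plays the role of the virtual file system.
with tempfile.TemporaryDirectory() as tmp_dir:
    names = save_and_list(buffer.getvalue(), os.path.join(tmp_dir, "b.zip"))

print(names)
```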
