
PowerShell - Reading file from .zip archive fails

I'm currently trying to read the contents of a specific .xml file in a .zip archive without extracting it.
The code is simple, but somehow a couple of extra bytes slip into the buffer, making it impossible to use the contents of the file.

This is the relevant code:

    [void] [System.Reflection.Assembly]::LoadWithPartialName("System.IO.Compression.FileSystem")
    $arch = [System.IO.Compression.ZipFile]::OpenRead("C:\file.zip")

    $entr = $arch.Entries | ?{$_.Name -like "test.xml"}
    if(!$entr)
    {throw [System.Exception] "Could not find the .xml file"}

    $buf = New-Object System.Byte[]($entr.Length)
    $entr.Open().Read($buf, 0, $entr.Length) | Out-Null

    $xml = [xml] ([System.Text.Encoding]::Unicode.GetString($buf))

The code is pretty straightforward, I'd say, but sadly the first two bytes of $buf always seem to be 255 and 254, which causes PowerShell's XML parser to throw an exception.
As a temporary workaround I tried omitting the first two bytes, but that simply caused the same problem to occur with the last two bytes.

That leads me to my question: how is it possible that the buffer is messed up?
Is my way of doing this wrong? What did I miss?

Any help is highly appreciated!

UPDATE:

Well, it seems the file is encoded as UTF-16 (which Windows uses as its internal encoding), which would mean that the first two bytes, 255 and 254 (0xFF 0xFE), are the little-endian Byte Order Mark (BOM). I would expect the GetString() method to recognize the BOM. Could someone clarify this?
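To illustrate the behaviour in question, here is a minimal sketch that uses a hand-built buffer instead of the real archive (the `<a/>` payload and the variable names `$bom`, `$raw`, `$hasBomChar` and `$text` are made up for the example). It suggests that `Encoding.Unicode.GetString()` decodes the BOM as an ordinary character (U+FEFF) instead of dropping it, and that the preamble can be skipped explicitly with the offset/count overload:

```powershell
# Hypothetical stand-in for $buf: the UTF-16 LE BOM (0xFF 0xFE) followed by "<a/>".
$enc = [System.Text.Encoding]::Unicode
$bom = $enc.GetPreamble()                      # 255, 254 for UTF-16 LE
$buf = [byte[]]($bom + $enc.GetBytes('<a/>'))

# GetString() does not strip the BOM; it shows up as the first character, U+FEFF.
$raw = $enc.GetString($buf)
$hasBomChar = ([int]($raw[0]) -eq 0xFEFF)

# Skip the preamble by decoding from an offset, then parse the result.
$text = $enc.GetString($buf, $bom.Length, $buf.Length - $bom.Length)
$xml  = [xml]$text
```

This only addresses the leading BOM. It assumes the buffer was filled completely, which a single `Read()` call on a stream does not guarantee.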

You'll want to wrap your Stream in a StreamReader and then use the ReadToEnd() method. I'd expect that to respect the BOM:

    $reader = New-Object System.IO.StreamReader($entr.Open())
    $contents = $reader.ReadToEnd()
    $reader.Close()
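For a self-contained way to try this out, the following sketch builds a small archive in memory rather than opening `C:\file.zip` (the in-memory archive, the sample XML and the `$ms`, `$zip` and `$writer` names are assumptions for the example, not part of the original code). It writes a UTF-16 `test.xml` entry with a BOM, then reads it back through a `StreamReader`, whose default constructor detects the encoding from the byte order mark:

```powershell
Add-Type -AssemblyName System.IO.Compression

# Build a tiny archive in memory so the example does not depend on a real file.
$ms  = New-Object System.IO.MemoryStream
$zip = New-Object System.IO.Compression.ZipArchive($ms, [System.IO.Compression.ZipArchiveMode]::Create, $true)
$writer = New-Object System.IO.StreamWriter($zip.CreateEntry('test.xml').Open(), [System.Text.Encoding]::Unicode)
$writer.Write('<root><item>hello</item></root>')   # written as UTF-16 LE with a BOM
$writer.Dispose()
$zip.Dispose()

# Read the entry back without extracting it to disk.
$ms.Position = 0
$arch = New-Object System.IO.Compression.ZipArchive($ms, [System.IO.Compression.ZipArchiveMode]::Read)
try {
    $entr = $arch.Entries | Where-Object { $_.Name -eq 'test.xml' }
    if (!$entr) { throw [System.Exception] "Could not find the .xml file" }

    # StreamReader consumes the BOM and picks the matching encoding.
    $reader = New-Object System.IO.StreamReader($entr.Open())
    try     { $contents = $reader.ReadToEnd() }
    finally { $reader.Dispose() }
}
finally { $arch.Dispose() }

$xml = [xml]$contents
```

Reading to the end of the stream also avoids relying on a single `Read()` call to fill a buffer of `$entr.Length` bytes, since a stream is allowed to return fewer bytes than requested.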
