How to download only a single file from an online ZIP archive via PowerShell?

I want to download only a single file from an online ZIP archive via PowerShell. For this I created demo code that already works, but I am still struggling to get the parsing logic for the ZIP central directory right. Here is the code I have so far:

# demo code downloading a single DLL file from an online ZIP archive
# and extracting the DLL into memory for mounting it to the main process.

cls
Remove-Variable * -ea 0

# definition for the ZIP archive, the file to be extracted and the checksum:
$url = 'https://github.com/sshnet/SSH.NET/releases/download/2020.0.1/SSH.NET-2020.0.1-bin.zip'
$sub = 'net40/Renci.SshNet.dll'
$md5 = '5B1AF51340F333CD8A49376B13AFCF9C'

# prepare HTTP client:
Add-Type -AssemblyName System.Net.Http
$handler = [System.Net.Http.HttpClientHandler]::new()
$client  = [System.Net.Http.HttpClient]::new($handler)

# get the length of the ZIP archive:
$req = [System.Net.HttpWebRequest]::Create($url)
$req.Method = 'HEAD'
$length = $req.GetResponse().ContentLength
$zip = [byte[]]::new($length)

# get the last 10k:
# how to get the correct length of the central ZIP directory here?
$start = $length-10kb
$end   = $length-1
$client.DefaultRequestHeaders.Add('Range', "bytes=$start-$end")
$result = $client.GetAsync($url).Result
$last10kb = $result.content.ReadAsByteArrayAsync().Result
$last10kb.CopyTo($zip, $start)

# get the block containing the DLL file:
# how to get the exact file-offset from the ZIP directory?
$start = $length-3537kb
$end   = $length-3201kb
$client.DefaultRequestHeaders.Clear()
$client.DefaultRequestHeaders.Add('Range', "bytes=$start-$end")
$result = $client.GetAsync($url).Result
$block = $result.content.ReadAsByteArrayAsync().Result
$block.CopyTo($zip, $start)

# extract the DLL file from archive:
Add-Type -AssemblyName System.IO.Compression
$stream = [System.IO.Memorystream]::new()
$stream.Write($zip,0,$zip.Length)
$archive = [System.IO.Compression.ZipArchive]::new($stream)
$entry = $archive.GetEntry($sub)
$ms = [System.IO.MemoryStream]::new()
$entry.Open().CopyTo($ms)   # a single Read() call may return fewer bytes than requested
$bytes = $ms.ToArray()

# check MD5:
$prov = [Security.Cryptography.MD5CryptoServiceProvider]::new().ComputeHash($bytes)
$hash = [string]::Concat($prov.foreach{$_.ToString("x2")})
if ($hash -ne $md5) {write-host 'dll has wrong checksum.' -f y ;break}

# load the DLL:
[void][System.Reflection.Assembly]::Load($bytes)

# use the single demo-call from the DLL:
$test = [Renci.SshNet.NoneAuthenticationMethod]::new('test')
'done.'

The only open point in this code is the correct way to determine the length of the central directory at the end of the ZIP archive and the file offset of the single file to be extracted (in my code I found the ranges by pure trial and error).

I already checked the Wikipedia article https://en.wikipedia.org/wiki/ZIP_(file_format)#Structure and the PKWARE definitions https://gist.github.com/steakknife/820b73ebf25146180198febdb6f0e183, but apart from the block definitions I could not find a programmatic approach to get the offsets of the EOCD and of the individual file. Can someone help here, please?
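
The EOCD lookup asked about here could be sketched roughly as follows. `Find-ZipEocd` is a hypothetical helper name, not part of any library. It assumes the tail bytes of the archive are already in memory, that the archive is not ZIP64 (where the 32-bit fields hold 0xFFFFFFFF), and that the machine is little-endian, which is what `BitConverter` relies on here. The field offsets follow the EOCD layout in the documents linked above.

```powershell
# Hypothetical helper: locate the End of Central Directory (EOCD) record in the
# tail bytes of a ZIP file and return the central directory's length and offset.
function Find-ZipEocd {
    param([byte[]]$Tail)

    # The EOCD is at least 22 bytes long, so start at the last possible position
    # and scan backwards for the signature bytes 50 4B 05 06 ("PK\x05\x06").
    for ($i = $Tail.Length - 22; $i -ge 0; $i--) {
        if ($Tail[$i] -eq 0x50 -and $Tail[$i + 1] -eq 0x4B -and
            $Tail[$i + 2] -eq 0x05 -and $Tail[$i + 3] -eq 0x06) {
            # The comment length at offset 20 must account for every byte that
            # follows the record; otherwise this is a chance match inside data.
            $commentLength = [BitConverter]::ToUInt16($Tail, $i + 20)
            if ($i + 22 + $commentLength -eq $Tail.Length) {
                return [pscustomobject]@{
                    EntryCount = [BitConverter]::ToUInt16($Tail, $i + 10)
                    CdLength   = [BitConverter]::ToUInt32($Tail, $i + 12)
                    CdStart    = [BitConverter]::ToUInt32($Tail, $i + 16)
                }
            }
        }
    }
    return $null   # no EOCD found in the supplied bytes
}
```

Since the archive comment can be up to 65535 bytes long, a tail of 22 + 65535 bytes would cover the worst case; a much smaller tail is usually enough for archives without a comment.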

After a couple of additional tests I came to this solution:

# demo code downloading a single DLL file from an online ZIP archive
# and extracting the DLL into memory to mount it finally to the main process.

cls
Remove-Variable * -ea 0

# definition for the ZIP archive, the file to be extracted and the checksum:
$url = 'https://github.com/sshnet/SSH.NET/releases/download/2020.0.1/SSH.NET-2020.0.1-bin.zip'
$sub = 'net40/Renci.SshNet.dll'
$md5 = '5B1AF51340F333CD8A49376B13AFCF9C'

'prepare HTTP client:'
Add-Type -AssemblyName System.Net.Http
$handler = [System.Net.Http.HttpClientHandler]::new()
$client  = [System.Net.Http.HttpClient]::new($handler)

'get the length of the ZIP archive:'
# don't use System.Net.HttpWebRequest here, it frequently hangs:
$req = [System.Net.Http.HttpRequestMessage]::new('HEAD', $url)
$result = $client.SendAsync($req).Result
$zipLength = $result.Content.Headers.ContentLength
$zip = [byte[]]::new($zipLength)
$req.Dispose()

'get the last 10k:'
$start = $zipLength-10kb
$end   = $zipLength-1
$client.DefaultRequestHeaders.Add('Range', "bytes=$start-$end")
$result = $client.GetAsync($url).Result
$last10kb = $result.content.ReadAsByteArrayAsync().Result
$last10kb.CopyTo($zip, $start)

"get the 'End of CD' block:"
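$enc = [System.Text.Encoding]::GetEncoding(28591)
$end = $enc.GetString($last10kb, $last10kb.Length-256, 256)
# (?s) lets '.' also match byte 0x0A, which may occur inside the EOCD fields:
$eocd = [regex]::Match($end, '(?s)PK\x05\x06.*').value
$eocd = $enc.GetBytes($eocd)
if ($eocd.Length -lt 22) {write-host 'no EOCD record found in the last 256 bytes.' -f y ;break}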

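'get the central directory:'
$cdLength = [bitconverter]::ToUInt32($eocd, 12)
$cdStart  = [bitconverter]::ToUInt32($eocd, 16)
# the directory is copied from the 10k tail, so it must lie completely inside it:
if ($cdStart -lt $start) {write-host 'central directory starts before the downloaded tail.' -f y ;break}
$cd = [byte[]]::new($cdLength)
[array]::Copy($zip, $cdStart, $cd, 0, $cdLength)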

'search all file headers for correct file name:'
# the split removes the 4-byte signature, so all header offsets below are shifted by 4.
$found = $false
$fileHeaders = [regex]::Split($enc.GetString($cd),'PK\x01\x02')
foreach ($header in $fileHeaders) {
    $len = $header.Length
    if ($len -ge 42) {
        $bytes = $enc.GetBytes($header)
        $nameLength = [bitconverter]::ToUInt16($bytes, 24)
        if ($nameLength -eq $sub.length -and ($nameLength + 42) -le $len) {
            $name = $header.Substring(42, $nameLength)
            if ($name -eq $sub) {
                # compressed size plus 256 bytes of slack for the local file header:
                $size   = [bitconverter]::ToUInt32($bytes, 16) + 256
                $start  = [bitconverter]::ToUInt32($bytes, 38)
                $found  = $true
                break
            }
        }
    }
}
# $start still holds the tail offset from above and 0 is a valid file offset, so test a flag:
if (!$found) {write-host 'we could not find file in the ZIP archive' -f y ;break}

'get the block containing the file:'
$end   = $start+$size
$client.DefaultRequestHeaders.Clear()
$client.DefaultRequestHeaders.Add('Range', "bytes=$start-$end")
$result = $client.GetAsync($url).Result
$block = $result.content.ReadAsByteArrayAsync().Result
$block.CopyTo($zip, $start)
$client.dispose()

'extract the DLL file from archive:'
Add-Type -AssemblyName System.IO.Compression
$stream = [System.IO.Memorystream]::new()
$stream.Write($zip,0,$zip.Length)
$archive = [System.IO.Compression.ZipArchive]::new($stream)
$entry = $archive.GetEntry($sub)
$ms = [System.IO.MemoryStream]::new()
$entry.Open().CopyTo($ms)   # a single Read() call may return fewer bytes than requested
$bytes = $ms.ToArray()

'check MD5:'
$prov = [Security.Cryptography.MD5CryptoServiceProvider]::new().ComputeHash($bytes)
$hash = [string]::Concat($prov.foreach{$_.ToString("x2")})
if ($hash -ne $md5) {write-host 'dll has wrong checksum.' -f y ;break}

'load the DLL:'
[void][System.Reflection.Assembly]::Load($bytes)

'use the single demo-call from the DLL:'
$test = [Renci.SshNet.NoneAuthenticationMethod]::new('test')
'done.'
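
As an alternative to splitting the central directory with a regex, the headers could be walked record by record using their length fields, which avoids false splits when the bytes `PK\x01\x02` happen to occur inside a header. The following is only a sketch under some assumptions: `Find-ZipCdEntry` is a hypothetical helper name, the archive is not ZIP64, and entry names are decoded as UTF-8 (plain ASCII names such as the one used here decode identically in the legacy code page).

```powershell
# Hypothetical helper: walk the central directory bytes and return the entry
# whose name equals $Name, or $null if there is none.
function Find-ZipCdEntry {
    param([byte[]]$Cd, [string]$Name)

    $enc = [System.Text.Encoding]::UTF8
    $pos = 0
    # Each central file header has a fixed part of 46 bytes.
    while ($pos + 46 -le $Cd.Length) {
        # Stop if the record does not start with the signature 50 4B 01 02.
        if ([BitConverter]::ToUInt32($Cd, $pos) -ne 0x02014b50) { break }

        $nameLength    = [BitConverter]::ToUInt16($Cd, $pos + 28)
        $extraLength   = [BitConverter]::ToUInt16($Cd, $pos + 30)
        $commentLength = [BitConverter]::ToUInt16($Cd, $pos + 32)
        # Stop on a truncated directory instead of reading past the buffer.
        if ($pos + 46 + $nameLength -gt $Cd.Length) { break }

        $entryName = $enc.GetString($Cd, $pos + 46, $nameLength)
        if ($entryName -ceq $Name) {
            return [pscustomobject]@{
                Name             = $entryName
                Method           = [BitConverter]::ToUInt16($Cd, $pos + 10)
                CompressedSize   = [BitConverter]::ToUInt32($Cd, $pos + 20)
                UncompressedSize = [BitConverter]::ToUInt32($Cd, $pos + 24)
                LocalHeaderStart = [BitConverter]::ToUInt32($Cd, $pos + 42)
            }
        }
        # Jump to the next header: fixed part plus the three variable fields.
        $pos += 46 + $nameLength + $extraLength + $commentLength
    }
    return $null
}
```

With the variables of the script above, `Find-ZipCdEntry $cd $sub` would return an object whose `LocalHeaderStart` is the start of the byte range and whose `CompressedSize` is the length of the compressed data. The local file header in front of the data is 30 bytes plus its own name and extra-field lengths, and its extra field may differ from the one in the central directory, so the end of the range still needs either a safety margin (like the `+ 256` above) or a first look at the local header.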
