
Downloading certain files using PowerShell produces corrupt files

I wrote a PowerShell script that crawls nirsoft.net and downloads all of the software hosted on the site to my local machine; the full script is included below. Most of the downloads complete successfully, but several files do not: each of those ends up as a corrupt 4 KB file.

For those of you who are familiar with NirSoft's software, the tools are very powerful, but they are also constantly misidentified as dangerous because of the password cracking tools. My guess is that, because PowerShell's Invoke-WebRequest cmdlet uses Internet Explorer's engine for its core functionality, Internet Explorer is flagging the files as dangerous and refusing to download them, which causes PowerShell to fail to download them as well. Downloading each of the corrupt files manually in Internet Explorer seemed to confirm this: it marked them all as malicious.

However, this is where things get strange. To get around this limitation, I tried a variety of other ways to download the files from within my script, such as a pure .NET object ( (New-Object System.Net.WebClient).DownloadFile("url","file") ) and some third-party command-line tools (wget for Windows, wget in Cygwin, etc.), but none of them produced a non-corrupt file.

So I want to know whether there is a way around this, and why even third-party tools are affected. Is there some kind of rule that any scripting tool has to use Internet Explorer's engine in order to connect to the internet? Thanks in advance.

One last thing before the script: here is the URL of one of the files that I am having difficulty downloading via PowerShell, which you can use to run individual tests rather than the whole script: [link]

And without further ado, here is the script. Thanks again:

$VerbosePreference = "Continue"
$DebugPreference = "Continue"
$present = $true
$subdomain = $null
$prods = (Invoke-WebRequest "https://www.nirsoft.net/utils/index.html").links
Foreach ($thing in $prods)
{
    If ($thing.Innertext -match "([A-Za-z]|\s)+v\d{1,3}\.\d{1,3}(.)*")
    {
        If ($thing.href.Contains("/"))
        {
            
        }
        $page = Invoke-WebRequest "https://www.nirsoft.net/utils/$($thing.href)"
        If ($thing.href -like "*dot_net_tools*")
        {
            $prodname = $thing.innerText.Trim().Split(" ")
        }
        Else
        {
            $prodname = $thing.href.Trim().Split(".")
        }
        $newlinks = $page.links | Where-Object {$_.Innertext -like "*Download*" -and ($_.href.endswith("zip") -or $_.href.endswith("exe"))}
       # $page.ParsedHtml.title
        #$newlinks.href
        Foreach ($item in $newlinks)
        {
            $split = $item.href.Split("/")
            If ($item.href -like "*toolsdownload*")
            {
                Try
                {
                    Write-host "https://www.nirsoft.net$($item.href)"
                    Invoke-WebRequest "https://www.nirsoft.net$($item.href)" -OutFile "$env:DOWNLOAD\test\$($split[-1])" -ErrorAction Stop
                }
                Catch
                {
                    Write-Host $thing.href -ForegroundColor Red
                }
            }
            elseif ($item.href.StartsWith("http") -and $item.href.Contains(":"))
            {
                Try
                {
                    Write-host "$($item.href)"
                    Invoke-WebRequest $item.href -OutFile "$env:DOWNLOAD\test\$($split[-1])" -ErrorAction Stop
                }
                Catch
                {
                    Write-Host "$($item.href)" -ForegroundColor Red
                }
            }
            Elseif ($thing.href -like "*/dot_net_tools*")
            {
                Try
                {
                    Invoke-WebRequest "https://www.nirsoft.net/dot_net_tools/$($item.href)" -OutFile "$env:DOWNLOAD\test\$($split[-1])" -ErrorAction Stop
                }
                Catch
                {
                    Write-Host $thing.href -ForegroundColor Red
                }
            }
            Else
            {
                Try
                {
                    Write-Host "https://www.nirsoft.net/utils/$($item.href)"
                    Invoke-WebRequest "https://www.nirsoft.net/utils/$($item.href)" -OutFile "$env:DOWNLOAD\test\$($item.href)" -ErrorAction Stop
                }
                Catch
                {
                     Write-Host $thing.href -ForegroundColor Red
                }
            }
            If ($item.href.Contains("/"))
            {
                If (!(Test-Path "$env:DOWNLOAD\test\$($split[-1])"))
                {
                    $present = $false
                }
            }
            Else
            {
                If (!(Test-Path "$env:DOWNLOAD\test\$($item.href)"))
                {
                    $present = $false
                }
            }
        }
    }
}

If ($present)
{
    Write-Host "All of the files were downloaded!!!" -ForegroundColor Green
}
Else
{
    Write-Host "Not all of the files downloaded.  Something went wrong." -ForegroundColor Red
}

You have two separate issues.

For anything Defender flags, it doesn't matter which tool you use to save the file to disk. You can simply add an exclusion for the download directory in Defender.

The other issue, as Guenther pointed out, is that you need to send a Referer header on at least some of the downloads. With the following changes I was able to download them all.

$VerbosePreference = "Continue"
$DebugPreference = "Continue"
$present = $true
$subdomain = $null
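$path = "C:\temp\downloadtest"   # must be quoted; an unquoted path is parsed as a command and fails
New-Item $path -ItemType Directory -ErrorAction SilentlyContinue | Out-Null

# Exclude the download folder from Defender scanning (requires an elevated session)
Add-MpPreference -ExclusionPath $path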

$prods = (Invoke-WebRequest "https://www.nirsoft.net/utils/index.html").links
Foreach ($thing in $prods)
{
    If ($thing.Innertext -match "([A-Za-z]|\s)+v\d{1,3}\.\d{1,3}(.)*")
    {
        If ($thing.href.Contains("/"))
        {
        
        }
        $page = Invoke-WebRequest "https://www.nirsoft.net/utils/$($thing.href)"
        If ($thing.href -like "*dot_net_tools*")
        {
            $prodname = $thing.innerText.Trim().Split(" ")
        }
        Else
        {
            $prodname = $thing.href.Trim().Split(".")
        }
        $newlinks = $page.links | Where-Object {$_.Innertext -like "*Download*" -and ($_.href.endswith("zip") -or $_.href.endswith("exe"))}
       # $page.ParsedHtml.title
        #$newlinks.href
        Foreach ($item in $newlinks)
        {
            $split = $item.href.Split("/")
            If ($item.href -like "*toolsdownload*")
            {
                Try
                {
                    Write-host "https://www.nirsoft.net$($item.href)"
                    Invoke-WebRequest "https://www.nirsoft.net$($item.href)" -OutFile "$path\$($split[-1])" -ErrorAction Stop -Headers @{Referer="https://www.nirsoft.net$($item.href)"}
                }
                Catch
                {
                    Write-Host $thing.href -ForegroundColor Red
                }
            }
            elseif ($item.href.StartsWith("http") -and $item.href.Contains(":"))
            {
                Try
                {
                    Write-host "$($item.href)"
                    Invoke-WebRequest $item.href -OutFile "$path\$($split[-1])" -ErrorAction Stop -Headers @{Referer="$($item.href)"}
                }
                Catch
                {
                    Write-Host "$($item.href)" -ForegroundColor Red
                }
            }
            Elseif ($thing.href -like "*/dot_net_tools*")
            {
                Try
                {
                    Invoke-WebRequest "https://www.nirsoft.net/dot_net_tools/$($item.href)" -OutFile "$path\$($split[-1])" -ErrorAction Stop -Headers @{Referer="https://www.nirsoft.net/dot_net_tools/$($item.href)"}
                }
                Catch
                {
                    Write-Host $thing.href -ForegroundColor Red
                }
            }
            Else
            {
                Try
                {
                    Write-Host "https://www.nirsoft.net/utils/$($item.href)"
                    Invoke-WebRequest "https://www.nirsoft.net/utils/$($item.href)" -OutFile "$path\$($item.href)" -ErrorAction Stop -Headers @{Referer="https://www.nirsoft.net/utils/$($item.href)"}
                }
                Catch
                {
                     Write-Host $thing.href -ForegroundColor Red
                }
            }
            If ($item.href.Contains("/"))
            {
                If (!(Test-Path "$path\$($split[-1])"))
                {
                    $present = $false
                }
            }
            Else
            {
                If (!(Test-Path "$path\$($item.href)"))
                {
                    $present = $false
                }
            }
        }
    }
}

If ($present)
{
    Write-Host "All of the files were downloaded!!!" -ForegroundColor Green
}
Else
{
    Write-Host "Not all of the files downloaded.  Something went wrong." -ForegroundColor Red
}
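
To see which of the saved files are real downloads and which are the small corrupt ones from the question, it can help to look at the first two bytes: ZIP archives start with PK and Windows executables start with MZ. The following is only a sketch; Get-FileSignature is a hypothetical helper name, not part of the script above, and treating anything else as "unknown" is a simplifying assumption.

```powershell
# Hypothetical helper: classify a file by its first two bytes.
function Get-FileSignature {
    param([Parameter(Mandatory)] [string] $Path)

    # Resolve to a full path, because .NET uses the process working
    # directory rather than the current PowerShell location.
    $full   = (Resolve-Path -LiteralPath $Path).ProviderPath
    $buffer = New-Object byte[] 2
    $stream = [System.IO.File]::OpenRead($full)
    try {
        $read = $stream.Read($buffer, 0, 2)
    }
    finally {
        $stream.Dispose()
    }

    if ($read -lt 2) { return 'unknown' }

    $magic = [System.Text.Encoding]::ASCII.GetString($buffer)
    switch -CaseSensitive ($magic) {
        'PK'    { return 'zip' }      # ZIP local file header
        'MZ'    { return 'exe' }      # DOS/PE executable header
        default { return 'unknown' }  # e.g. an HTML page saved under a .zip name
    }
}

# Possible usage after the crawl, assuming $path is the download folder:
# Get-ChildItem $path -File | Where-Object { (Get-FileSignature $_.FullName) -eq 'unknown' }
```

Files reported as unknown are the ones worth retrying with the Referer header, while files that are missing altogether are more likely to have been removed by Defender.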

I'd also recommend you turn the download routine into a function that you can pass the relative URL portion so you don't have to repeat code several times.
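
A rough sketch of what that refactoring could look like is below. The function names Resolve-DownloadUrl and Save-LinkedFile are made up for illustration. Instead of the four If/Elseif branches, this version swaps in System.Uri to resolve each scraped href against the page it came from, which is a different technique from the script above and has not been tested against the whole site. The Referer is set to the download URL itself, as in the script above.

```powershell
# Hypothetical helper: turn a scraped href into an absolute URL.
function Resolve-DownloadUrl {
    param(
        [Parameter(Mandatory)] [string] $PageUrl,  # page the link was found on
        [Parameter(Mandatory)] [string] $Href      # href exactly as scraped
    )
    # System.Uri handles absolute, root-relative and page-relative hrefs alike.
    return [Uri]::new([Uri]$PageUrl, $Href).AbsoluteUri
}

# Hypothetical helper: download one linked file, returning $true on success.
function Save-LinkedFile {
    param(
        [Parameter(Mandatory)] [string] $PageUrl,
        [Parameter(Mandatory)] [string] $Href,
        [Parameter(Mandatory)] [string] $Destination  # existing target directory
    )
    $url  = Resolve-DownloadUrl -PageUrl $PageUrl -Href $Href
    $file = Join-Path $Destination ($Href.Split('/')[-1])
    try {
        Write-Verbose $url
        # Same Referer choice as the script above: the download URL itself.
        Invoke-WebRequest $url -OutFile $file -Headers @{ Referer = $url } -ErrorAction Stop
        return $true
    }
    catch {
        Write-Host $url -ForegroundColor Red
        return $false
    }
}
```

Inside the inner Foreach loop, the four Try/Catch blocks and the Test-Path checks could then collapse to something like: If (-not (Save-LinkedFile -PageUrl "https://www.nirsoft.net/utils/$($thing.href)" -Href $item.href -Destination $path)) { $present = $false }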
