
PowerShell Script Efficiency

I use PowerShell as much as possible for quick and easy scripting tasks. At work I often use it for data parsing, sifting through log files, or creating CSV/text files.

One thing I can't figure out is why it can be so inefficient at certain data/IO tasks. I suspect it has to do with how it handles pipelines under the hood, or with something I haven't understood yet.

If you take the following logic to generate ABC123-style IDs, compile it as C# from PowerShell with Add-Type, and execute it, it takes less than 1 minute to complete:

$source = @'
    public static System.Collections.Generic.List<String> GetIds()
    {
        System.Collections.Generic.List<String> retValue = new System.Collections.Generic.List<String>();
        for (int left = 97; left < 123; left++)
        {
            for (int middle = 97; middle < 123; middle++)
            {
                for (int right = 97; right < 123; right++)
                {
                    for (int i = 1; i < 1000; i++)
                    {
                        String tmp = String.Format("{0}{1}{2}000", (char)left, (char)middle, (char)right);
                        retValue.Add(String.Format("{0}{1}", tmp.Substring(0, tmp.Length - i.ToString().Length), i));
                    }
                }
            }
        }
        return retValue;
    }
'@
$util = Add-Type -Name "Utils" -MemberDefinition $source -PassThru -Language CSharp

$start = get-date
$ret = $util::GetIds()
Write-Host ("Time: {0} minutes" -f ((get-date)-$start).TotalMinutes)

Now take the same logic and run it as plain PowerShell, without compiling it into an assembly, and it takes hours to complete:

$start = Get-Date
$retValue = @()
for ($left = 97; $left -lt 123; $left++)
{ 
    for ($middle = 97; $middle -lt 123; $middle++)
    { 
        for ($right = 97; $right -lt 123; $right++)
        { 
            for ($i = 1; $i -lt 1000; $i++)
            { 
                $tmp = ("{0}{1}{2}000" -f [char]$left, [char]$middle, [char]$right)
                $retValue += ("{0}{1}" -f $tmp.Substring(0, $tmp.Length - $i.ToString().Length), $i)
            }
        }
    }
}
Write-Host ("Time: {0} minutes" -f ((get-date)-$start).TotalMinutes)

Why is that? Is there some sort of excessive type casting or inefficient operation I am using that slows down performance?

You're killing your performance right here:

$retValue += ("{0}{1}" -f $tmp.Substring(0, $tmp.Length - $i.ToString().Length), $i)

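Array addition is a very "expensive" operation. PowerShell arrays are fixed-size, so += creates a brand new array every time, composed of the original array plus the new element, and copies every existing element into it. Your loop does that 17,558,424 times (26 × 26 × 26 × 999), so the total amount of copying grows quadratically with the number of IDs.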
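To see that += really builds a new array rather than growing the existing one, here is a minimal sketch. The variable names ($a, $before, $sameObject) are made up for illustration and are not part of the original script.

```powershell
# A small array; .NET arrays have a fixed length.
$a = 1, 2, 3
$before = $a          # second reference to the same array object

$a += 4               # allocates a new 4-element array and copies 1, 2, 3 into it

# $before still points at the old 3-element array, so these are two different objects.
$sameObject = [object]::ReferenceEquals($a, $before)
"Same object: $sameObject; old length: $($before.Length); new length: $($a.Length)"
```

The old array is left untouched and a new one takes its place, which is why doing this inside a loop with millions of iterations gets slower with every element added.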

Edit: This kind of array addition is not only inefficient but also unnecessary. Simply let the loop emit each value to the output stream and assign the result of the whole for statement to the variable; PowerShell collects the output into an array for you.

$start = Get-Date
$retValue =
for ($left = 97; $left -lt 123; $left++)
{ 
    for ($middle = 97; $middle -lt 123; $middle++)
    { 
        for ($right = 97; $right -lt 123; $right++)
        { 
            for ($i = 1; $i -lt 1000; $i++)
            { 
                $tmp = ("{0}{1}{2}000" -f [char]$left, [char]$middle, [char]$right)
                "{0}{1}" -f $tmp.Substring(0, $tmp.Length - $i.ToString().Length), $i
            }
        }
    }
}
Write-Host ("Time: {0} minutes" -f ((get-date)-$start).TotalMinutes)
Time: 1.866812045 minutes
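If you would rather keep an explicit collection, as the C# version does with List<String>, a generic list with .Add() also avoids the repeated copying. Below is a rough comparison sketch of the three approaches at a much smaller scale. The $count value, the "id$i" strings, and the variable names are assumptions chosen so the slow variant finishes quickly; actual timings depend on your machine and PowerShell version, so none are shown.

```powershell
$count = 5000   # hypothetical, reduced size so the += variant finishes quickly

# 1. Array += : allocates and copies a whole new array on every iteration.
$sw = [System.Diagnostics.Stopwatch]::StartNew()
$viaArray = @()
for ($i = 1; $i -le $count; $i++) { $viaArray += "id$i" }
$tArray = $sw.Elapsed

# 2. Generic list: Add() appends in place and only occasionally resizes its buffer.
$sw = [System.Diagnostics.Stopwatch]::StartNew()
$viaList = New-Object 'System.Collections.Generic.List[string]'
for ($i = 1; $i -le $count; $i++) { $viaList.Add("id$i") }
$tList = $sw.Elapsed

# 3. Output capture: the loop emits values and PowerShell gathers them into an array.
$sw = [System.Diagnostics.Stopwatch]::StartNew()
$viaOutput = for ($i = 1; $i -le $count; $i++) { "id$i" }
$tOutput = $sw.Elapsed

"Array +=       : {0} ms" -f $tArray.TotalMilliseconds
"List.Add       : {0} ms" -f $tList.TotalMilliseconds
"Output capture : {0} ms" -f $tOutput.TotalMilliseconds
```

All three produce the same 5000 strings; only the way they are accumulated differs. Raising $count should make the gap between the first variant and the other two widen quickly.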

