PowerShell script efficiency advice

This is my first time posting here, but I've long been a lurker every time I needed help with some code.

I'm fairly new to PowerShell and, like most people, I've been teaching myself by coding whenever I can, so most of what I write is ugly but works. I'd like to ask for advice on a script I wrote recently. At the time of this post it has been running for 15 hours and is at almost 50% of what it has to do. Obviously something is wrong, but I'm not knowledgeable enough to pinpoint what it is, and any help will be greatly appreciated.

I have a telephony .csv with compiled data from January 2020 and some days of February. Each row has the date and the time spent on a status; since a person uses several statuses over the day, the file has one row per status. My script is supposed to go through the file, find the minimum date, and then save all the data for the same day into a new file, so I'll end up with one file for 01-01-2020, one for 02-01-2020, and so on. But it has been running for 15 hours and it's still at 1/22.

The column I'm using for the dates is called "DateFull", and this is the script:

write-host "opening file" 
$AT= import-csv “C:\Users\xxxxxx\Desktop\SignOnOff_20200101_20200204.csv” 
write-host "parsing and sorting file" 
$go= $AT| ForEach-Object {
        $_.DateFull= (Get-Date $_.DateFull).ToString("M/d/yyyy")
        $_
        }

Write-Host "prep day"
$min = $AT | Measure-Object -Property Datefull  -Minimum  

Write-Host $min
$dateString =  [datetime] $min.Minimum
Write-host $datestring

write-host "Setup dates"
$start = $DateString - $today
$start = $start.Days

For ($i=$start; $i -lt 0; $i++)  {
$date = get-date
$loaddate = $date.AddDays($i) 
$DateStr = $loadDate.ToString("M/d/yyyy")
$now = Get-Date -Format HH:mm:ss
write-host $datestr " " $now

#Install-Module ImportExcel #optional import if you dont have the module already
$Check = $at | where {$_.'DateFull' -eq $datestr} 
write-host $check.count
if ($check.count -eq 0 ){}
else {$AT | where {$_.'DateFull' -eq $datestr} | Export-Csv "C:\Users\xxxxx\Desktop\signonoff\SignOnOff_$(get-date (get-date).addDays($i) -f yyyyMMdd).csv" -NoTypeInformation}
}

$at = '' 

Thank you so much for your help

The first loop doesn't make much sense. It loops through the CSV contents and converts each row's date into a different format. Afterwards, $go is never used.

$go= $AT| ForEach-Object {
        $_.DateFull= (Get-Date $_.DateFull).ToString("M/d/yyyy")
        $_
        }
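One detail worth checking, sketched here with hypothetical sample rows and an explicit invariant culture (both are assumptions, not part of the original script): ForEach-Object hands the loop body the same row objects that $AT holds, so assigning to $_.DateFull rewrites the rows in $AT as well. The loop therefore does have an effect on the later comparisons; only the copy kept in $go is redundant.

```powershell
# Hypothetical sample rows standing in for the imported CSV
$AT = @'
Agent,DateFull
a,2020-01-01 08:15:00
b,2020-01-02 09:30:00
'@ | ConvertFrom-Csv

# Invariant culture keeps parsing and the "/" separator independent of regional settings
$inv = [cultureinfo]::InvariantCulture

$go = $AT | ForEach-Object {
    # Assigning to $_ changes the row object that $AT also refers to
    $_.DateFull = ([datetime]::Parse($_.DateFull, $inv)).ToString('M/d/yyyy', $inv)
    $_
}

# The rows in $AT now carry the reformatted date, even though only $go was assigned
$AT[0].DateFull
```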

Later, there is an attempt to calculate a value from an uninitialized variable. $today is never defined.

$start = $DateString - $today

It looks, however, like you'd like to calculate, in days, how old the oldest record is.
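A minimal sketch of that calculation, assuming $today was meant to hold the current date. The two dates below are hypothetical sample values so the result is reproducible; in the real script $oldest would come from $min.Minimum and $today from (Get-Date).Date.

```powershell
$inv = [cultureinfo]::InvariantCulture

# Hypothetical values: oldest record and "today"
$oldest = [datetime]::ParseExact('1/1/2020', 'M/d/yyyy', $inv)
$today  = [datetime]::ParseExact('2/4/2020', 'M/d/yyyy', $inv)   # normally (Get-Date).Date

# Subtracting two DateTime values yields a TimeSpan; Days is negative
# because the oldest record lies in the past
$start = ($oldest - $today).Days
$start
```

With a negative $start, the loop `For ($i=$start; $i -lt 0; $i++)` counts up from the oldest day towards today.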

Then there's a loop that counts from negative days to zero. During each iteration, the whole CSV is searched:

$Check = $at | where {$_.'DateFull' -eq $datestr} 

If there are 30 days and 15 000 rows, there are 30*15000 = 450 000 iterations. The work grows with days × rows, which is effectively O(n^2) when both grow together, so runtime will go sky high for even a relatively small number of days and rows.

The next part is that the same array is processed again:

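else {$AT | where {$_.'DateFull' -eq $datestr} | Export-Csv "C:\Users\xxxxx\Desktop\signonoff\SignOnOff_$(get-date (get-date).addDays($i) -f yyyyMMdd).csv" -NoTypeInformation}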

The search condition is exactly the same, but now the results are sent to a file. This has the side effect of doubling your work. Still, O(2n^2) => O(n^2), so at least the runtime isn't growing cubically or worse.
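The doubled scan can be avoided by keeping the filtered rows and exporting that result. A minimal sketch, assuming hypothetical sample rows and a hypothetical output path in the temp folder:

```powershell
# Hypothetical sample rows standing in for the imported CSV
$AT = @'
Agent,DateFull
a,1/1/2020
b,1/1/2020
c,1/2/2020
'@ | ConvertFrom-Csv

$DateStr = '1/1/2020'
# Hypothetical output path; the original script writes to a Desktop folder
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'SignOnOff_20200101.csv'

# Filter once; @() guarantees an array even for zero or one match
$Check = @($AT | Where-Object { $_.DateFull -eq $DateStr })

# Reuse the filtered rows instead of scanning $AT a second time
if ($Check.Count -gt 0) {
    $Check | Export-Csv -Path $outFile -NoTypeInformation
}
```

This halves the work per day but still scans the whole file once for every day.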

As for how to fix this, there are a few things. If you sort the CSV based on date, it can be processed afterwards in just a single run.

$at = $at | sort -Property datefull
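One caveat, sketched with hypothetical sample values: DateFull holds text, so Sort-Object orders it as text rather than by calendar date. Equal dates still end up next to each other, which is all a single pass needs, but the order is not necessarily chronological. If that matters, a script block that parses the date can serve as the sort key. In that case, detect a date change with -ne, because comparing the strings with -gt would miss a step such as 1/9 to 1/10.

```powershell
# Hypothetical sample dates in the M/d/yyyy text format used by the script
$dates = '1/9/2020', '1/10/2020', '1/2/2020'
$inv   = [cultureinfo]::InvariantCulture

# Sorted as text: not necessarily chronological
$textOrder = $dates | Sort-Object

# Sorted by the parsed date: chronological
$dateOrder = $dates | Sort-Object { [datetime]::ParseExact($_, 'M/d/yyyy', $inv) }
$dateOrder
```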

Then iterate over the rows. Since the rows are in ascending order, the first is the oldest and rows with the same date are adjacent. For each row, check whether the date has changed. If not, add the row to the buffer. If it has, save the old buffer and create a new one.

The sample doesn't convert file names to yyyyMMdd format, and it assumes there are only two columns, foo and datefull:

$sb = New-Object Text.StringBuilder
# The first row determines the first date
$current = $at[0]

# Loop through the sorted data
for ($i = 0; $i -lt $at.Count; ++$i) {

    # Are we on the next date? DateFull is a string, so test for inequality
    if ($at[$i].DateFull -ne $current.DateFull) {
        # Save the buffer of the previous date
        $file = "c:\temp\OnOff_{0}.csv" -f ($current.DateFull -replace '/', '.')
        Set-Content $file $sb.ToString()
        # Pick the current date and create a new buffer
        $current = $at[$i]
        $sb = New-Object Text.StringBuilder
    }
    # Add the row to the buffer of the current date
    [void]$sb.AppendLine(("{0},{1}" -f $at[$i].foo, $at[$i].DateFull))
}
# Save the final buffer
$file = "c:\temp\OnOff_{0}.csv" -f ($current.DateFull -replace '/', '.')
Set-Content $file $sb.ToString()
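As an alternative to the manual buffer, Group-Object can bucket the rows by date with a single call, and Export-Csv then keeps the header and every column. This is a swapped-in technique, not the method shown above; the sample rows and the output folder are hypothetical, and its speed on the real file has not been measured here.

```powershell
# Hypothetical sample rows standing in for the imported CSV
$at = @'
foo,DateFull
a,1/1/2020
b,1/2/2020
c,1/1/2020
'@ | ConvertFrom-Csv

# Hypothetical output folder under the temp directory
$outDir = Join-Path ([System.IO.Path]::GetTempPath()) 'signonoff_demo'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null

$inv = [cultureinfo]::InvariantCulture

# One group per distinct DateFull value; no per-day scan of the whole file
$at | Group-Object -Property DateFull | ForEach-Object {
    # Parse the group key so the file name can use yyyyMMdd
    $day  = [datetime]::ParseExact($_.Name, 'M/d/yyyy', $inv)
    $file = Join-Path $outDir ('SignOnOff_{0:yyyyMMdd}.csv' -f $day)
    # Export-Csv writes the header row and all columns of the group's rows
    $_.Group | Export-Csv -Path $file -NoTypeInformation
}
```

Because the rows are grouped rather than compared to their neighbours, this variant does not depend on the sort order at all.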
