
Count lines in zipped files using Windows PowerShell

There is a folder that contains more than 1000 zipped files. Each zipped file contains 12 other zipped files, each of which contains one CSV file. I need to count the total number of lines across all files...

It can be done using Windows PowerShell, but I am having trouble unzipping the files, counting the lines, and zipping them again in order to save disk space during the process.

$folderPath="C:\_Unzip_Folder";

Get-ChildItem $folderPath -recurse | %{ 

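    # compare the extension directly; the original pattern "^*.`.zip$" was not a valid way to match ".zip"
    if($_.Extension -eq ".zip")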
    {
        $parent="$(Split-Path $_.FullName -Parent)";    
        write-host "Extracting $($_.FullName) to $parent"

        $arguments=@("e", "`"$($_.FullName)`"", "-o`"$($parent)`"");
        $ex = start-process -FilePath "`"C:\Program Files\7-Zip\7z.exe`"" -ArgumentList $arguments -wait -PassThru;

        if( $ex.ExitCode -eq 0)
        {
            write-host "Extraction successful, deleting $($_.FullName)"
            Remove-Item -LiteralPath $_.FullName -Force
        }
    }
}

Get-ChildItem $folderPath -recurse -Filter *.csv | %{ 
    Get-Content $($_.FullName)  | Measure-Object -Line
}

cmd /c pause | out-null

Now it counts the lines of each file, but it would be easier if it summed them for me.

Can someone help me with this task?

Thank you all.
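
To get a single total from the files that are already extracted, one option is to keep a running counter instead of printing a Measure-Object result per file. This is only a minimal sketch: the function name Get-CsvLineTotal is hypothetical, and it assumes the CSV files have already been extracted somewhere below the given folder.

```powershell
# Hypothetical helper (not part of the original script):
# returns the total number of lines in all *.csv files below $Path.
function Get-CsvLineTotal {
    param([Parameter(Mandatory)][string]$Path)

    $total = 0
    foreach ($file in Get-ChildItem -Path $Path -Recurse -Filter *.csv -File) {
        # ReadLines streams the file line by line instead of loading it whole
        foreach ($line in [System.IO.File]::ReadLines($file.FullName)) {
            $total++
        }
    }
    $total
}
```

It could then be called as: Get-CsvLineTotal -Path $folderPath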

You could also keep everything in memory, like this:

Set-StrictMode -Version "Latest"
$ErrorActionPreference = "Stop"
$InformationPreference = "Continue"

Add-Type -Assembly "System.IO.Compression.FileSystem"

$folderPath = "C:\_Unzip_Folder\*.zip"
$files      = Get-ChildItem $folderPath -Recurse
$csvCount   = 0
$lineCount  = 0
$bufferSize = 1MB
$buffer     = [byte[]]::new($bufferSize)

foreach ($file in $files)
{
    Write-Information "Getting information from '$($file.FullName)'"

    $zip  = [System.IO.Compression.ZipFile]::OpenRead($file.FullName)
    $csvs = $zip.Entries | Where-Object { [System.IO.Path]::GetExtension($_.Name) -eq ".csv" }
    foreach ($csv in $csvs)
    {
        $csvCount++
        Write-Information "Counting lines in '$($csv.FullName)'"

        $stream = $csv.Open()
        try
        {
            $byteCount = $stream.Read($buffer, 0, $bufferSize)
            while ($byteCount)
            {
                for ($i = 0; $i -lt $byteCount; $i++)
                {
                    # assume line feed (LF = 10) is the end-of-line marker
                    # you could also use carriage return (CR = 13)
                    if ($buffer[$i] -eq 10) { $lineCount++ }
                }
                $byteCount = $stream.Read($buffer, 0, $bufferSize)
            }
        }
        finally
        {
            $stream.Close()
        }
    }
    # release the file handle on the archive before moving to the next one
    $zip.Dispose()
}

Write-Information "Counted a total of $lineCount line(s) in $csvCount CSV-file(s)"
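
The question mentions that each zip contains further zips, which in turn contain the CSV files. The script above only looks at CSV entries directly inside each top-level zip, so here is a rough sketch of how the same byte-counting idea might be extended to nested archives. The function name Measure-ZipCsvLines is hypothetical, and the sketch assumes each inner zip is small enough to be buffered in memory, because ZipArchive reads a non-seekable stream fully into memory before opening it.

```powershell
Add-Type -AssemblyName "System.IO.Compression"
Add-Type -AssemblyName "System.IO.Compression.FileSystem"

# Hypothetical helper: counts LF bytes in every .csv entry of an open archive
# and recurses into entries that are themselves .zip files.
function Measure-ZipCsvLines {
    param([System.IO.Compression.ZipArchive]$Archive)

    $count  = 0
    $buffer = [byte[]]::new(1MB)
    foreach ($entry in $Archive.Entries) {
        $extension = [System.IO.Path]::GetExtension($entry.Name)
        if ($extension -eq ".zip") {
            $stream = $entry.Open()
            try {
                # the inner archive is opened straight from the entry stream
                $inner = [System.IO.Compression.ZipArchive]::new(
                    $stream, [System.IO.Compression.ZipArchiveMode]::Read)
                try     { $count += Measure-ZipCsvLines -Archive $inner }
                finally { $inner.Dispose() }
            }
            finally { $stream.Dispose() }
        }
        elseif ($extension -eq ".csv") {
            $stream = $entry.Open()
            try {
                # same assumption as above: LF (10) marks the end of a line
                while (($read = $stream.Read($buffer, 0, $buffer.Length)) -gt 0) {
                    for ($i = 0; $i -lt $read; $i++) {
                        if ($buffer[$i] -eq 10) { $count++ }
                    }
                }
            }
            finally { $stream.Dispose() }
        }
    }
    $count
}

# Possible usage, mirroring the loop above:
#   $zip = [System.IO.Compression.ZipFile]::OpenRead($file.FullName)
#   try { $lineCount += Measure-ZipCsvLines -Archive $zip } finally { $zip.Dispose() }
```

Like the original, this counts line feed bytes, so a last line without a trailing line feed is not counted.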
