
Reading lines and getting the total line count after a known text in a file

I have a PowerShell script that reads lines and then writes the line count to a .csv file. The script skips the first 8 lines (the file header) and counts the remaining lines.

Script

$targDir = "C:\Logs\15Sec\Ref.Data\Month\";
$subDir = $(Get-ChildItem "$targDir" -Directory );
foreach($sub in $subDir) {
    Get-ChildItem -Path $sub.FullName -Recurse *.log |
    Select-Object -Property @(
        'FullName'
         @{ Name = "LineCount"; Expression = { 
             (Get-Content -Path $_.FullName | Select-Object -Skip 8 | Measure-Object).Count
         }} 
    ) |
    Export-Csv "C:\test\$($sub.Name).csv" -NoTypeInformation
}

Problem

Sometimes the file header isn't limited to the first 8 lines. It can extend to 11 or 15 lines, depending on the information/comments in the header.

This produces a wrong total, because only the lines after the end of the header should be counted.

Each file header ends with the text "END OF HEADER".

How can I modify the script to start counting lines only after the text "END OF HEADER"?

File Header

The actual file doesn't have the 1., 2., 3. numbering; I have added it here for line count reference only.

1.     3.03           MET DATA                                RINEX VERSION / TYPE
2. cnvtToRINEX 3.14.0  convertToRINEX OPR  20220511 072939 UTC PGM / RUN BY / DATE 
3. ----------------------------------------------------------- COMMENT            
4. XYXTY3                                                       MARKER NAME         
5. UBWTZ3                                                       MARKER NUMBER       
6.     3    PR    TD    HR                                    # / TYPES OF OBSERV 
7.  0083019.4060  0025010.0967  0253356.6176        0.0000 PR SENSOR POS XYZ/H    
8.                                                            END OF HEADER      
9. 19 03 02 00 00 00  946.0    8.5   93.0
10. 19 03 02 00 05 00  946.0    8.4   93.4
11. 19 03 02 00 10 00  946.0    8.4   93.4
12. 19 03 02 00 15 00  946.0    8.4   94.2

Instead of using Select-Object -Skip, use the intrinsic .Where() method in SkipUntil mode. This discards every line before the first line containing END OF HEADER. The result still includes that marker line itself, so the number of post-header lines is the count minus 1:

@{ Name = "LineCount"; Expression = { 
    (Get-Content -Path $_.FullName).Where({$_ -match '\bEND OF HEADER\b'}, 'SkipUntil').Count - 1
}}
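
One edge case worth guarding against: if a file has no END OF HEADER line at all, SkipUntil returns nothing and the expression above evaluates to -1. A minimal sketch of a helper that handles this is below. The function name Get-PostHeaderLineCount and its parameters are my own, hypothetical names, not part of the original script.

```powershell
# Hypothetical helper: counts the lines that follow the first marker line.
function Get-PostHeaderLineCount {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [string] $Marker = 'END OF HEADER'
    )

    # @(...) keeps the result an array even for empty or one-line files.
    $lines = @(Get-Content -LiteralPath $Path)

    # SkipUntil returns the first matching line and everything after it.
    $fromMarker = $lines.Where({ $_ -match [regex]::Escape($Marker) }, 'SkipUntil')

    if ($fromMarker.Count -eq 0) {
        # No marker found: report $null rather than a misleading -1.
        return $null
    }

    # Subtract 1 so the marker line itself is not counted.
    return $fromMarker.Count - 1
}
```

Inside the calculated property you could then write Expression = { Get-PostHeaderLineCount -Path $_.FullName }. For a file containing exactly the 12 lines shown in the question, this should return 4.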

Try the following:

$targDir = "C:\Logs\15Sec\Ref.Data\Month\"
$subDir = Get-ChildItem $targDir -Directory
foreach ($sub in $subDir) {
    Write-Host "sub = " $sub
    # One table (and one CSV) per subdirectory
    $table = [System.Collections.ArrayList]::new()
    $filenames = Get-ChildItem -Path $sub.FullName -Filter *.log -Recurse -Force
    foreach ($filename in $filenames) {
        Write-Host "filename = " $filename
        $text = @(Get-Content -Path $filename.FullName)
        # Line number of the first line containing the end-of-header marker
        $headerEnd = $text | Select-String -Pattern "END OF HEADER" -SimpleMatch | Select-Object -First 1
        $newRow = New-Object -TypeName psobject
        $newRow | Add-Member -NotePropertyName Filename -NotePropertyValue $filename.FullName
        # Lines after the header = total lines minus the marker's line number
        $newRow | Add-Member -NotePropertyName NumberLines -NotePropertyValue ($text.Count - $headerEnd.LineNumber)
        $table.Add($newRow) | Out-Null
    }
    $table | Export-Csv "C:\test\$($sub.Name).csv" -NoTypeInformation
}
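
Both approaches so far load each file fully into memory before counting. If the logs are large, a single streaming pass may be preferable. Here is a rough sketch using switch -File; the function name Measure-LinesAfterMarker is hypothetical, and it assumes a plain, case-sensitive substring match on the marker is good enough.

```powershell
# Hypothetical helper: one streaming pass, no full-file array in memory.
function Measure-LinesAfterMarker {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [string] $Marker = 'END OF HEADER'
    )

    $count = 0
    $seen  = $false

    # switch -File expects a file system path, so resolve it first.
    $full = (Resolve-Path -LiteralPath $Path).ProviderPath

    switch -File $full {
        default {
            if ($seen) {
                # Every line after the marker is a data line.
                $count++
            }
            elseif ($_.Contains($Marker)) {
                # First marker line: start counting from the next line.
                $seen = $true
            }
        }
    }

    # $null signals that the marker was never found.
    if ($seen) { $count } else { $null }
}
```

It can be dropped into the row-building code in place of the Select-String arithmetic, for example -NotePropertyValue (Measure-LinesAfterMarker -Path $filename.FullName).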

With the help of @MathiasR.Jessen, the correct line of code is:

@{ Name = "LineCount"; Expression = { 
    (Get-Content -Path $_.FullName).Where({$_ -match "END OF HEADER"}, 'SkipUntil').Count - 1
}}
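
For anyone who wants to see that line in context, here is a sketch of the original script with the calculated property swapped in. I have wrapped it in a function so the directories are parameters instead of hard-coded paths; the name Export-LogLineCount and the parameters TargetDir and OutputDir are hypothetical, chosen only for illustration.

```powershell
# Hypothetical wrapper around the script from the question.
function Export-LogLineCount {
    param(
        [Parameter(Mandatory)] [string] $TargetDir,  # e.g. C:\Logs\15Sec\Ref.Data\Month
        [Parameter(Mandatory)] [string] $OutputDir   # e.g. C:\test
    )

    foreach ($sub in Get-ChildItem -LiteralPath $TargetDir -Directory) {
        Get-ChildItem -LiteralPath $sub.FullName -Recurse -Filter *.log |
            Select-Object -Property @(
                'FullName'
                @{ Name = 'LineCount'; Expression = {
                    # Lines from the marker onward, minus the marker line itself
                    @(Get-Content -LiteralPath $_.FullName).Where({ $_ -match 'END OF HEADER' }, 'SkipUntil').Count - 1
                }}
            ) |
            # One CSV per subdirectory, named after the subdirectory
            Export-Csv -Path (Join-Path $OutputDir "$($sub.Name).csv") -NoTypeInformation
    }
}
```

Calling it with the paths from the question would look like Export-LogLineCount -TargetDir 'C:\Logs\15Sec\Ref.Data\Month' -OutputDir 'C:\test'. Note that a file without an END OF HEADER line will show up with a LineCount of -1.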


 