
PowerShell: delete lines of text from an HTML file

I have some reports in HTML files. I need to import them into Excel and make some changes, so I thought I could make those changes beforehand with PowerShell. Some of the lines to delete are at fixed positions; others are not, so the script needs to recognize them by pattern.

Fixed lines counting from the top: 12-14, 17, 19, 25-27, 30-32, 40-42

Fixed lines counting from the bottom: 3-13, 48-60

The pattern I need to find and delete is this:

<td align="center">random string</td>
<td align="left">random string</td>
<td align="left">random string</td>
<td align="left">random string</td>
<td align="right">random string</td>

For the fixed lines, I found I can do this:

(gc $maindir\Report23.HTML) | ? {(12..14) -notcontains $_.ReadCount} | out-file $maindir\Report23b.HTML

This works: it deletes lines 12-14. But I need to put the rest of the fixed line numbers into the same command, and I can't figure out how. Also, the output file is twice the size of the original, which I find odd. I tried Set-Content instead, which produces a file close to the original size but breaks the text encoding in certain places.

I have no idea how to go about recognizing the pattern, though.

Can't you do something like:

$lines = 12..14
$lines += 17, 19
$lines += 25..27
$lines += 30..32
$lines += 40..42

and then use that array in your where clause:

? {$lines -notcontains $_.ReadCount} 
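
Put together, the approach might look like the sketch below. It works on a small throwaway file rather than the real report, so the temporary path and the 20-line sample content are assumptions made purely for illustration.

```powershell
# Hypothetical sample: a 20-line throwaway file standing in for Report23.HTML.
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'readcount-demo.txt'
1..20 | ForEach-Object { "line $_" } | Set-Content -Path $sample

# Line numbers to drop, counted from the top (1-based, like ReadCount).
$lines  = 12..14
$lines += 17, 19

# Get-Content tags every line it emits with its 1-based line number in ReadCount.
$kept = Get-Content -Path $sample |
    Where-Object { $lines -notcontains $_.ReadCount }

$kept.Count   # 15 of the 20 lines remain
Remove-Item -Path $sample
```

Building the list in a variable first keeps the where clause short, and further ranges can be appended with `+=` in the same way.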

The output file is twice the size of the original because the original was probably ASCII-encoded, while Out-File in Windows PowerShell writes Unicode (UTF-16) by default. Try this:

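# Total line count, used to turn "from the bottom" positions into ReadCount values.
$length = (gc $maindir\Report23.HTML).length
# Note: this treats line k from the bottom as ReadCount $length - k; if the last line counts as line 1, add 1 to each bound.
$rangefrombottom = ($length-60)..($length-48)+($length-13)..($length-3)
$rangefromtop = 12..14+17,19+25..27+30..32+40..42
# Drop both sets of lines and write ASCII so the file size stays close to the original.
(gc $maindir\Report23.HTML) | ? {$rangefromtop -notcontains $_.ReadCount} | ? {$rangefrombottom -notcontains $_.ReadCount} | out-file -encoding ASCII $maindir\Report23b.HTML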
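
The snippets above only cover the fixed line numbers. For the pattern part of the question, one possible approach is to walk through the lines and skip any run of five consecutive lines that matches the center/left/left/left/right sequence. This is only a sketch: it assumes each `<td>` sits on its own line exactly as shown in the question, and the `$html`, `$block` and `$result` names and the sample input are hypothetical.

```powershell
# Hypothetical sample input; in practice this would come from Get-Content on the report.
$html = @(
    '<tr>'
    '<td align="center">a</td>'
    '<td align="left">b</td>'
    '<td align="left">c</td>'
    '<td align="left">d</td>'
    '<td align="right">e</td>'
    '</tr>'
    '<td align="left">kept, not part of a full block</td>'
)

# One regex per line of the block, in the order the lines must appear.
$block = 'center', 'left', 'left', 'left', 'right' |
    ForEach-Object { '^\s*<td align="' + $_ + '">.*</td>\s*$' }

$result = New-Object 'System.Collections.Generic.List[string]'
$i = 0
while ($i -lt $html.Count) {
    # A block can only start here if five lines are still available.
    $isBlock = ($i + $block.Count) -le $html.Count
    for ($j = 0; $isBlock -and $j -lt $block.Count; $j++) {
        if ($html[$i + $j] -notmatch $block[$j]) { $isBlock = $false }
    }
    if ($isBlock) {
        $i += $block.Count        # skip the whole five-line block
    } else {
        $result.Add($html[$i])    # keep this line
        $i++
    }
}

$result   # the three lines outside the block
```

A lone `<td align="left">` line is kept because only the complete five-line sequence counts as a match. The result can then be piped to `out-file -encoding ASCII` as in the command above.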

