I have a script running in PowerShell (v2) that removes strings from a file.
The basic process is:
(Get-Content $Local_Dir1\$filename1) -replace 'longString', 'shortString' | `
Set-Content $cfg_Local_Dir\$filename1
Get-Content $Local_Dir1\$filename1 | `
Where-Object {$_ -notmatch 'stringToMatch'} | `
Where-Object {$_ -notmatch 'secondStringToMatch'} | `
Set-Content $Local_Dir1\$filename
This works fine. However, I have an annoying string that I can't get rid of.
It basically consists of a carriage return and line feed, 4 spaces, and then another carriage return and line feed. In hex it is 0D 0A 20 20 20 20 0D 0A.
How can I remove this?
I tried simply:
Where-Object {$_ -notmatch '    '} # 4 x spaces
But that removed all content after that line (and this is on the second line).
I looked at:
Where-Object {$_ -notmatch '$([char]0x0D)'}
(I would have expanded it if it had removed all the Carriage Returns) which I saw in another post somewhere, but that did nothing.
What is the correct way of dealing with this problem?
Additional: 2015-11-24 13:49
Example Data:
<?xml version="1.0" encoding="UTF-8"?>
<start_of_data>
<job>123456</job>
<name>ABC123</name>
<start></start>
</start_of_data>
<start_of_data>
<job>789012</job>
<name>DEF345</name>
<start></start>
</start_of_data>
Initially there is a string on line 2 which is removed by 'stringToMatch', and the spaces are on line 3.
A couple of things are worth pointing out here. When you use -match / -notmatch you are using regex. We can consolidate your strings and the space issue into one pattern.
Get-Content $Local_Dir1\$filename1 |
Where-Object {$_ -notmatch 'stringToMatch|secondStringToMatch|\s{4,}'} |
Set-Content $Local_Dir1\$filename
That works by using alternation to match any of the elements separated by pipes. This is by no means perfect, as we don't have sample data to work with, but any line containing either of those two strings or at least 4 consecutive white-space characters will be omitted.
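To make the caveat concrete, here is a small sketch run against made-up lines (the sample values are hypothetical, not from the real file). Note that a line that is merely indented by 4 spaces gets dropped too, which may or may not be what you want.

```powershell
# Hypothetical sample lines standing in for the file content
$lines = @(
    '<job>123456</job>',
    'stringToMatch here',
    '    ',
    'secondStringToMatch too',
    '    <start></start>',
    '<name>ABC123</name>'
)

# Keep only lines that match none of the three alternatives.
# \s{4,} matches 4 or more consecutive white-space characters anywhere in the line,
# so the indented <start> line is removed as well.
$kept = @($lines | Where-Object { $_ -notmatch 'stringToMatch|secondStringToMatch|\s{4,}' })

$kept
```

Only the `<job>` and `<name>` lines survive here. If indented content has to be kept, an anchored pattern such as `^\s{4}$` is the safer choice.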
From talking in the comments and looking at the example file, you are just trying to omit lines that are blank. A .NET string method or a regex could do that. The two lines below function differently, but both would ignore lines that are just white-space.
![string]::IsNullOrWhiteSpace($_)
-notmatch '^\s+$'
I would opt for the former as it is more intuitive.
Where-Object {![string]::IsNullOrWhiteSpace($_) -and $_ -notmatch 'stringToMatch|secondStringToMatch'}
Like I said in the comments, if you are picky about this requirement you could filter out lines with exactly 4 white-space characters with -notmatch '^\s{4}$'
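As a rough comparison of the anchored patterns, the sketch below runs them over a few made-up lines. Two assumptions of mine: `^\s*$` is an extra variant that also drops completely empty lines, and the example sticks to regex because `[string]::IsNullOrWhiteSpace` was only added in .NET Framework 4, which a default PowerShell v2 session does not load.

```powershell
# Hypothetical input: text, an empty line, 4 spaces, a tab, padded text
$lines = @('a', '', '    ', "`t", '  b  ')

# Drops lines made of one or more white-space characters; the empty line stays
$noWhitespaceOnly = @($lines | Where-Object { $_ -notmatch '^\s+$' })

# Drops only lines that are exactly 4 white-space characters long
$noExactlyFour = @($lines | Where-Object { $_ -notmatch '^\s{4}$' })

# Variant (my addition): also drops the completely empty line
$noBlank = @($lines | Where-Object { $_ -notmatch '^\s*$' })

$noWhitespaceOnly.Count   # 3
$noExactlyFour.Count      # 4
$noBlank.Count            # 2
```

Which one fits depends on whether empty lines and tab-only lines should go as well.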
Also, like sodawillow says, you should have used double quotes to allow variable expansion. Since you are using regex, \r would have worked just as well.
Where-Object {$_ -notmatch "$([char]0x0D)"}
However, I don't think you would have seen that character anyway in order to exclude it. Get-Content would scrub that out to make a string array. That might depend on encoding.
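A quick way to check this is to write a small test file containing the byte sequence from the question and look at what Get-Content returns. The temp file and its contents are made up for illustration. The last part is an alternative sketch, not something from the original script: it reads the file as a single string so the line terminators are still there to match.

```powershell
# Build a hypothetical test file: "first", CR LF, 4 spaces, CR LF, "last", CR LF
$path = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllText($path, "first`r`n    `r`nlast`r`n")

# Get-Content returns one string per line, with the CR LF terminators stripped
$lines = @(Get-Content $path)
$hasCr = @($lines | Where-Object { $_ -match "\r" }).Count -gt 0

$lines.Count   # 3
$hasCr         # False

# Alternative: read the whole file as one string and remove the sequence directly
$raw   = [System.IO.File]::ReadAllText($path)
$clean = $raw -replace '\r\n {4}\r\n', "`r`n"

Remove-Item $path
```

With the line-by-line approach the 4 spaces show up as a line of their own, which is why filtering on the line content is enough and matching on the carriage return finds nothing.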
Try .Net String class:
Where-Object {-not[string]::IsNullOrEmpty(([string]$_).trim())}
Trim will remove spaces and IsNullOrEmpty will check the rest.
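A minimal sketch of that filter on made-up lines (the sample values are hypothetical). Both methods exist in the .NET versions that PowerShell v2 runs on, so this should work there as far as I can tell.

```powershell
# Hypothetical sample lines: two with content, plus empty, spaces-only and tab-only lines
$lines = @('<job>123456</job>', '', '    ', "`t", '<name>ABC123</name>')

# Trim() strips leading and trailing white-space; IsNullOrEmpty then catches
# anything that was empty or consisted only of white-space
$kept = @($lines | Where-Object { -not [string]::IsNullOrEmpty(([string]$_).Trim()) })

$kept.Count   # 2
```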