
How to correctly build RegEx for multiline values in reg file

I would like to get values from a .reg file (a REG EXPORT file) so I can compare them to those in another .reg file. I'm having trouble writing the regex for this.

Facts that make this harder for me:

  1. I don't know which registry value types are used in the file (that's why I want to build a regex for all the different types: string, dword, qword, multistring, ...)
  2. I don't know if the last character in the file is a newline or not
  3. I would like to only return the actual value, eg fa,ad,df,fa,ad,df,fa,ad if the regkey is "qword"=hex(b):fa,ad,df,fa,ad,df,fa,ad
$Text = @'
[HKEY_LOCAL_MACHINE\SOFTWARE\Test]
"String"="asfasdfasasfasdfasasfasdfasasfas"
"Binary"=hex:d3,45,34,53,45,34,53,45,34,53,45,34,53,45,34,53,45,34,5b,09,89,08,\
34,09,8a,ef,02,30,40,9a,ad,fa,d0
"DWORD"=dword:fefefefe
"multistring"=hex(7):61,00,62,00,6c,00,61,00,73,00,66,00,62,00,00,00,62,00,61,\
  00,6c,00,73,00,66,00,62,00,61,00,73,00,64,00,66,00,00,00,62,00,61,00,6c,00,\
  73,00,64,00,66,00,61,00,64,00,6c,00,66,00,00,00,61,00,73,00,64,00,66,00,61,\
  00,73,00,64,00,66,00,00,00,61,00,73,00,64,00,66,00,00,00,61,00,73,00,64,00,\
  00,00,66,00,61,00,73,00,64,00,00,00,66,00,61,00,73,00,64,00,66,00,61,00,73,\
  00,66,00,61,00,73,00,64,00,66,00,00,00,61,00,73,00,64,00,66,00,61,00,73,00,\
  64,00,66,00,61,00,73,00,64,00,00,00,61,00,73,00,64,00,66,00,61,00,73,00,64,\
  00,66,00,00,00,00,00
"qword"=hex(b):fa,ad,df,fa,ad,df,fa,ad
'@

# this one works
$key = "multistring"
$regex = ('(?ms)\"{0}\"=hex\(7\):(.+)\n' -f [RegEx]::Escape($key))
[regex]::Matches($Text, $regex) | foreach { $_.Groups[1].Value }

# this one does not work because there is no newline after the last line...
$key2 = "qword"
$regex2 = ('(?ms)\"{0}\"=hex\(b\):(.+)\n' -f [RegEx]::Escape($key2))
[regex]::Matches($Text, $regex2) | foreach { $_.Groups[1].Value } 

.+ is a greedy expression, and the modifier (?s) makes the . match all characters (including newlines), so (.+)\n will match everything up to the last newline.

Try something like this:

$regex = '"{0}"=hex\(b\):(.+(?:\n  .+)*)'

You need neither (?m) nor (?s) here, because you don't want . to match newlines, and you don't need ^ or $ to match at line boundaries inside the multiline value. .+(?:\n  .+)* matches the rest of the line after the prefix hex(b): and all subsequent lines beginning with two consecutive spaces. The (?:...) is just a non-capturing group, since there's no need to capture each line in a separate group.
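As a rough sketch of how this pattern could be used, the snippet below wraps it in a small helper. The function name Get-RegHexValue, its parameters, and the shortened sample data are assumptions made for illustration; they are not part of the original answer. The helper also strips the trailing backslashes and indentation so the value comes back on one line.

```powershell
# Shortened, hypothetical sample; continuation lines are indented with two
# spaces, and there is no newline after the last line.
$sample = @(
    '[HKEY_LOCAL_MACHINE\SOFTWARE\Test]'
    '"multistring"=hex(7):61,00,62,00,\'
    '  00,00,00,00'
    '"qword"=hex(b):fa,ad,df,fa,ad,df,fa,ad'
) -join "`n"

# Hypothetical helper: returns the hex payload of one value as a single line.
function Get-RegHexValue {
    param([string]$Text, [string]$Name, [string]$Type)

    # Same shape as the pattern above, with the type code made a parameter.
    $pattern = '"{0}"=hex\({1}\):(.+(?:\n  .+)*)' -f [regex]::Escape($Name), [regex]::Escape($Type)
    $m = [regex]::Match($Text, $pattern)
    if ($m.Success) {
        # Drop each "backslash + line break + indentation" and any stray \r.
        ($m.Groups[1].Value -replace '\\\r?\n\s*', '').Trim()
    }
}

$qword       = Get-RegHexValue -Text $sample -Name 'qword' -Type 'b'
$multistring = Get-RegHexValue -Text $sample -Name 'multistring' -Type '7'
```

Because the pattern does not require a trailing \n, the last value in the text is matched even when the file does not end with a newline.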

In your regex you use (?s), a modifier that makes the dot match any character, including newlines. So .+ will match all the way to the end of the text rather than stopping at the end of the line.
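The effect can be seen in a small experiment. The sample text and variable names below are made up for illustration:

```powershell
# Hypothetical three-value sample, joined with LF and no trailing newline.
$sample = @(
    '"a"=hex(7):61,00,\'
    '  00,00'
    '"b"=dword:00000001'
    '"c"=dword:00000002'
) -join "`n"

# With (?s) the dot also matches newlines: the greedy .+ first runs to the end
# of the text, then backtracks only as far as the last \n.
$greedy = [regex]::Match($sample, '(?s)"a"=hex\(7\):(.+)\n').Groups[1].Value

# Without (?s) the dot stops at the first newline, so only the first line of
# the value is captured.
$firstLine = [regex]::Match($sample, '"a"=hex\(7\):(.+)\n').Groups[1].Value
```

Here $greedy also contains the "b" line, while $firstLine holds only the first line of the value, so neither variant alone captures exactly the multiline value.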

You could use a capturing group to capture the part after the colon. First match the part up to the colon using \"{0}\"=hex\(7\):

Then match what follows until the end of the line and use a negative lookahead to check that the next line does not start with a name between double quotes followed by an equals sign, like "qword"=. As long as that is the case, match that next line as well.

Your code could look like:

$regex = '\"{0}\"=hex\(7\):(.*(?:(?!\n"[^\n"]+"=)\n.*)*)' -f [RegEx]::Escape($key)

Explanation of the second part:

  • ( Capturing group which will hold your value
    • .* Match any character except a newline 0+ times
    • (?: Non capturing group
      • (?! Negative lookahead to assert what follows is not
        • \n"[^\n"]+"= Match \n and ", then 1+ times any character except \n or " (negated character class), then "=
      • )\n.* Close negative lookahead and match \n followed by any character except a newline 0+ times
    • )* Close non capturing group and repeat 0+ times
  • ) Close capturing group
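The pattern explained above can be tried against a shortened sample. The sample data and variable names are assumptions for illustration, and the final -replace step (joining the continuation lines) is an addition that is not part of the pattern itself:

```powershell
# Shortened, hypothetical sample, joined with LF and no trailing newline.
$sample = @(
    '[HKEY_LOCAL_MACHINE\SOFTWARE\Test]'
    '"multistring"=hex(7):61,00,62,00,\'
    '  00,00,00,00'
    '"qword"=hex(b):fa,ad,df,fa,ad,df,fa,ad'
) -join "`n"

# Keep consuming lines until the next line looks like "name"=
$key   = 'multistring'
$regex = '"{0}"=hex\(7\):(.*(?:(?!\n"[^\n"]+"=)\n.*)*)' -f [regex]::Escape($key)
$raw   = [regex]::Match($sample, $regex).Groups[1].Value

# Join the continuation lines into one comma-separated list.
$value = $raw -replace '\\\r?\n\s*', ''

# The same idea for the last value in the text, using hex\(b\) for a qword.
$qwordRegex = '"{0}"=hex\(b\):(.*(?:(?!\n"[^\n"]+"=)\n.*)*)' -f [regex]::Escape('qword')
$qword      = [regex]::Match($sample, $qwordRegex).Groups[1].Value
```

One limitation to keep in mind: the lookahead only stops at a line of the form "name"=, so in a file with several keys the match would continue through a blank line and the next [key] header. That case is not handled in this sketch.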

Example Pattern

\"multistring\"=hex\(7\):(.*(?:(?!\n"[^\n"]+"=)\n.*)*)

