Extract specific information from many xml files

I have a script that exports our users' Software and Run registry keys to XML files so that we can track suspicious files. I have several hundred XML files, each of which looks like this:

<MDetection>
   <SoftwareReg>
      <Soft1>HKCU\SOFTWARE\Adobe</Soft1>
      <Soft2>HKCU\SOFTWARE\Ask.com</Soft2>
      <Soft3>HKCU\SOFTWARE\Citrix</Soft3>
      <Soft4>HKCU\SOFTWARE\Google</Soft4>
      ...
   </SoftwareReg>
   <RunReg>
      <Run1>SOFTWARE\Microsoft\Windows\Currentversion\Run\Sidebar-->C:\Program Files\Windows Sidebar\sidebar.exe /autoRun</Run1> 
      <Run2>SOFTWARE\Microsoft\Windows\Currentversion\Run\WindowsWelcomeCenter-->rundll32.exe oobefldr.dll,ShowWelcomeCenter</Run2> 
      ...
   </RunReg>
   <Hostname>USERPC01</Hostname>
   <Username>JonesA</Username>
   <TimeGenerated>03/01/14 11:00</TimeGenerated>
</MDetection>

I would like to be able to search all the XML files for a particular SoftwareReg or RunReg key, and also to search for entries using a wildcard. For example, search for all 'Ask.com' keys, or for entries beginning with 'cr'.

The problem I am having is that I would like to know the Hostname and the Username that the corresponding reg key refers to. I am unable to extract these.

My PowerShell XML skills are not the strongest, so any assistance would be appreciated! I am currently using:

   $XmlData = Select-Xml -Path '\\Server\Share$\*.xml' -XPath '//MDetection' -ErrorAction 'silentlycontinue'
   $XmlData | %{$a += $_.Node.SoftwareReg.ChildNodes.'#text'}

I am using $a to temporarily store all the reg entries, but I cannot extract the hostname or username. I suspect there is an easier way of doing this than the one I am currently using!

Many thanks,

Try selecting the information into custom objects:

$XmlData | select @{
    n='Software';e={$_.Node.SoftwareReg.ChildNodes.'#text' | ? {$_ -ne $null}}
  },
  @{n='Hostname';e={$_.Node.Hostname.ToString()}},
  @{n='Username';e={$_.Node.Username.ToString()}}

The above creates objects where the property Software holds an array with the registry paths. For transforming the hierarchical data into tabular data you could try something like this:

$XmlData | % {
  $username = $_.Node.Username.ToString()
  $hostname = $_.Node.Hostname.ToString()

  $_.Node.SoftwareReg.ChildNodes.'#text' | ? { $_ -ne $null } |
    select @{n='Software';e={$_}},
      @{n='Hostname';e={$hostname}},
      @{n='Username';e={$username}}
}
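
Once the data is flattened like this, the wildcard search from the question becomes a filter on the Software property. A minimal sketch, assuming PowerShell 3.0 or later; the inline XML is a trimmed copy of the sample from the question standing in for one exported file, and the names $doc, $records, $askHits and $ciHits are only illustrative:

```powershell
# Trimmed sample standing in for one exported file (illustrative only).
[xml]$doc = @'
<MDetection>
   <SoftwareReg>
      <Soft1>HKCU\SOFTWARE\Adobe</Soft1>
      <Soft2>HKCU\SOFTWARE\Ask.com</Soft2>
      <Soft3>HKCU\SOFTWARE\Citrix</Soft3>
   </SoftwareReg>
   <Hostname>USERPC01</Hostname>
   <Username>JonesA</Username>
</MDetection>
'@

# One record per registry key, each carrying its hostname and username.
$records = foreach ($key in $doc.MDetection.SoftwareReg.ChildNodes) {
    [pscustomobject]@{
        Software = $key.InnerText
        Hostname = $doc.MDetection.Hostname
        Username = $doc.MDetection.Username
    }
}

# All keys containing 'Ask.com' (-like is case-insensitive).
$askHits = $records | Where-Object { $_.Software -like '*Ask.com*' }

# Keys whose last path segment begins with 'ci'; swap in 'cr*' or any other pattern.
$ciHits = $records | Where-Object { ($_.Software -split '\\')[-1] -like 'ci*' }

$askHits | Format-Table -AutoSize
```

The same two Where-Object filters should work unchanged on the output of the loop above, since it produces objects with the same three properties.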

I am having to do a similar sort of thing.

One method for doing so uses XSLT, which you can learn about at http://www.w3schools.com/xsl/default.asp (the tutorial should be enough for what you want). The language can be made to do quite complex things.
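
As a rough illustration of the XSLT route from PowerShell, the sketch below runs a small stylesheet through .NET's System.Xml.Xsl.XslCompiledTransform and emits one "Hostname,Username,Key" line per SoftwareReg entry. The stylesheet, the trimmed inline sample and the variable names are all assumptions made for the example; none of them come from the question or the answer:

```powershell
# Illustrative stylesheet: one CSV-style line per child of SoftwareReg.
$xslt = @'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/MDetection">
    <xsl:for-each select="SoftwareReg/*">
      <xsl:value-of select="concat(../../Hostname, ',', ../../Username, ',', ., '&#10;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
'@

# Trimmed sample standing in for one exported file.
$xml = @'
<MDetection>
   <SoftwareReg>
      <Soft1>HKCU\SOFTWARE\Adobe</Soft1>
      <Soft2>HKCU\SOFTWARE\Ask.com</Soft2>
   </SoftwareReg>
   <Hostname>USERPC01</Hostname>
   <Username>JonesA</Username>
</MDetection>
'@

# Compile the stylesheet from the string.
$transform = New-Object System.Xml.Xsl.XslCompiledTransform
$xsltReader = [System.Xml.XmlReader]::Create((New-Object System.IO.StringReader $xslt))
$transform.Load($xsltReader)

# Transform the sample into an in-memory text writer.
$xmlReader = [System.Xml.XmlReader]::Create((New-Object System.IO.StringReader $xml))
$xsltArgs = New-Object System.Xml.Xsl.XsltArgumentList
$writer = New-Object System.IO.StringWriter
$transform.Transform($xmlReader, $xsltArgs, $writer)

# Split the text output into lines and drop the empty trailing one.
$lines = $writer.ToString() -split "`n" | Where-Object { $_ }
$lines | Where-Object { $_ -like '*Ask.com*' }
```

For several hundred files you would presumably compile the stylesheet once and call Transform inside a loop over Get-ChildItem, collecting the lines as you go.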

Realizing that using regex to parse XML may be considered heresy:

$search = 'Ask.com'
$Rsearch = [regex]::Escape($search)

$regex=@"
(?ms)<MDetection>
   <SoftwareReg>.+?$Rsearch.+?
   </RunReg>
   <Hostname>(.+?)</Hostname>
   <Username>(.+?)</Username>
   .+
"@


(Get-Content file.xml -Raw) -match $regex > $null
$matches


Name                           Value                                                                 
----                           -----                                                                 
2                              JonesA                                                                
1                              USERPC01                                                              
0                              <MDetection>...                                                       
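
The here-string pattern above only matches when the file's indentation and line endings are exactly those typed into the pattern. A slightly more tolerant variant, offered as a sketch rather than a drop-in replacement, uses \s* between the elements and named capture groups; the inline XML is a trimmed stand-in for one file, and $pattern and $hit are illustrative names:

```powershell
# Trimmed sample standing in for the contents of one file (Get-Content -Raw).
$xml = @'
<MDetection>
   <SoftwareReg>
      <Soft1>HKCU\SOFTWARE\Adobe</Soft1>
      <Soft2>HKCU\SOFTWARE\Ask.com</Soft2>
   </SoftwareReg>
   <Hostname>USERPC01</Hostname>
   <Username>JonesA</Username>
</MDetection>
'@

$search = 'Ask.com'

# (?s) lets '.' cross line breaks; \s* absorbs whatever whitespace separates the tags.
$pattern = '(?s)<SoftwareReg>.*?' + [regex]::Escape($search) +
           '.*?<Hostname>(?<Hostname>.+?)</Hostname>\s*<Username>(?<Username>.+?)</Username>'

$hit = $null
if ($xml -match $pattern) {
    # Named groups are available as keys of the automatic $Matches table.
    $hit = [pscustomobject]@{
        Search   = $search
        Hostname = $Matches['Hostname']
        Username = $Matches['Username']
    }
}
$hit
```

Like the original pattern, this matches the search term anywhere between the opening SoftwareReg tag and the Hostname element, so a hit inside RunReg is reported as well.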


 