
Powershell xml <title> string

I have a directory of XML files and I want to extract the title from each one. I am very new to PowerShell and have tried the following:

Get-ChildItem -recurse | Get-Content | Select-String -pattern "<title>" -list | Set-Content protid_output.txt

An example of the relevant part of the XML files: <title> protein name </title>

This outputs the title tag but not the actual title. How can I go through the directory and output the titles to one file?

If you're sure that each <title>this title</title> element sits entirely on a SINGLE line, then try:

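Get-ChildItem -Recurse -Filter *.xml | % {
    # Read the current file ($_), keep only the lines containing <title>, then strip the tags
    ((Get-Content $_.FullName) -match "<title>" -replace '<title>' -replace '</title>').Trim()
} | Set-Content protid_output.txt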
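
The single-line case can also be handled with a regex capture group, which pulls out only the text between the tags instead of stripping the tags afterwards. This is a minimal sketch, not the answer's original method: the function name Get-TitleFromLine is hypothetical, and it still assumes the opening and closing tags are on the same line.

```powershell
# Hypothetical helper (name chosen for illustration only).
# Assumes each <title>...</title> pair sits on one line of the file.
function Get-TitleFromLine {
    param([Parameter(Mandatory)][string]$Path)

    # -AllMatches keeps every match on a line; Groups[1] is the text between the tags
    Select-String -LiteralPath $Path -Pattern '<title>(.*?)</title>' -AllMatches |
        ForEach-Object { $_.Matches } |
        ForEach-Object { $_.Groups[1].Value.Trim() }
}
```

It could then be used over the directory with something like `Get-ChildItem -Recurse -Filter *.xml | % { Get-TitleFromLine -Path $_.FullName } | Set-Content protid_output.txt`.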

If the files look more like this, with the title spread over several lines:

<?xml version="1.0" encoding="ISO-8859-1"?>
<example>
<title>
protein name
</title>
</example>

Then try parsing the file into an XML object first (easier to read), but avoid this on files of 10 MB or more, since the whole document is loaded into memory. Example:

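Get-ChildItem -Recurse -Filter *.xml | % {
    # Parse the current file into an XML object; use the full path so files in subfolders resolve
    $x = [xml](Get-Content $_.FullName)
    # "example" is the root element of the sample above; adjust it to match your files
    $x.example.title.Trim()
} | Set-Content protid_output.txt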
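
If the root element is not known in advance, or the title is nested deeper, an XPath query avoids spelling out the path to it. The following is a rough sketch using the built-in Select-Xml cmdlet. The function name Get-XmlTitle is hypothetical, and the query assumes the files do not declare a default XML namespace (with one, '//title' would match nothing).

```powershell
# Hypothetical helper (name chosen for illustration only).
# Assumes the XML has no default namespace, so the plain XPath '//title' matches.
function Get-XmlTitle {
    param([Parameter(Mandatory)][string]$Path)

    # '//title' selects every <title> element at any depth;
    # InnerText is the text content, and Trim() drops the surrounding newlines
    Select-Xml -LiteralPath $Path -XPath '//title' |
        ForEach-Object { $_.Node.InnerText.Trim() }
}
```

Used the same way as above: `Get-ChildItem -Recurse -Filter *.xml | % { Get-XmlTitle -Path $_.FullName } | Set-Content protid_output.txt`. Like the [xml] cast, this parses the whole file, so the caution about very large files still applies.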
