I'm battling a regex (for MOBI creation). I have two files: one with XML, the other an HTML table of contents.
The important parts of the XML:
<navPoint id="_NeedsHTMLid" playOrder="40">
<navLabel><text>Needs anchor text from link.)</text></navLabel>
...
The HTML TOC, of course, looks like: schema.org Article Mark-up
Hours and hours... worked with TextPad forever. Saw remarks here, now I'm using Notepad++... some of the regex results are different (not that I had it working anyway). #_[\b(\w\b] was returning the ID: now? Not so much!
Does anyone know how to yank both the ID and the anchor text out of these? I'd be so grateful.
#_[\b(\w\b] is not a valid regex. Try _([^"]+)\b.
Edited: try [^"] in place of \w.
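As a quick check of the suggested pattern, here is a minimal Python sketch run against the navPoint line from the question. Python's `re` module is only a stand-in here (an assumption: Notepad++ uses a different regex engine, so edge cases may differ), and the variable names are illustrative.

```python
import re

# Sample line copied from the question's XML.
line = '<navPoint id="_NeedsHTMLid" playOrder="40">'

# Suggested pattern: an underscore, then everything up to (but not
# including) the next double quote, ending on a word boundary.
pattern = re.compile(r'_([^"]+)\b')

match = pattern.search(line)

# Group 1 holds the id without the leading underscore;
# match.group(0) would include the underscore as well.
nav_id = match.group(1) if match else None
print(nav_id)
```

If the leading underscore is part of the id you need, use the whole match rather than group 1.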
You can use this to get the id and the anchor text at the same time:
_(\w+)\b|([a-zA-Z\s.]+[)]+)
If you want to match the ids and the text, go to the Search > Find menu (shortcut Ctrl+F) and do the following:
Find what:
id="([a-zA-Z0-9\-\:\_\.]+)"|<text>(.+?)<\/text>
Select the "Regular expression" radio button.
Then press "Find All in Current Document".
You can test it with your example at regex101.
Here's a StackOverflow post about valid id names.
I didn't provide a search-and-replace solution, since you didn't mention anything about a replacement.
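If you would rather pull the values out with a script than through the Find dialog, the same alternation can be sketched in Python. This is only an illustration under assumptions: the sample string is the two lines from the question, the names `ids` and `texts` are made up for the example, and the character class drops the escapes that are redundant inside brackets.

```python
import re

# The two relevant lines from the question's XML.
sample = '''<navPoint id="_NeedsHTMLid" playOrder="40">
<navLabel><text>Needs anchor text from link.)</text></navLabel>'''

# Group 1 captures an id attribute value; group 2 captures the contents
# of a <text> element. Only one of the two is set for any given match.
pattern = re.compile(r'id="([a-zA-Z0-9\-:_.]+)"|<text>(.+?)</text>')

ids, texts = [], []
for m in pattern.finditer(sample):
    if m.group(1) is not None:
        ids.append(m.group(1))
    else:
        texts.append(m.group(2))

print(ids)
print(texts)
```

Because each match fills only one group, the ids and texts come back as two separate lists in document order; pairing them up relies on every navPoint having exactly one text element.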