
RegEx help to replace partial url (Notepad++)

I have text similar to this throughout my file:

<td>
[<a href="/abc123/handouts/files/directory1/somename.pdf" target="_blank">Slides</a> ]  [ [<a href="/abc123/handouts/files/directory2/somename2.pdf" target="_blank">Handout</a> ]</td>

<td>
[<a href="/abc123/handouts/files/directory3/somename343.pdf" target="_blank">Slides</a> ]  [ <a href="/abc123/handouts/files/directory5/somename2324.pdf" target="_blank">Handout</a> ]
</td>

Everything after the "/abc123/handouts/files/" text will be different (directory and .pdf name).

I can't seem to figure out how to replace JUST the "directory3/somename343.pdf" portion with, say, "XXXXXXX".

My attempts have either matched nothing or removed the rest of the line after the first match.

My attempt:

Search For:

<a href="/abc123/handouts/files/.*."

Replace with:

<a href="/abc123/handouts/files/xxxxxxx"

leaves me with this:

[ <a href="/abc123/handouts/files/xxxxxxx">Handout</a> ]

completely removing the first link on the line.

What am I doing wrong, and more importantly, how is it done correctly?

Thanks!

Your regular expression is greedy (the * is not followed by a ?), so .* matches as much as it can, running past the .pdf to the last quote on the line. To make it non-greedy, add a ? after the *:

<a href="\/abc123\/handouts\/files\/.*?"

This matches from <a href=" up to and including the first closing quote, so the replacement has to put that closing quote back. Then replace with:

<a href="/abc123/handouts/files/xxxxxxx"

Here's regex101 for you to see: https://regex101.com/r/oY8pI8/2
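
To see the difference outside Notepad++, here is a minimal JavaScript sketch. It assumes a sample line modelled on the one in the question, and the variable names are only illustrative. It runs the greedy and non-greedy patterns on the same line, and also tries a negated character class ([^"]*), a common alternative that cannot run past the closing quote.

```javascript
// Sample line modelled on the question: two links on one line.
const line =
  '[<a href="/abc123/handouts/files/directory3/somename343.pdf" target="_blank">Slides</a> ]' +
  '  [ <a href="/abc123/handouts/files/directory5/somename2324.pdf" target="_blank">Handout</a> ]';

const replacement = '<a href="/abc123/handouts/files/xxxxxxx"';

// Greedy (the pattern from the question): .* runs to the LAST quote on the line.
const greedy = line.replace(/<a href="\/abc123\/handouts\/files\/.*."/g, replacement);

// Non-greedy: .*? stops at the FIRST closing quote after the prefix.
const lazy = line.replace(/<a href="\/abc123\/handouts\/files\/.*?"/g, replacement);

// Negated class: [^"]* can never cross a quote, so it gives the same result here.
const negated = line.replace(/<a href="\/abc123\/handouts\/files\/[^"]*"/g, replacement);

console.log(greedy);  // first link swallowed, only "Handout" survives
console.log(lazy);    // both links kept, both paths replaced
console.log(negated); // same as the non-greedy result
```

The greedy result reproduces the symptom from the question: everything between the first href and the last quote on the line is consumed by a single match.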

JavaScript version for string replacement:

 // Group 1 is the fixed prefix, group 2 the variable directory and file name.
 var re = /"(\/abc123\/handouts\/files\/)((?:[a-zA-Z0-9]*\/)*[a-zA-Z0-9]*\.[a-zA-Z]{3,4})"/;
 var str = '"/abc123/handouts/files/directory1/somename.pdf"';
 // Keep the prefix ($1) and swap the rest for a placeholder.
 var newstr = str.replace(re, '"$1XXXXX"');
 alert(newstr); // "/abc123/handouts/files/XXXXX"

In essence, the pattern above is made up of three parts.

Initial grab (the fixed prefix, captured as group 1):

"(\/abc123\/handouts\/files\/)

Non-capturing group to match any further folders:

(?:[a-zA-Z0-9]*\/)*

File name and extension (the dot is escaped so that it only matches a literal dot, and digits are allowed in the file name):

[a-zA-Z0-9]*\.[a-zA-Z]{3,4}

Note that the further folders and the file name are wrapped together in a single group (group 2):

((?:[a-zA-Z0-9]*\/)*[a-zA-Z0-9]*\.[a-zA-Z]{3,4})

Captures will thus be ordered as follows:

0 - Entire match
1 - Initial folder match
2 - Trailing directory and file name match
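
The capture order can be checked directly. The following is a sketch only, using a version of the pattern that escapes the dot and allows digits in the file name. The sample strings and variable names are assumptions for illustration. It prints each capture, then applies the same pattern with the g flag so that every link in a longer string is rewritten.

```javascript
// Version of the pattern with the dot escaped and digits allowed in the file name.
const re = /"(\/abc123\/handouts\/files\/)((?:[a-zA-Z0-9]*\/)*[a-zA-Z0-9]*\.[a-zA-Z]{3,4})"/;

const str = '"/abc123/handouts/files/directory3/somename343.pdf"';
const m = re.exec(str);

console.log(m[0]); // entire match, including the surrounding quotes
console.log(m[1]); // fixed prefix: /abc123/handouts/files/
console.log(m[2]); // directory and file name: directory3/somename343.pdf

// Rebuild the pattern with the g flag to replace every link, keeping the prefix ($1).
const html =
  '<a href="/abc123/handouts/files/directory1/somename.pdf">Slides</a> ' +
  '<a href="/abc123/handouts/files/directory2/somename2.pdf">Handout</a>';
const replaced = html.replace(new RegExp(re.source, 'g'), '"$1XXXXX"');

console.log(replaced);
```

Without the g flag, replace() only rewrites the first match, which is easy to miss when a line contains two links.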
