
How do I find & replace a url using Notepad++ and regular expressions?

I've been searching for a while now and have yet to find out how to do exactly what I'm trying to do.

I need to search a folder and locate files that contain an href attribute with a specific base URL. I have accomplished this with the following regular expression:

(href="(https:\/\/www\.mytesturl\.com))

After locating the files and locations where this URL is used, I need to do a replace on the located text. This is where my issue is. The href attribute will definitely contain the text:

https://www.mytesturl.com

Additionally, it may contain any manner of query string values or "/" paths after this.

Ultimately, my find/replace operation needs to yield the result:

href='<%= Request.Url.Scheme + "://" + Request.Url.Host + "<extra>" %>'

Where <extra> is everything from the end of ".com" to the end of the initial href value in quotes.

So

https://www.mytesturl.com?somevar=somevalue&secondvar=secondvalue

Would be:

href='<%= Request.Url.Scheme + "://" + Request.Url.Host + "?somevar=somevalue&secondvar=secondvalue" %>'

and

https://www.mytesturl.com/otherpath?somevar=somevalue&secondvar=secondvalue

Would be:

href='<%= Request.Url.Scheme + "://" + Request.Url.Host + "/otherpath?somevar=somevalue&secondvar=secondvalue" %>'

Can Notepad++ do a regex find/replace such as this?

You already have several problems, and they all stem from using regexes when you shouldn't. Write yourself a little PHP script to iterate through the directory, parse each HTML file, and navigate the DOM to find the <a> tags and inspect their href attributes... then rewrite them (for that you can use a regex!).
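To make the parser-based approach concrete, here is a minimal sketch. It is written in Python rather than the PHP suggested above, purely for illustration, and it only locates the matching href values; it does not rewrite anything. The names HrefCollector, find_hrefs and scan_folder are hypothetical, and the "*.html" file pattern is an assumption about how the files are named.

```python
from html.parser import HTMLParser
from pathlib import Path

# Base URL from the question.
BASE_URL = "https://www.mytesturl.com"


class HrefCollector(HTMLParser):
    """Collects (line number, href) pairs for <a> tags pointing at base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # The parser lower-cases tag and attribute names for us.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(self.base_url):
                self.matches.append((self.getpos()[0], value))


def find_hrefs(html, base_url=BASE_URL):
    """Return the matching hrefs found in one HTML document."""
    collector = HrefCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.matches


def scan_folder(folder, pattern="*.html"):
    """Map each file under folder that contains matching hrefs to its matches."""
    results = {}
    for path in Path(folder).rglob(pattern):
        text = path.read_text(encoding="utf-8", errors="replace")
        found = find_hrefs(text)
        if found:
            results[str(path)] = found
    return results
```

Because the attributes are read by a parser, single-quoted values and unusual spacing are picked up as well, which is where a plain text search tends to miss occurrences.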

If you're okay with having false negatives, though (i.e. some occurrences not found, such as an href written with single quotes or with spaces around the =), then yes you can do this … using captures and backreferences.

So, you could search for:

href="https:\/\/www\.mytesturl\.com([^"]*)"
//                                 ^^^^^^^
//                             optional capture
//                         any characters until '"'

and replace it with:

href='<%= Request.Url.Scheme + "://" + Request.Url.Host + "\1" %>'
//                                                         ^^
//                                                 contents of capture
//                                               (which may be nothing!)
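If you want to sanity-check the pattern and replacement before running them across a whole folder, the same capture-and-backreference idea can be tried out in a few lines of Python. This is only a sketch: Python's re module is not the engine Notepad++ uses, although \1 has the same meaning in both, and the rewrite_hrefs name is hypothetical.

```python
import re

# Same pattern as above; the slashes need no escaping here.
PATTERN = re.compile(r'href="https://www\.mytesturl\.com([^"]*)"')

# \1 is the captured remainder of the URL, which may be empty.
REPLACEMENT = r"""href='<%= Request.Url.Scheme + "://" + Request.Url.Host + "\1" %>'"""


def rewrite_hrefs(html):
    """Replace every matching double-quoted href in the given text."""
    return PATTERN.sub(REPLACEMENT, html)


if __name__ == "__main__":
    samples = [
        '<a href="https://www.mytesturl.com?somevar=somevalue&secondvar=secondvalue">a</a>',
        '<a href="https://www.mytesturl.com/otherpath?somevar=somevalue&secondvar=secondvalue">b</a>',
        '<a href="https://www.mytesturl.com">c</a>',
        # A false negative: single-quoted attributes are not matched.
        "<a href='https://www.mytesturl.com/missed'>d</a>",
    ]
    for sample in samples:
        print(rewrite_hrefs(sample))
```

The last sample comes back unchanged, which is the kind of missed occurrence mentioned above.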

As an aside, if these pages are written in VBScript or VB.NET, you really should be using &, not +, for string concatenation; if they are written in C#, + is the right operator.

Furthermore, the Notepad++ manual (press F1) on the "Find" topic explains which regular expression engine the application uses and links to its documentation, which is a pretty handy reference for this kind of work. Older releases used the Scintilla engine; Notepad++ 6.0 and later use the Boost regex library, and \1 works as a backreference in the replacement in both. Always read the documentation.
