
kate - regex - find and replace portion of URL from href to ">

I have many links in .html and .txt files that I would like to modify. I mostly use Kate as my text editor, so I've tagged this question with kate. Below is a sample of the links:

 <li> <a href="http://sk1project.org/"> sK1 </a> is an open source vector graphics editor similar to CorelDRAW, Adobe Illustrator, or Freehand. First of all sK1 is oriented for PostScript processing. UniConvertor is a universal vector graphics translator. It uses sK1 engine to convert one format to another. Development of the import/export modules for this program goes through different stages, quality and feature coverage are different among formats. </li>


 <li> <a href="http://tango.freedesktop.org/Tango_Desktop_Project"> The Tango Desktop Project </a> exists to help create a consistent graphical user interface experience for free and Open Source software. While the look and feel of an application is determined by many individual components, some organization is necessary in order to unify the appearance and structure of individual icon sets used within those components. The Tango Desktop Project defines an icon style guideline to which artists and designers can adhere. A sample implementation of the style is available as an icon theme based upon a standardized icon naming specification. In addition, the project provides transitional utilities to assist in creating icon themes for existing desktop environments, such as GNOME and KDE. </li>

I found the Stack Overflow question "Regular expression to extract URL from an HTML link", so I know how to capture the text from href to "> using href=[\'"]?([^\'" >]+">) , but I don't know how to keep the text from href up to the " just before the > and then add the following text: ' rel="nofollow noopener noreferrer">'.

The end result should look like this:

 <li> <a href="http://sk1project.org/" rel="nofollow noopener noreferrer"> sK1 </a> is an open source vector graphics editor similar to CorelDRAW, Adobe Illustrator, or Freehand. First of all sK1 is oriented for PostScript processing. UniConvertor is a universal vector graphics translator. It uses sK1 engine to convert one format to another. Development of the import/export modules for this program goes through different stages, quality and feature coverage are different among formats. </li>


 <li> <a href="http://tango.freedesktop.org/Tango_Desktop_Project" rel="nofollow noopener noreferrer"> The Tango Desktop Project </a> exists to help create a consistent graphical user interface experience for free and Open Source software. While the look and feel of an application is determined by many individual components, some organization is necessary in order to unify the appearance and structure of individual icon sets used within those components. The Tango Desktop Project defines an icon style guideline to which artists and designers can adhere. A sample implementation of the style is available as an icon theme based upon a standardized icon naming specification. In addition, the project provides transitional utilities to assist in creating icon themes for existing desktop environments, such as GNOME and KDE. </li>

How can this be done with regex in Kate?

Thank you.

Parsing HTML with regex isn't recommended, but since you are working in the Kate editor, you can capture the <a tag together with its href attribute using this regex:

(<a\s+.*?href=(['"]?)\S*\2)

Then replace it with this:

\1 rel="nofollow noopener noreferrer"

I've never used the Kate editor, so I'm not sure whether \1 or $1 is the right way to refer to the first capture group in its replacement field.
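If you would rather check the pattern outside the editor first, here is a minimal Python sketch of the same find-and-replace. It is only an illustration, not Kate's own behaviour: the names LINK_RE, REL and add_rel are made up for this example, and Python's re module stands in for Kate's regex engine, which may differ in details.

```python
import re

# Same pattern as above: group 1 is everything from "<a" through the closing
# quote of the href value; group 2 is the (optional) quote character itself.
LINK_RE = re.compile(r"""(<a\s+.*?href=(['"]?)\S*\2)""")

# Text to append right after the href attribute (hypothetical constant name).
REL = ' rel="nofollow noopener noreferrer"'


def add_rel(html: str) -> str:
    """Append the rel attribute after the href of every <a ... href=...> match."""
    # Equivalent to replacing with: \1 rel="nofollow noopener noreferrer"
    return LINK_RE.sub(lambda m: m.group(1) + REL, html)


sample = '<li> <a href="http://sk1project.org/"> sK1 </a> is an editor. </li>'
print(add_rel(sample))
```

Note that the pattern does not check whether a link already has a rel attribute, so running it twice over the same file would add the attribute twice. It also assumes the href value contains no whitespace.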

Let me know if this works.

