
Trying to replace underscore _ with a dash - within an href tag

I'm trying to replace underscores with dashes within an href attribute that is in a large amount of text coming from a database:

Existing Text:

Hello, my name is <a href="http://example.com/joe_smith">joe smith</a> and I  
eat pizza with my friend <a href="http://example.com/john_doe">john doe</a>.

Output:

Hello, my name is <a href="http://example.com/joe-smith">joe smith</a> and I 
eat pizza with my friend <a href="http://example.com/john-doe">john doe</a>.

Since the text currently lives in a MySQL database, I assume it would be faster to do this with an SQL statement, but if that's not possible I would like to do it with a PHP regex.

I do NOT want to replace underscores that are in the regular text for one reason or another. Only the ones that are within the href.

Before version 8.0, MySQL's regexes are for searching only; they do not support replacement at all (REGEXP_REPLACE() only arrived in MySQL 8.0). You can use them to find the records that need fixing, but you're then limited to basic string operations within MySQL for actually changing the records.

You'd be better off pulling the matched records into PHP and making the changes there. That of course brings up the use of regexes on HTML... don't do it. Use PHP's DOM instead for the actual manipulations.
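A minimal sketch of the DOM approach, assuming each record holds an HTML fragment rather than a full page. The function name dashifyHrefs and the wrapper <div> are illustrative choices, not anything from the question; fetching and saving the rows is left out.

```php
<?php
// Hypothetical helper: replace "_" with "-" inside the href of every <a>
// in an HTML fragment, leaving the visible text untouched.
function dashifyHrefs(string $html): string
{
    $doc = new DOMDocument();

    // Real-world fragments are rarely valid documents, so collect parser
    // warnings instead of printing them.
    libxml_use_internal_errors(true);
    // The XML declaration hints UTF-8 to libxml; the <div> wrapper lets us
    // recover just the fragment after parsing.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();

    foreach ($doc->getElementsByTagName('a') as $a) {
        if ($a->hasAttribute('href')) {
            $a->setAttribute('href', str_replace('_', '-', $a->getAttribute('href')));
        }
    }

    // The first <div> in document order is our wrapper; serialize its children.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Called on each matched record, this changes only the href values. Be aware that the serializer may normalize the markup slightly (for example, attribute quoting), so compare a few rows before writing anything back.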

You can do it with a single UPDATE query. I've prepared a test table and an update query to demonstrate. To use it on your own table, just change the table name from TestTable to your table's name, and change "Field" to the name of the column you want to update. Note that the query looks for href values in single quotes, as in the test data; adjust the quote character if your links use double quotes.

If a field contains more than one a href link, you need to run the update query several times, because each run handles one link per row. The first query finds the maximum number of links in any row (occurrence_count). Run the update query that many times, then run the final cleanup query once to remove the temporary markers the update leaves behind.

-- find the maximum number of links per row in your table

SELECT max(cast((LENGTH(Field) - LENGTH(REPLACE(Field, '<a href', ''))) / 7 as unsigned)) AS occurrence_count 
FROM TestTable;

-- Run this update occurrence_count times to replace all a href links.

update TestTable 
set Field = replace
                (
                   @b:=replace
                   (
                     -- step 1: in the first unmarked link (from <a href=' to </a>), turn "_" into "-"
                     -- (this span includes the link text, so underscores there change too)
                     @a:=replace(Field
                      , substring(Field, Instr(Field, "<a href='"), Instr(Field, "</a>")-Instr(Field, "<a href='")+4)
                      , replace(substring(Field, Instr(Field, "<a href='"), Instr(Field, "</a>")-Instr(Field, "<a href='")+4), "_", "-")
                      )
                     -- step 2: mark that link's opening tag with a star so the next run skips it
                     , substring(@a, Instr(@a, "<a href='"), Instr(@a, "</a>")-Instr(@a, "<a href='")+4)
                     , replace(substring(@a, Instr(@a, "<a href='"), Instr(@a, "</a>")-Instr(@a, "<a href='")+4), "<a href=", "<*a href=")
                   )
                 -- step 3: mark the matching closing tag the same way
                 , substring(@b, Instr(@b, "<*a href='"), Instr(@b, "</a>")-Instr(@b, "<*a href='")+4)
                 , replace(substring(@b, Instr(@b, "<*a href='"), Instr(@b, "</a>")-Instr(@b, "<*a href='")+4), "</a>", "</*a>")
                )
;

-- Run this once after all the updates have finished, to remove the stars from the a href links.

update TestTable set Field = replace(replace(Field, "<*a href", "<a href"), "</*a>", "</a>");

-- check your table

select * from TestTable;

TEST TABLE

CREATE TABLE `TestTable` (
    `id` INT(11) NOT NULL AUTO_INCREMENT,
    `Field` VARCHAR(255) NOT NULL DEFAULT '',
    PRIMARY KEY (`id`)
)
COLLATE='latin1_swedish_ci'
ENGINE=MyISAM
ROW_FORMAT=DEFAULT;

TEST DATA

Insert into TestTable (Field) values ("Hello, my name is <a href='http://example.com/joe_smith'>joe smith</a> and I eat pizza with my friend <a href='http://example.com/john_doe'>john doe</a>");
Insert into TestTable (Field) values ("Hello, my name is <a href='http://example.com/joe_smith'>joe smith</a> and I eat pizza with my friend <a href='http://example.com/john_doe'>john doe</a> my friend <a href='http://example.com/john_doe'>jane doe</a>");

Search for:

(<a href=".+?)_(.+?">)

Replace with:

$1-$2
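One thing to watch with this search/replace pair: each match consumes a whole opening tag, so a single pass changes at most one underscore per link, and the lazy .+? can run past a link that has no underscore and hit one in the surrounding text. A hedged alternative in PHP is to match the href value itself and replace every underscore inside it in a callback. The function name dashifyHrefsRegex and the exact pattern below are illustrative assumptions, and a regex like this remains a sketch that will not cope with every HTML edge case.

```php
<?php
// Hypothetical helper: replace "_" with "-" only inside the quoted href
// value of <a> tags, in a single pass.
function dashifyHrefsRegex(string $html): string
{
    // Group 1: "<a", any earlier attributes, and "href=".
    // Group 2: the opening quote (single or double).
    // Group 3: the URL, up to the matching closing quote (\2).
    $pattern = '/(<a\s+(?:[^>]*?\s)?href\s*=\s*)(["\'])(.*?)\2/is';

    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            // Rebuild the attribute with every underscore in the URL replaced.
            return $m[1] . $m[2] . str_replace('_', '-', $m[3]) . $m[2];
        },
        $html
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}
```

Because the callback only ever sees the captured URL, underscores in the link text and in the surrounding prose are left alone, and links quoted with either ' or " are handled.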

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 