
Regular Expressions nth match

I realise that this may seem like a stupid request but I'm going to ask anyway.

I wish to use a regular expression to find every nth comma in a list of numbers, so that a list such as:

    88574,93243,129659,135504,136357,141052,141619,141619,142195,144622,144946,...

could then have every 4th comma ',' replaced by ',\r\n', thereby turning the list of numbers into a grid of n rows with 4 numbers each.

Finding all commas was simple:

    [^0-9]

which, in the above list, matches every comma. How can I now group these matches to exclude three in every four?

I could do this with PHP preg_matches, but I am using this with a MySQL regular expression replacement function, so I would prefer a pure regex answer (if one exists).

The function that I'm using in MySQL is below:

    DROP FUNCTION IF EXISTS `regex_replace`$$

    CREATE DEFINER=`root`@`127.0.0.1`
    FUNCTION `regex_replace`(pattern VARCHAR(1000), replacement VARCHAR(1000), original TEXT)
    RETURNS VARCHAR(1000) CHARSET latin1
    DETERMINISTIC
    BEGIN
      DECLARE temp VARCHAR(1000);
      DECLARE ch VARCHAR(1);
      DECLARE i INT;
      SET i = 1;
      SET temp = '';
      IF original REGEXP pattern THEN
        loop_label: LOOP
          IF i > CHAR_LENGTH(original) THEN
            LEAVE loop_label;
          END IF;
          SET ch = SUBSTRING(original, i, 1);
          IF NOT ch REGEXP pattern THEN
            SET temp = CONCAT(temp, ch);
          ELSE
            SET temp = CONCAT(temp, replacement);
          END IF;
          SET i = i + 1;
        END LOOP;
      END IF;
      RETURN temp;
    END$$

As you can see, the regex itself does not have to handle complex matching. Therefore, a regular expression capable of selecting the nth comma would be sufficient.
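One thing worth noting about the function above: it tests each character of the input against the pattern on its own, so a pattern can never see the neighbouring commas it would need to count. The following is a rough Python emulation of that loop (the Python function name and the sample strings are mine, purely for illustration; it ignores the VARCHAR(1000) truncation of the real function):

```python
import re

def regex_replace(pattern: str, replacement: str, original: str) -> str:
    """Rough emulation of the character-by-character MySQL function."""
    # Like the SQL version, return '' when the pattern matches nowhere.
    if not re.search(pattern, original):
        return ""
    out = []
    for ch in original:
        # Each character is tested in isolation, with no surrounding context.
        out.append(replacement if re.search(pattern, ch) else ch)
    return "".join(out)

sample = "1,2,3,4,5,6,7,8,9"

# A single-character class works: every comma is replaced.
print(regex_replace("[^0-9]", ";", sample))

# A pattern that needs four commas of context matches the whole string,
# but never a single character, so nothing is replaced.
print(regex_replace("([^,]*,){4}", ",\r\n", sample))
```

If this emulation is faithful, no pattern passed to this particular function can pick out "every fourth" comma, which is why a real replace function (such as a PCRE-based one) is needed.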

I hope this clarifies the problem.

Fin

EDIT:

I have added the lib_mysqludf_preg library to the server, which contains the preg_replace function. This is a PCRE implementation for MySQL and should work once I have a regex that selects every fourth ',' so it can be replaced with ',\r\n'.

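    $result = preg_replace('/(?:[^,]*,){4}/', "$0\r\n", $subject);

This matches four comma-delimited values in a row (I'm assuming that you won't have commas inside of strings within a group) and adds a CRLF after them. The replacement is double-quoted so that PHP turns \r\n into an actual CRLF; in a single-quoted string those escapes would be inserted as literal backslash sequences.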
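To try the pattern outside PHP or MySQL, here is a small Python sketch of the same idea. It uses Python's re module rather than PCRE (the syntax of this particular pattern is the same in both), and the variable names are just for illustration; the sample is the list from the question without its trailing "...".

```python
import re

# Sample taken from the question; the trailing "..." is omitted.
numbers = ("88574,93243,129659,135504,136357,141052,"
           "141619,141619,142195,144622,144946")

# (?:[^,]*,){4} matches four "value," groups in a row; \g<0> re-inserts the
# whole match, and "\r\n" appends a real CRLF after every fourth comma.
grid = re.sub(r"(?:[^,]*,){4}", "\\g<0>\r\n", numbers)

print(grid)
```

Any values left over at the end that do not make up a full group of four commas are simply left as they are.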

[EDIT] The above is a PHP based solution

For a pure MySQL solution, install lib_mysqludf_preg and use:

    SELECT preg_replace('/(?:[^,]*,){4}/', '${0}\r\n', `fieldname`) as 'new_layout' from `tablename`;

Many thanks to all who contributed.

If you want to match every comma, then the more straightforward pattern ',' will work as well.

For matching every fourth comma, if MySQL supports look-behind, perhaps you could use (?<=(^|\r\n)(\d+,){3}\d+), . That assumes that each replacement is performed before the next match is made, however. Otherwise perhaps (?<=^((\d+,){4})*(\d+,){3}\d+), would work. Bear in mind that both look-behinds are variable-length, which PCRE has traditionally not accepted, so they need an engine that allows variable-length look-behind (.NET, for example).
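Python's built-in re module also rejects variable-width look-behind, so the patterns above cannot be tried there directly. As a different approach (not the look-behind method itself), you can match every comma and let a replacement callback count them. A minimal sketch, where the function name and its parameters are hypothetical:

```python
import re
from itertools import count

def break_every_nth_comma(text: str, n: int = 4, newline: str = "\r\n") -> str:
    """Append `newline` after every n-th comma in `text`."""
    counter = count(1)  # running index of the comma being replaced

    def repl(match: re.Match) -> str:
        # Only every n-th comma gets the line break appended.
        return "," + newline if next(counter) % n == 0 else ","

    return re.sub(",", repl, text)

print(break_every_nth_comma("1,2,3,4,5,6,7,8,9"))
```

This only helps where the replacement can be a function, so it is not a pure-regex answer, but it avoids look-behind entirely.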
