Regular Expressions nth match
I realise that this may seem like a stupid request but I'm going to ask anyway.
I wish to use a regular expression to find every nth comma in a list of numbers. For example, the list
88574,93243,129659,135504,136357,141052,141619,141619,142195,144622,144946,...
could then have every 4th comma ',' replaced by ',\r\n', thereby turning the list of numbers into a grid of 4 columns by n rows.
Finding all commas was simple:
[^0-9]
which, from the above list, will find all commas. How can I now group these matches to exclude three in every four?
I could do this with PHP's preg_* functions, but I am using this with a MySQL regular expression replacement function, so I would prefer a pure regex answer (if one exists).
The function that I'm using in MySQL is below:
DROP FUNCTION IF EXISTS `regex_replace`$$
CREATE DEFINER=`root`@`127.0.0.1`
FUNCTION `regex_replace`(pattern VARCHAR(1000), replacement VARCHAR(1000), original TEXT)
RETURNS VARCHAR(1000) CHARSET latin1
DETERMINISTIC
BEGIN
    DECLARE temp VARCHAR(1000);
    DECLARE ch VARCHAR(1);
    DECLARE i INT;
    SET i = 1;
    SET temp = '';
    IF original REGEXP pattern THEN
        loop_label: LOOP
            IF i > CHAR_LENGTH(original) THEN
                LEAVE loop_label;
            END IF;
            SET ch = SUBSTRING(original, i, 1);
            IF NOT ch REGEXP pattern THEN
                SET temp = CONCAT(temp, ch);
            ELSE
                SET temp = CONCAT(temp, replacement);
            END IF;
            SET i = i + 1;
        END LOOP;
    END IF;
    RETURN temp;
END$$
As you can see, the regex itself does not have to handle complex matching. Therefore a regular expression which is capable of selecting the nth comma would be sufficient.
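One caveat may be worth checking here: the loop in the function above tests the pattern against one character at a time, so a pattern that needs surrounding context (such as "the fourth comma") may never match any single character. The following is a rough Python sketch of that same loop, not the MySQL function itself; the name `regex_replace_per_char` is made up for illustration.

```python
import re

def regex_replace_per_char(pattern: str, replacement: str, original: str) -> str:
    """Hypothetical Python mirror of the stored function's per-character loop."""
    # Like the stored function, return '' when the pattern matches nowhere.
    if not re.search(pattern, original):
        return ''
    out = []
    for ch in original:
        # The pattern only ever sees a single character, never its context.
        out.append(replacement if re.search(pattern, ch) else ch)
    return ''.join(out)

# A single-character pattern behaves as expected: every comma is replaced.
print(repr(regex_replace_per_char(r'[^0-9]', ',\r\n', '1,2,3')))

# A context-dependent pattern matches the whole string but no single
# character, so the text comes back unchanged.
print(repr(regex_replace_per_char(r'(?:[^,]*,){4}', 'X', '1,2,3,4,5')))
```

If this reading of the loop is right, a function that applies the pattern to the whole string is needed for an "every nth comma" pattern to take effect.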
I hope this clarifies the problem.
Fin
EDIT:
I have added the lib_mysqludf_preg library to the server, which contains the preg_replace function. This is a PCRE implementation for MySQL and should work if I can solve the problem of a regex that selects every fourth ',' and replaces it with ',\r\n'.
// '$0' is the whole match; "\r\n" must be double-quoted so PHP emits a real CRLF.
$result = preg_replace('/(?:[^,]*,){4}/', '$0' . "\r\n", $subject);
This matches four comma-delimited values in a row (I'm assuming that you won't have commas inside of strings within a group) and adds a CRLF after them.
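To see what the pattern does on the sample data without a MySQL or PHP setup, here is a minimal Python sketch of the same substitution. It assumes Python's `re` module treats `(?:[^,]*,){4}` the same way PCRE does for this input; the function name `wrap_every_four` is only illustrative.

```python
import re

def wrap_every_four(csv_line: str) -> str:
    # (?:[^,]*,){4} consumes four "value," pieces in a row;
    # \g<0> puts the whole match back, followed by CR LF.
    return re.sub(r'(?:[^,]*,){4}', r'\g<0>\r\n', csv_line)

numbers = "88574,93243,129659,135504,136357,141052,141619,141619,142195"
print(repr(wrap_every_four(numbers)))
# '88574,93243,129659,135504,\r\n136357,141052,141619,141619,\r\n142195'
```

A trailing group with fewer than four values contains fewer than four commas, so it is left untouched.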
[EDIT] The above is a PHP-based solution.
For a pure MySQL solution, install lib_mysqludf_preg and use:
SELECT preg_replace('/(?:[^,]*,){4}/', '${0}\r\n', `fieldname`) as 'new_layout' from `tablename`;
Many thanks to all who contributed.
If you want to match every comma, then the more straightforward pattern
,
will work as well.
For matching every fourth comma, if MySQL supports look-behind, perhaps you could use
(?<=(^|\r\n)(\d+,){3}\d+),
That assumes that each replacement is performed before the next match is made, however. Otherwise perhaps
(?<=^((\d+,){4})*(\d+,){3}\d+),
would work.
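Note that some engines reject a look-behind containing an unbounded quantifier such as \d+ (Python's `re` module does, and as far as I know PCRE does too). As a sketch of the same idea without look-behind, the four numbers can be captured and written back instead. This is a swapped-in technique, not the look-behind pattern itself, and the function name `break_after_fourth` is only illustrative.

```python
import re

def break_after_fourth(csv_line: str) -> str:
    # Capture four numbers (three "number," pieces plus one more number),
    # then write them back with ",\r\n" in place of the fourth comma.
    return re.sub(r'((?:\d+,){3}\d+),', r'\1,\r\n', csv_line)

print(repr(break_after_fourth("1,2,3,4,5,6,7,8,9")))
# '1,2,3,4,\r\n5,6,7,8,\r\n9'
```

Because each match consumes the four numbers it captures, the next match starts at the fifth number, so no anchoring to the start of the string is needed.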