Match only Roman numerals with a regular expression

I'm trying to create a regular expression that matches Roman numerals and removes them only when other characters precede them. If nothing precedes the Roman numeral, I don't want to remove it. Here is an example:

string1 V
string2 VI
string3 XX
STRING4 I
STRING5 1340 I
2 STRING6 III
STRING7 V
STRING8 III
STRING9 II
STRING10 IV
STRING11 STRING12 VI
STRING13! IX
STRING14 VI
. STRING15 - STRING16_ V
STRING17 1/2 VI
STRING18 VIII
XIII (2011)
V (2012)
String19 VP
XII

The result should be:

string1
string2
string3
STRING4
STRING5 1340
2 STRING6
STRING7
STRING8
STRING9
STRING10
STRING11 STRING12
STRING13!
STRING14
. STRING15 - STRING16_
STRING17 1/2
STRING18
XIII (2011)
V (2012)
String19 VP
XII

Any help, please?

Thanks

Edit: I have just tried \b[IVXLCDM]+\b, but it also matches these lines, which should stay unchanged:

XIII (2011)
V (2012)
XII

You can use [ ]\bM{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b

This matches only valid Roman numerals.
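As a rough illustration of how the validating pattern could be applied line by line, here is a minimal PHP sketch. Several things in it are assumptions and not part of the answer: the helper name stripTrailingRoman is hypothetical, the pattern is anchored to the end of the line, and a lookahead is added because every group in the validating pattern is optional, so without it the pattern can match an empty numeral.

```php
<?php
// Hypothetical helper, not from the original answer: removes a trailing,
// well-formed Roman numeral only when other text precedes it.
function stripTrailingRoman(string $line): string
{
    $pattern = '/(?<=\S)\s+'      // whitespace that follows some other text
        . '(?=[MDCLXVI])'         // assumption: require at least one numeral letter
        . 'M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})'
        . '\s*$/';                // assumption: the numeral must end the line

    // preg_replace() returns null on error; fall back to the input in that case.
    return preg_replace($pattern, '', $line) ?? $line;
}

echo stripTrailingRoman('STRING10 IV'), "\n";  // STRING10
echo stripTrailingRoman('String19 VP'), "\n";  // String19 VP
echo stripTrailingRoman('XII'), "\n";          // XII
```

Because the numeral is validated, a malformed sequence such as IIII at the end of a line is left in place.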

If you want to match Roman numerals without validating them, you can use [ ]([MDCLXVI]+$)

See DEMO.
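The unvalidated pattern might be applied to a whole block of text like this. This is only a sketch: it assumes the input is a single multi-line string, so the m modifier is added to make $ match at the end of every line, and the variable names $text and $cleaned are made up for the example.

```php
<?php
// Sample input taken from the question, joined into one multi-line string.
$text = "string1 V\nSTRING5 1340 I\nXIII (2011)\nString19 VP\nXII";

// A single space followed by numeral letters at the end of a line.
// The "m" modifier (an assumption here) lets $ match before each newline.
$cleaned = preg_replace('/ [MDCLXVI]+$/m', '', $text);

echo $cleaned, "\n";
```

Lines that start with a numeral, such as XII, are untouched because the pattern requires a space in front of the numeral.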

<?php

$subjects = <<< EOT
string1 V
string2 VI
string3 XX
STRING4 I
STRING5 1340 I
2 STRING6 III
STRING7 V
STRING8 III
STRING9 II
STRING10 IV
STRING11 STRING12 VI
STRING13! IX
STRING14 VI
. STRING15 - STRING16_ V
STRING17 1/2 VI
STRING18 VIII
XIII (2011)
V (2012)
String19 VP
XII
EOT;

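// Drop a trailing Roman numeral only when some text precedes it on the line.
$pattern = '/(.+)\s+[IVXLCDM]+\s*$/';

foreach (explode("\n", $subjects) as $subject) {
  // $1 is the text before the numeral; lines that do not match pass through unchanged.
  echo preg_replace($pattern, '$1', $subject)."\n";
}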

This gives the output:

string1
string2
string3
STRING4
STRING5 1340
2 STRING6
STRING7
STRING8
STRING9
STRING10
STRING11 STRING12
STRING13!
STRING14
. STRING15 - STRING16_
STRING17 1/2
STRING18
XIII (2011)
V (2012)
String19 VP
XII

Note: this also removes any whitespace between the preceding text and the removed Roman numeral. If you want to preserve that whitespace, move the \s+ inside the capturing parentheses: '/(.+\s+)[IVXLCDM]+\s*$/'.
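To make the difference concrete, the following sketch runs both variants on one sample line from the question. The variable names $trimmed and $preserved are made up for the example, and the square brackets in the output are only there to make the trailing space visible.

```php
<?php
$line = 'STRING17 1/2 VI';

// \s+ outside the group: the separating whitespace is removed with the numeral.
$trimmed = preg_replace('/(.+)\s+[IVXLCDM]+\s*$/', '$1', $line);

// \s+ inside the group: the separating whitespace is kept.
$preserved = preg_replace('/(.+\s+)[IVXLCDM]+\s*$/', '$1', $line);

echo "[$trimmed]\n";    // [STRING17 1/2]
echo "[$preserved]\n";  // [STRING17 1/2 ]
```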
