
Regex to Find And Replace Indic Characters

I have a text file as follows:

{{https://www.test.com/events/test-event-२०१८/२०१८-.१-entry-list|caption=test event of २०१८}}{{https://www.test.com/events/test-event-३१/३१-.१-entry-list|caption=test event of ३१}}{{https://www.test.com/events/test-event-१८/१८-.१-entry-list|caption=test event of १८}}

I want to change all instances of Indic characters to their English equivalents, but only in the URL(s) and not in the captions.

For example, २ becomes 2, and so on. I am trying to write a regex that replaces all such instances between the "/" characters of the URLs. No luck so far!

My code is as follows:

<?php
$pattern = "/\/([२]+)\//u";
$text=file_get_contents("Test.txt");
$text = preg_replace($pattern,'2',$text);
file_put_contents("MR-Test.txt",$text);
?>

Nothing seems to work so far!

Edit: The URL(s) I am using are in a text file, and I have to replace only the Indic text in the URLs and nowhere else.

The part we want to replace comes first, and we capture it; then we capture the part we want to leave alone (the caption) using alternation:

(२)|(caption=(.+?)}})

Finally, we replace each match with 2 and $2.

Test:

$re = '/(२)|(caption=(.+?)}})/m';
$str = '{{https://www.test.com/events/test-event-२०१८/२०१८-.१-entry-list|caption=test event of २०१८}}{{https://www.test.com/events/test-event-३१/३१-.१-entry-list|caption=test event of ३१}}{{https://www.test.com/events/test-event-१८/१८-.१-entry-list|caption=test event of १८}}';
$subst = '2$2';

$result = preg_replace($re, $subst, $str);

echo $result;

Output:

{{https://www.test.com/events/test-event-2०१८/2०१८-.१-entry-list|2caption=test event of २०१८}}{{https://www.test.com/events/test-event-३१/३१-.१-entry-list|2caption=test event of ३१}}{{https://www.test.com/events/test-event-१८/१८-.१-entry-list|2caption=test event of १८}}
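The output above covers only २, and the replacement string also puts a 2 in front of each caption. One way the same alternation idea might be completed is sketched below, assuming preg_replace_callback is acceptable here: the callback returns a matched caption unchanged and looks up a matched digit in a table. The function name convertDigitsOutsideCaptions and the $map table are illustrative and not part of the original answer.

```php
<?php
// Hypothetical helper: converts Devanagari digits to ASCII digits
// everywhere except inside "caption=...}}" segments.
function convertDigitsOutsideCaptions(string $text): string
{
    // Devanagari digit => ASCII digit lookup table (assumed mapping).
    $map = [
        '०' => '0', '१' => '1', '२' => '2', '३' => '3', '४' => '4',
        '५' => '5', '६' => '6', '७' => '7', '८' => '8', '९' => '9',
    ];

    // Group 1 captures a whole caption segment; group 2 captures one digit.
    // The caption branch comes first so digits inside it are consumed with it.
    $pattern = '/(caption=.+?}})|([०-९])/u';

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($map): string {
            // Group 2 is only set and non-empty when the digit branch matched.
            if (isset($m[2]) && $m[2] !== '') {
                return $map[$m[2]];
            }
            // Otherwise the caption branch matched: return it untouched.
            return $m[0];
        },
        $text
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result ?? $text;
}

$in = '{{https://www.test.com/events/test-event-२०१८/२०१८-.१-entry-list|caption=test event of २०१८}}';
echo convertDigitsOutsideCaptions($in), "\n";
```

With the sample entry, the digits in the URL become 2018 and .1 while the caption keeps its original २०१८.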

Here is a way to do the job with preg_replace_callback: first find the right digits to replace, then replace them in the callback:

$text = '{{https://www.test.com/events/test-event-२०१८/२०१८-.१-entry-list|caption=test event of २०१८}}{{https://www.test.com/events/test-event-३१/३१-.१-entry-list|caption=test event of ३१}}{{https://www.test.com/events/test-event-१८/१८-.१-entry-list|caption=test event of १८}}';

$res = preg_replace_callback('/caption=.+?}}(*SKIP)(*F)|[०१२३४५६७८९]/u', 
                    function($m) {
                        return preg_replace(
                            array('/०/','/१/','/२/','/३/','/४/','/५/','/६/','/७/','/८/','/९/'), 
                            array('0','1','2','3','4','5','6','7','8','9'), 
                            $m[0]);
                    }
                    , $text);
echo $res,"\n";

Output:

{{https://www.test.com/events/test-event-2018/2018-.1-entry-list|caption=test event of २०१८}}{{https://www.test.com/events/test-event-31/31-.1-entry-list|caption=test event of ३१}}{{https://www.test.com/events/test-event-18/18-.1-entry-list|caption=test event of १८}}

Explanation:

caption=.+?}}       # matches caption until }}
(*SKIP)(*F)         # and skip that match
|                   # OR
[०१२३४५६७८९]          # 1 digit
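
The pattern explained above could also be wrapped in a small function that translates each run of digits with strtr() rather than a nested preg_replace. This is only a sketch under assumptions: the function name urlDigitsToAscii and the $map table are hypothetical, and the file names in the closing comment are the ones from the question.

```php
<?php
// Hypothetical helper combining the (*SKIP)(*F) pattern with strtr().
function urlDigitsToAscii(string $text): string
{
    // Devanagari digit => ASCII digit lookup table (assumed mapping).
    $map = [
        '०' => '0', '१' => '1', '२' => '2', '३' => '3', '४' => '4',
        '५' => '5', '६' => '6', '७' => '7', '८' => '8', '९' => '9',
    ];

    // Left branch: match a caption segment, then discard it with (*SKIP)(*F).
    // Right branch: match a run of one or more Devanagari digits.
    $pattern = '/caption=.+?}}(*SKIP)(*F)|[०-९]+/u';

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($map): string {
            // strtr() with an array replaces every key found in the match.
            return strtr($m[0], $map);
        },
        $text
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result ?? $text;
}

// Possible usage with the file names from the question:
// file_put_contents('MR-Test.txt', urlDigitsToAscii(file_get_contents('Test.txt')));
```

Matching whole runs of digits means the callback runs once per number instead of once per digit, and strtr() avoids compiling ten separate patterns on every call.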
