Regex: Match all occurrences between specified strings
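I'm working with a bunch of text files that reference image filenames. The filenames have already been sanitized (lowercased, with spaces replaced by hyphens), but the text that references them has not.
I need to convert strings like these: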
(image: uploaded IMAGE.jpg caption: this is my caption)
(image: uploaded IMAGE copy.jpeg caption: this is my caption)
(image: IMG_6087.png caption: this is my caption)
(image: IMG_6087 copy.gif)
(image: IMG_9999_copy.jpg)
(image: somehow, a comma.jpg)
(image: other ridic'ulous characters!.jpg)
to:
(image: uploaded-image.jpg caption: this is my caption)
(image: uploaded-image-copy.jpeg caption: this is my caption)
(image: img_6087.png caption: this is my caption)
(image: img_6087-copy.gif)
(image: img_9999_copy.jpg)
(image: somehow-a-comma.jpg)
(image: other-ridiculous-characters.jpg)
These strings are part of larger blocks of text, but each one sits on its own line, like so:
This is not a short guide to write about art. Go in, out of the window, inside New York’s stars qualities, dreams and schemes. People are gathered together, brewing coffee — you have seen their faces? The artists in Manhattan.
(image: manhattan photo.jpg)
Drive till sunset and say goodbye to your body, because this is not a photograph. I saw sixteen americans, raised by wolves, probably lost in paradise city. I found your head — Do you still want it?
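I'm using Sublime Text and plan to run several replace operations.
However, I can't work out how to capture every instance between the two delimiters: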
(?<=^\(image: )[what do I do here??](?=\.jpe?g|png|gif)
You can use the non-greedy match-all .*?
So ^\(image: (.*?\.(?:jpe?g|png|gif))
captures the filename including its extension.
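To check what that pattern captures outside the editor, here is a minimal PHP sketch. The sample text is assembled from lines in the question, and the m flag is my addition so that ^ matches at the start of every line; treat it as an illustration rather than a drop-in solution.

```php
<?php
// Sample text assumed for illustration; the image lines come from the question.
$text = "Some paragraph text.\n"
      . "(image: manhattan photo.jpg)\n"
      . "(image: uploaded IMAGE copy.jpeg caption: this is my caption)\n"
      . "More paragraph text.";

// m flag: ^ matches at the start of each line, not only at the start of the string.
preg_match_all('~^\(image: (.*?\.(?:jpe?g|png|gif))~m', $text, $matches);

// $matches[1] holds the first capture group of every match.
print_r($matches[1]);
```

This should list manhattan photo.jpg and uploaded IMAGE copy.jpeg, with the caption text left out because .*? stops at the first extension it can reach.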
You can get the filename (without the extension) with:
(?<=image:\s)([^.]++)(?=\.jpe?g|\.png|\.gif)
After that, the conversion depends on the language you are using. Add file extensions as needed; right now this supports jpg, jpeg, png and gif.
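As one possible way to do the conversion step, the sketch below feeds that pattern to PHP's preg_replace_callback. The choice of PHP and the lowercase-then-hyphenate callback are assumptions on my part; the answer itself leaves the conversion open.

```php
<?php
// One line from the question, used as sample input.
$line = "(image: IMG_6087 copy.gif)";

$result = preg_replace_callback(
    '~(?<=image:\s)([^.]++)(?=\.jpe?g|\.png|\.gif)~',
    function (array $m): string {
        // Lowercase the name, then turn each run of whitespace into one hyphen.
        return preg_replace('~\s+~', '-', strtolower($m[1]));
    },
    $line
);

echo $result, "\n"; // (image: img_6087-copy.gif)
```

Because [^.]++ cannot cross a dot, a filename that contains an extra dot (for example my.photo.jpg) would not match this pattern at all.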
Here is a way to get this done in PHP:
<?php
$string =
"This is not a short guide to write about art. Go in, out of the window, inside New York’s stars qualities, dreams and schemes. People are gathered together, brewing coffee — you have seen their faces? The artists in Manhattan.
(image: uploaded IMAGE.jpg caption: this is my caption)
This is not a short guide to write about art. Go in, out of the window, inside New York’s stars qualities, dreams and schemes. People are gathered together, brewing coffee — you have seen their faces? The artists in Manhattan.
(image: uploaded IMAGE copy.jpeg caption: this is my caption)
(image: IMG_6087.png caption: this is my caption)
(image: IMG_6087 copy.gif) blah blah
(image: IMG_9999_copy.jpg)
(image: somehow, a comma.jpg)
(image: other ridic'ulous characters!.jpg)";
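// Rewrite each filename found after "(image: " and keep its extension.
echo preg_replace_callback('~(?<=\(image: )(.*?)\.(jpg|jpeg|png|gif)~', function ($matches) {
    // Lowercase the name, then replace every non-word character with a hyphen.
    return preg_replace('~\W~', '-', stripslashes(strtolower($matches[1]))) . ".$matches[2]";
}, $string);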
?>
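[Edit] Adding an explanation of the regexes:
(?<=\(image: ): a lookbehind, so it checks that '(image: ' is present without capturing it.
(.*?): captures everything before the image extension lazily (non-greedy), so as little text as possible is matched.
\.(jpg|jpeg|png|gif): matches a literal . followed by one of the given extensions, and captures the extension for reuse.
~: the delimiter, chosen only because it rarely appears in strings, so / does not need escaping with \.
\W: the opposite of \w; it matches any character that is not a letter, digit, or underscore.
This will output (in view source):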
This is not a short guide to write about art. Go in, out of the window, inside New York’s stars qualities, dreams and schemes. People are gathered together, brewing coffee — you have seen their faces? The artists in Manhattan.
(image: uploaded-image.jpg caption: this is my caption)
This is not a short guide to write about art. Go in, out of the window, inside New York’s stars qualities, dreams and schemes. People are gathered together, brewing coffee — you have seen their faces? The artists in Manhattan.
(image: uploaded-image-copy.jpeg caption: this is my caption)
(image: img_6087.png caption: this is my caption)
(image: img_6087-copy.gif) blah blah
(image: img_9999_copy.jpg)
(image: somehow--a-comma.jpg)
(image: other-ridic-ulous-characters-.jpg)
You can then use str_replace() inside the callback to fine-tune which characters get converted to what.
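For instance, a rough sketch of that fine-tuning could look like the following. The list of characters to drop (comma, apostrophe, exclamation mark) is a hypothetical choice based on the sample lines in the question, and \W+ is used instead of \W so that a run of separators collapses into a single hyphen.

```php
<?php
// Sample lines taken from the question.
$lines = "(image: somehow, a comma.jpg)\n"
       . "(image: other ridic'ulous characters!.jpg)\n"
       . "(image: IMG_9999_copy.jpg)";

$result = preg_replace_callback(
    '~(?<=\(image: )(.*?)\.(jpg|jpeg|png|gif)~',
    function (array $m): string {
        // Hypothetical list of characters to drop outright; extend as needed.
        $name = str_replace([',', "'", '!'], '', strtolower($m[1]));
        // Any remaining run of non-word characters becomes a single hyphen.
        return preg_replace('~\W+~', '-', $name) . '.' . $m[2];
    },
    $lines
);

echo $result, "\n";
```

With these three lines the names should come out as somehow-a-comma.jpg, other-ridiculous-characters.jpg and img_9999_copy.jpg, which is the form the question asks for.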
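Hope that helps! ;)
Could you try the JetBrains WebStorm front-end IDE? It offers many features for carrying out regex operations in a readable way. Select the text you want to split and check whether it contains delimiters or any whitespace.
You get a 30-day trial. I will also share the regex query with you soon.
You can also check out http://myregexp.com/ or a plugin to validate your regex query.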