Regex: Match all occurrences between specified strings

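I'm working with a bunch of text files that reference image file names. Those file names have been sanitized (lowercased, with spaces replaced by hyphens), but the text that references them has not.

I need to transform strings like these: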

(image: uploaded IMAGE.jpg caption: this is my caption)
(image: uploaded IMAGE copy.jpeg caption: this is my caption)
(image: IMG_6087.png caption: this is my caption)
(image: IMG_6087 copy.gif)
(image: IMG_9999_copy.jpg)
(image: somehow, a comma.jpg)
(image: other ridic'ulous characters!.jpg)

to:

(image: uploaded-image.jpg caption: this is my caption)
(image: uploaded-image-copy.jpeg caption: this is my caption)
(image: img_6087.png caption: this is my caption)
(image: img_6087-copy.gif)
(image: img_9999_copy.jpg)
(image: somehow-a-comma.jpg)
(image: other-ridiculous-characters.jpg)

These strings are part of larger blocks of text, but each one sits on its own line, like this:

This is not a short guide to write about art. Go in, out of the window, inside New York’s stars qualities, dreams and schemes. People are gathered together, brewing coffee — you have seen their faces? The artists in Manhattan.

(image: manhattan photo.jpg)

Drive till sunset and say goodbye to your body, because this is not a photograph. I saw sixteen americans, raised by wolves, probably lost in paradise city. I found your head — Do you still want it?

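I'm using Sublime Text and plan to do this in several find-and-replace passes:

  1. Replace spaces with hyphens
  2. Strip characters that aren't alphanumeric, _ or -
  3. Lowercase

But I can't work out how to capture every instance between the two delimiters.

(?<=^\(image: )[what do I do here??](?=\.jpe?g|png|gif)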

You can use the non-greedy match-all, .*?

So ^\(image: (.*?\.(?:jpe?g|png|gif)) captures the file name including its extension.
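To see the lazy capture at work outside the editor, here is a minimal Python sketch. Python and the sample text are my own choices for illustration, not part of the answer. It assumes the non-capturing group is meant as (?:...) and uses multiline mode so that ^ anchors at every line.

```python
import re

# The answer's pattern, with the non-capturing group spelled (?:...).
# re.MULTILINE lets ^ match at the start of every line, not just the string.
PATTERN = re.compile(r"^\(image: (.*?\.(?:jpe?g|png|gif))", re.MULTILINE)

# Hypothetical sample text, modelled on the question's examples.
text = """Some paragraph of prose.

(image: uploaded IMAGE.jpg caption: this is my caption)

More prose in between.

(image: IMG_6087 copy.gif)
(image: somehow, a comma.jpg)
"""

# With a single capturing group, findall returns that group for each match.
filenames = PATTERN.findall(text)
for name in filenames:
    print(name)
```

Because .*? stops at the first extension it can reach, the caption on the first line is not swallowed into the capture.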

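You can get the file name (without its extension) with:

(?<=image:\s)([^.]++)(?=\.jpe?g|\.png|\.gif)

After that, the transformation depends on the language you are using. Add more file extensions as needed; as written, the pattern supports jpg, jpeg, png and gif.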
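As one illustration of the transformation step, here is a rough Python sketch (Python is my choice, not something the answer prescribes). It applies the three passes from the question to each matched name. The helper name sanitize and the sample lines are made up for the example. The possessive ++ is written as a plain + because Python's re module only accepts possessive quantifiers from version 3.11 on.

```python
import re

# Lookbehind/lookahead from the answer; "+" stands in for the possessive "++".
STEM = re.compile(r"(?<=image:\s)([^.]+)(?=\.jpe?g|\.png|\.gif)")


def sanitize(stem):
    """Hypothetical helper: apply the question's three passes to one name."""
    stem = re.sub(r"\s+", "-", stem)            # 1. whitespace -> hyphen
    stem = re.sub(r"[^A-Za-z0-9_-]", "", stem)  # 2. drop everything else
    return stem.lower()                         # 3. lowercase


# Hypothetical sample lines taken from the question's examples.
text = "\n".join([
    "(image: uploaded IMAGE copy.jpeg caption: this is my caption)",
    "(image: IMG_9999_copy.jpg)",
    "(image: other ridic'ulous characters!.jpg)",
])

# Only the stem is matched, so the extension and caption are left untouched.
result = STEM.sub(lambda m: sanitize(m.group(1)), text)
print(result)
```

Note that [^.] stops at the first dot, so this sketch assumes file names contain no dots other than the one before the extension.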

Here is a way to do this in PHP:

<?php
$string =
"This is not a short guide to write about art. Go in, out of the window, inside New York’s stars qualities, dreams and schemes. People are gathered together, brewing coffee — you have seen their faces? The artists in Manhattan.

(image: uploaded IMAGE.jpg caption: this is my caption)
This is not a short guide to write about art. Go in, out of the window, inside New York’s stars qualities, dreams and schemes. People are gathered together, brewing coffee — you have seen their faces? The artists in Manhattan.

(image: uploaded IMAGE copy.jpeg caption: this is my caption)
(image: IMG_6087.png caption: this is my caption)
(image: IMG_6087 copy.gif) blah blah
(image: IMG_9999_copy.jpg)
(image: somehow, a comma.jpg)
(image: other ridic'ulous characters!.jpg)";

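// Find each file name that follows "(image: " and rewrite it in a callback.
// $matches[1] is the name without its extension, $matches[2] is the extension.
echo preg_replace_callback('~(?<=\(image: )(.*?)\.(jpg|jpeg|png|gif)~', function($matches)
{
    // Lowercase the name, then turn every non-word character into a hyphen.
    return preg_replace('~\W~', '-', stripslashes(strtolower($matches[1]))) . ".$matches[2]";
}, $string);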

?>

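[Edit] Adding an explanation of the regex:

  • (?<=\(image: ) : is a lookbehind, so it checks that '(image: ' is present without capturing it.
  • (.*?) : captures everything before the image extension in a non-greedy way, so it matches as little text as possible.
  • \.(jpg|jpeg|png|gif) : matches a literal . followed by one of the given extensions, and captures the extension for reuse.
  • ~ : is the delimiter, chosen simply because it rarely appears in strings, so / does not need to be escaped.
  • \W : the opposite of \w; it matches any non-word character (anything other than a letter, a digit or an underscore).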
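The difference the lazy quantifier makes, and the effect of \W, can be checked with a small Python sketch. The sample line is invented for the demonstration, and Python is simply the language I picked. Its re module treats these particular constructs the same way as PCRE.

```python
import re

# Hypothetical line whose caption happens to mention a second file name.
line = "(image: photo one.jpg caption: compare with photo two.jpg)"

# Lazy: stops at the first extension it can reach.
lazy = re.search(r"(?<=\(image: )(.*?)\.(jpg|jpeg|png|gif)", line)
# Greedy: runs on to the last extension on the line.
greedy = re.search(r"(?<=\(image: )(.*)\.(jpg|jpeg|png|gif)", line)

print(lazy.group(1))    # photo one
print(greedy.group(1))  # photo one.jpg caption: compare with photo two

# \W matches each non-word character separately, so the comma and the
# space both become hyphens.
dashed = re.sub(r"\W", "-", "somehow, a comma")
print(dashed)           # somehow--a-comma
```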

This will output (in view source):

This is not a short guide to write about art. Go in, out of the window, inside New York’s stars qualities, dreams and schemes. People are gathered together, brewing coffee — you have seen their faces? The artists in Manhattan.

(image: uploaded-image.jpg caption: this is my caption)
This is not a short guide to write about art. Go in, out of the window, inside New York’s stars qualities, dreams and schemes. People are gathered together, brewing coffee — you have seen their faces? The artists in Manhattan.

(image: uploaded-image-copy.jpeg caption: this is my caption)
(image: img_6087.png caption: this is my caption)
(image: img_6087-copy.gif) blah blah
(image: img_9999_copy.jpg)
(image: somehow--a-comma.jpg)
(image: other-ridic-ulous-characters-.jpg)

You can then use str_replace() inside the callback to fine-tune which characters get converted to what.
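For comparison, here is how that kind of fine-tuning might look in a Python port of the callback. This is only a sketch: the function name clean_name and the specific choices (drop apostrophes, collapse runs of non-word characters into one hyphen, trim hyphens at the ends) are my assumptions about what "fine-tuning" could mean, not the answer's own code.

```python
import re

# Same pattern as the PHP example: name in group 1, extension in group 2.
IMAGE = re.compile(r"(?<=\(image: )(.*?)\.(jpg|jpeg|png|gif)")


def clean_name(match):
    """Hypothetical callback: lowercase, hyphenate and tidy one file name."""
    stem = match.group(1).lower()
    stem = stem.replace("'", "")      # fine-tuning: drop apostrophes outright
    stem = re.sub(r"\W+", "-", stem)  # a run of non-word characters -> one hyphen
    stem = stem.strip("-")            # no leading or trailing hyphen
    return f"{stem}.{match.group(2)}"


# Hypothetical sample lines taken from the question's examples.
text = "\n".join([
    "(image: IMG_6087 copy.gif) blah blah",
    "(image: somehow, a comma.jpg)",
    "(image: other ridic'ulous characters!.jpg)",
])

cleaned = IMAGE.sub(clean_name, text)
print(cleaned)
```

With these tweaks the last two names come out as somehow-a-comma.jpg and other-ridiculous-characters.jpg, which is what the question asked for.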

Hope this helps! ;)

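You could try JetBrains WebStorm, a front-end IDE. It has many features that let you carry out regex operations in a readable way. Select the text you want to split and check whether it contains delimiters or any whitespace.

You get a 30-day trial. I will also share the regex query with you soon.

You can also check out http://myregexp.com/ (an online regex editor) or a plugin to validate your regex query.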
