Regex to get URL or filename in a string separated by comma or semicolon

I am trying to write a regex pattern that gets either a full URL or just a filename with its extension.

The input string looks like

DERZHATEL_DLYA_POLOTENETS_3618_45_SM_3.jpg,DERZHATEL_DLYA_POLOTENETS_3618_45,4_SM_4.jpg

or

https://yandex.ru/upload/iblock/f33/DERZHATEL_3880_3.jpg;http://www.yandex.ru/upload/iblock/f33/DERZHATEL_DLYA_POLOTENETS_3880_3.jpg

The items in a string can be separated by a comma or a semicolon.

Important: a filename can also include a comma!

For the output I would like to see, respectively:

DERZHATEL_DLYA_POLOTENETS_3618_45_SM_3.jpg

DERZHATEL_DLYA_POLOTENETS_3618_45,4_SM_4.jpg

https://yandex.ru/upload/iblock/f33/DERZHATEL_3880_3.jpg

http://www.yandex.ru/upload/iblock/f33/DERZHATEL_DLYA_POLOTENETS_3880_3.jpg

My pattern does not cover URLs, only filenames without a path (strings 1 and 2):

(?:(?:(?:\w*$).\/)|\w+.{1})\w+.\w+\.\w{3,4}

If the separator is either a comma or a semicolon, and the first char of the filename cannot be a comma or semicolon, you could use

[^\s,;]\S*?\.\w{3,4}(?![^\s,;])

Explanation

  • [^\s,;] Match any char except a whitespace char, , and ;
  • \S*? Match 0+ non-whitespace chars, non-greedy (as few as possible)
  • \.\w{3,4} Match a . and 3-4 word characters
  • (?![^\s,;]) Negative lookahead: assert that what is directly to the right is not any char other than a whitespace char, , or ; (so the match must be followed by whitespace, a separator, or the end of the string)

Regex demo

 const regex = /[^\s,;]\S*?\.\w{3,4}(?![^\s,;])/g;
 // The two sample inputs from the question
 [
   "DERZHATEL_DLYA_POLOTENETS_3618_45_SM_3.jpg,DERZHATEL_DLYA_POLOTENETS_3618_45,4_SM_4.jpg",
   "https://yandex.ru/upload/iblock/f33/DERZHATEL_3880_3.jpg;http://www.yandex.ru/upload/iblock/f33/DERZHATEL_DLYA_POLOTENETS_3880_3.jpg"
 ].forEach(s => console.log(s.match(regex)));
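To see what the trailing lookahead contributes, here is a small sketch that runs the same pattern with and without it on one of the URLs from the question. The variable names are made up for this illustration. Without the lookahead, the non-greedy part stops at the first dot followed by 3-4 word characters, which falls inside the host name.

```javascript
// Same pattern with and without the trailing negative lookahead.
const withLookahead = /[^\s,;]\S*?\.\w{3,4}(?![^\s,;])/g;
const withoutLookahead = /[^\s,;]\S*?\.\w{3,4}/g;

// Sample URL taken from the question.
const sampleUrl =
  "http://www.yandex.ru/upload/iblock/f33/DERZHATEL_DLYA_POLOTENETS_3880_3.jpg";

const strictMatches = sampleUrl.match(withLookahead);
const looseMatches = sampleUrl.match(withoutLookahead);

console.log(strictMatches); // one match: the whole URL
console.log(looseMatches);  // first match is cut off at "http://www.yand"
```

The lookahead forces the engine to keep extending the match until the extension is followed by whitespace, a separator, or the end of the string.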

If the filename can also start with either , or ; you might use a negative lookbehind to assert that what is directly to the left is not any char other than a whitespace char, , or ;

See the support for lookbehind in JS regular expressions.

(?<![^\s,;])\S+?\.\w{3,4}(?![^\s,;])
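A minimal sketch of how the lookbehind variant could be used on the two inputs from the question. The helper name `extractItems` is hypothetical, and the snippet assumes a JavaScript engine that supports lookbehind assertions.

```javascript
// Lookbehind variant: a match must start at the beginning of the string
// or right after whitespace, a comma, or a semicolon.
const lookbehindPattern = /(?<![^\s,;])\S+?\.\w{3,4}(?![^\s,;])/g;

// Hypothetical helper: returns all matches, or an empty array if none.
function extractItems(input) {
  return input.match(lookbehindPattern) || [];
}

const fileList =
  "DERZHATEL_DLYA_POLOTENETS_3618_45_SM_3.jpg,DERZHATEL_DLYA_POLOTENETS_3618_45,4_SM_4.jpg";
const urlList =
  "https://yandex.ru/upload/iblock/f33/DERZHATEL_3880_3.jpg;http://www.yandex.ru/upload/iblock/f33/DERZHATEL_DLYA_POLOTENETS_3880_3.jpg";

console.log(extractItems(fileList)); // two filenames, the second containing a comma
console.log(extractItems(urlList));  // two full URLs
```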

Regex demo

This would do it:

(?:^|\/|,|;)([^\/]+?\.\w{3,4})(?=,|;|$)

and your matches will be in capture group #1.

https://regex101.com/r/ok520U/1
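As a rough sketch of reading capture group #1 in JavaScript, the snippet below collects the group from every match with `String.prototype.matchAll`. The helper name `extractCaptured` is an assumption made for this example. Note that because `[^\/]+?` cannot cross a slash, for the URL input the group holds only the last path segment (the filename), not the full URL.

```javascript
// Pattern from the answer above; the wanted text is in capture group 1.
const capturePattern = /(?:^|\/|,|;)([^\/]+?\.\w{3,4})(?=,|;|$)/g;

// Hypothetical helper: collects capture group 1 from every match.
function extractCaptured(input) {
  return Array.from(input.matchAll(capturePattern), (m) => m[1]);
}

const files =
  "DERZHATEL_DLYA_POLOTENETS_3618_45_SM_3.jpg,DERZHATEL_DLYA_POLOTENETS_3618_45,4_SM_4.jpg";
const urls =
  "https://yandex.ru/upload/iblock/f33/DERZHATEL_3880_3.jpg;http://www.yandex.ru/upload/iblock/f33/DERZHATEL_DLYA_POLOTENETS_3880_3.jpg";

console.log(extractCaptured(files)); // both filenames, comma preserved in the second
console.log(extractCaptured(urls));  // filenames only, without scheme, host, or path
```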
