
Regex to match specific file names

I am trying to parse the HTML of a directory listing page using C#. The page has many file URLs like "0220109_120548.046.jpg", but also others like "0220109_120548.046-445x265.jpg". They are the same picture, but one has its dimensions in the name.

I need a regex that matches only the URLs of the files without the dimensions.

I tried this one: href="^"*.(gif|jpg|png)"

but it's not working.

The regex101 URL: https://regex101.com/r/APS9NY/1

Here is one way to do so:

href=\"[^\"]*?(?<!\d{2,4}x\d{2,4})\.(gif|jpg|png)\"



  • href=\" : Matches href="
  • [^\"]*? : Any character that isn't " , between zero and unlimited times, as few as possible.
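  • (?<!\d{2,4}x\d{2,4}) : Negative lookbehind. The match fails if the text immediately before the extension's dot is a dimension suffix such as 445x265 .
    • \d{2,4} : Matches between 2 and 4 digits.
    • x : Matches x .
    • \d{2,4} : Matches between 2 and 4 more digits.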
  • \. : Matches . .
  • (gif|jpg|png) : Matches either gif , jpg or png .
  • \" : Matches " .
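
As a rough illustration of how the pattern could be used from C#, here is a minimal sketch. The sample HTML string is made up for the example, and the pattern is a slight variation on the one above: a capturing group is added around the URL so it can be read from Groups[1], and the extension group is made non-capturing. Inside a C# verbatim string, "" stands for a literal quote.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Hypothetical sample of the directory listing HTML (not from the real page).
        string html =
            "<a href=\"0220109_120548.046.jpg\">a</a>\n" +
            "<a href=\"0220109_120548.046-445x265.jpg\">b</a>\n" +
            "<a href=\"logo.png\">c</a>";

        // Same idea as the pattern above; group 1 captures the URL itself.
        var pattern = new Regex(
            @"href=""([^""]*?(?<!\d{2,4}x\d{2,4})\.(?:gif|jpg|png))""");

        foreach (Match m in pattern.Matches(html))
        {
            // Prints only the URLs that do not end in a WIDTHxHEIGHT suffix.
            Console.WriteLine(m.Groups[1].Value);
        }
    }
}
```

With this sample input, the loop prints 0220109_120548.046.jpg and logo.png, and skips the -445x265 variant. For real pages, an HTML parser may be more robust than a regex for extracting the href values before filtering them.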

