I am trying to parse the HTML of a directory listing page using C#. The page has many file URLs like "0220109_120548.046.jpg", but also others like "0220109_120548.046-445x265.jpg". They are the same picture, but the second one has its dimensions in the name.
I need a regex that matches only the URLs of the files without the dimensions.
I tried this one: href="^"*.(gif|jpg|png)"
but it's not working.
The regex101 URL: https://regex101.com/r/APS9NY/1
Here is one way to do so:
href=\"[^\"]*?(?<!\d{2,4}x\d{2,4})\.(gif|jpg|png)\"
See here for the online demo.
href=\" : Matches href="
[^\"]*? : Any character that isn't ", between zero and unlimited times, as few as possible.
(?<!...) : Negative lookbehind.
\d{2,4} : Matches between 2 and 4 digits.
x : Matches x.
\. : Matches .
(gif|jpg|png) : Matches either gif, jpg or png.
\" : Matches ".
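Since the question is about C#, here is a minimal sketch of how the pattern could be used from .NET. The sample HTML is made up for illustration, and the pattern is slightly adapted from the one above: a capture group is added around the URL and the extension group is made non-capturing, so that group 1 holds just the file name. It assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical sample markup; the real directory listing will look different.
string html = @"
<a href=""0220109_120548.046.jpg"">full size</a>
<a href=""0220109_120548.046-445x265.jpg"">thumbnail</a>
<a href=""notes.txt"">notes</a>";

// Same idea as the pattern above, written as a verbatim string ("" is an escaped quote).
// Group 1 captures the URL; the lookbehind rejects names that end in e.g. 445x265.
var pattern = new Regex(@"href=""([^""]*?(?<!\d{2,4}x\d{2,4})\.(?:gif|jpg|png))""");

var urls = new List<string>();
foreach (Match m in pattern.Matches(html))
{
    urls.Add(m.Groups[1].Value);
}

foreach (string url in urls)
{
    Console.WriteLine(url); // prints 0220109_120548.046.jpg
}
```

If the listing may contain upper-case extensions such as .JPG, passing RegexOptions.IgnoreCase to the Regex constructor should cover that.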