
JavaScript regex to remove everything after the file extension

I have the following sample URL, which I need to sanitize:

http://image.s5a.com/is/image/saks/0447522591096_647x329.jpg" border="0" params="">

into

http://image.s5a.com/is/image/saks/0447522591096_647x329.jpg

My question is: which regex should I use to flexibly remove everything after the extension, regardless of whether it's .jpg, .png or .jpeg?

Also, the text and symbols after the extension will be different each time.

Thanks

(.*?\.(?:jpe?g|png))|.*

Try this. Replace with $1. See the demo:

http://regex101.com/r/rQ6mK9/47

You can use:

var s = 'http://image.s5a.com/is/image/saks/0447522591096_647x329.jpg" border="0" params="">';
var r = s.replace(/^(.+?\.(png|jpe?g)).*$/i, '$1');
//=> http://image.s5a.com/is/image/saks/0447522591096_647x329.jpg
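If the list of extensions needs to change over time, the same idea can be wrapped in a small helper. This is only a sketch: the function name stripAfterExtension and the default extension list are assumptions for illustration, not part of the answer above. It also adds a word boundary after the extension so that something like ".jpgx" is not treated as ".jpg".

```javascript
// Hypothetical helper: keep everything up to and including the first
// matching extension and drop whatever follows it.
function stripAfterExtension(input, extensions) {
  // Default list is an assumption; pass your own array to override it.
  var exts = extensions || ['png', 'jpg', 'jpeg', 'gif'];

  // Escape regex metacharacters so each extension is matched literally.
  var escaped = exts.map(function (ext) {
    return ext.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  });

  // Lazy prefix, a literal dot, one of the extensions, a word boundary,
  // then the rest of the string (including any newlines).
  var pattern = new RegExp(
    '^(.+?\\.(?:' + escaped.join('|') + '))\\b[\\s\\S]*$',
    'i'
  );

  // If nothing matches, replace() returns the input unchanged.
  return input.replace(pattern, '$1');
}

var sample = 'http://image.s5a.com/is/image/saks/0447522591096_647x329.jpg" border="0" params="">';
console.log(stripAfterExtension(sample));
```

Because the prefix is lazy, the match stops at the first extension it finds, so a path such as "a.png/b.jpg" would be cut after ".png". Whether that is acceptable depends on the URLs you expect.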

An alternative to regex would be to just use basic string parsing.

var fullUrl = 'http://image.s5a.com/is/image/saks/0447522591096_647x329.jpg" border="0" params="">';
// Split on the closing quote; splitting on ' ' would leave the trailing " attached.
var baseUrl = fullUrl.split('"')[0];

Edit: you may also want to decode the URL so you don't get tripped up by percent-encoding.

var fullUrl = 'http://image.s5a.com/is/image/saks/0447522591096_647x329.jpg" border="0" params="">';
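fullUrl = decodeURI(fullUrl); // reassign instead of redeclaring with var
// Split on the closing quote; splitting on ' ' would leave the trailing " attached.
var baseUrl = fullUrl.split('"')[0];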
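The split approach above depends on one specific delimiter. A slightly more defensive sketch cuts at the first character that would not normally appear unescaped in a URL, and only decodes afterwards. The names cutAtDelimiter and safeDecode, and the choice of stop characters, are assumptions made for this example.

```javascript
// Hypothetical helper: return the part of the input before the first
// quote, whitespace character, or angle bracket.
function cutAtDelimiter(input) {
  var stopChars = ['"', "'", ' ', '\t', '\n', '<', '>'];
  var end = input.length;
  stopChars.forEach(function (ch) {
    var i = input.indexOf(ch);
    if (i !== -1 && i < end) {
      end = i; // remember the earliest stop character seen so far
    }
  });
  return input.slice(0, end);
}

// Hypothetical helper: decodeURI throws a URIError on malformed
// percent-sequences, so fall back to the raw string in that case.
function safeDecode(url) {
  try {
    return decodeURI(url);
  } catch (e) {
    return url;
  }
}

var fullUrl = 'http://image.s5a.com/is/image/saks/0447522591096_647x329.jpg" border="0" params="">';
// Cut first, then decode: decoding first would turn %20 into a real
// space, and the cut would then stop too early.
var baseUrl = safeDecode(cutAtDelimiter(fullUrl));
console.log(baseUrl);
```

Unlike the regex answers, this does not look at the extension at all, so it also works for URLs that end in something other than an image extension.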

You shouldn't try to do things like this by hand. Most languages have libraries for reading HTML, and they are generally more reliable than ad hoc string handling. If this is client-side JavaScript, you could use jQuery: if element is a jQuery object representing the HTML element, element.attr('src') will return the value of its src attribute.
