
Regex to extract src attribute

I am trying to match every src attribute that ends with jpg, png, or gif and extract the src string inside. I am not sure whether the following regex that I came up with is correct, but it does give me src attributes with addresses. My question is about possible problems with this regex and how I can extract only the src string.

/src\s*=\s*(["'][^"']+(jpg|png|gif)\b)/g;

First of all, your regex is trying to do too much. Start by doing something like:

function img_find() {
    var imgs = document.getElementsByTagName("img");
    var imgSrcs = [];

    for (var i = 0; i < imgs.length; i++) {
        imgSrcs.push(imgs[i].src);
    }

    return imgSrcs;
}

Now, your regex has a lot less to deal with. (No whitespace, single vs double quotes, and so on.)

Please read this, and don't (except for very simple situations) try to use regex for parsing raw HTML :)

So, given an array of image sources, you just need to select the jpg / png / gif ones:

/(jpg|png|gif)$/i;
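
As a rough sketch of how such a pattern might be applied to the array: the helper name `filterImageSources` and the sample URLs below are made up for illustration, and a `\.` is added before the extension so that a name merely ending in the letters "png" does not match.

```javascript
// Hypothetical helper: keep only the sources that end in .jpg, .png, or .gif.
// The "i" flag makes the match case-insensitive (so ".PNG" also passes).
function filterImageSources(srcs) {
    var imagePattern = /\.(jpg|png|gif)$/i;
    return srcs.filter(function (src) {
        return imagePattern.test(src);
    });
}

// Made-up sample data standing in for the result of img_find().
var sampleSrcs = [
    "http://example.com/images/cat.jpg",
    "http://example.com/images/logo.PNG",
    "http://example.com/scripts/app.js",
    "http://example.com/images/anim.gif?v=2"
];

var matched = filterImageSources(sampleSrcs);
```

Because `$` anchors at the very end of the string, a source with a query string (the last sample) is not matched; if that matters, the query part would have to be stripped first.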

And then grab their file names, without the extension. (There are many ways of doing this; here's just one thing I've thrown together...)

/(.*)\.[^.]+$/;
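
One possible way to use a pattern like this, assuming the file name is wanted without its directory part. The helper name `baseName` and the sample URL are illustrative, and stripping the path with `split("/")` is an extra step that is not part of the answer above.

```javascript
// Hypothetical helper: return the file name of a source URL without its extension.
function baseName(src) {
    // Take the last path segment, e.g. "photo.small.jpg".
    var fileName = src.split("/").pop();
    // The greedy (.*) captures everything before the final dot.
    var match = /(.*)\.[^.]+$/.exec(fileName);
    // Fall back to the whole name when there is no extension.
    return match ? match[1] : fileName;
}

var stem = baseName("http://example.com/images/photo.small.jpg");
```

The captured text is available as `match[1]`; `exec` returns `null` when the pattern does not match, which is why the helper checks before indexing.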
