
regex extract img src javascript

I am trying to extract the img and src from a long HTML string.

I know there are a lot of questions about how to do this, but I have tried and gotten the wrong result. My question is really just about the contradictory results, though.

I am using:

var url = "<img height=\"100\" src=\"data:image/png;base64,testurlhere\" width=\"200\"></img>";
var regexp = /<img[^>]+src\s*=\s*['"]([^'"]+)['"][^>]*>/g;
var src = url.match(regexp);

But this results in src not being extracted properly. I keep getting src = <img height="100" src="data:image/png;base64,testurlhere" width="200"></img> instead of data:image/png;base64,testurlhere

However, when I try this on the regex tester at regex101, it extracts the src correctly. What am I doing wrong? Is match() the wrong function to use?

If you need to get the whole img tags for some reason:

const imgTags = html.match(/<img [^>]*src="[^"]*"[^>]*>/gm);

Then you can extract the source link for every img tag in the array like this:

const sources = html.match(/<img [^>]*src="[^"]*"[^>]*>/gm)
                          .map(x => x.replace(/.*src="([^"]*)".*/, '$1'));
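
The two passes above can also be done in one pass by putting a capture group around the src value. This is a minimal sketch, assuming an environment that supports String.prototype.matchAll (ES2020). The extractSources helper and the sample html string are made up for illustration and are not part of the original answer.

```javascript
// Hypothetical helper: collect every double-quoted src value from <img> tags.
function extractSources(html) {
  // Same pattern as above, with a capture group around the src value.
  const pattern = /<img [^>]*src="([^"]*)"[^>]*>/g;
  // matchAll() yields one match object per tag; index 1 is the capture group.
  return Array.from(html.matchAll(pattern), (m) => m[1]);
}

// Made-up sample input with two images.
const html = '<p><img src="a.png" alt="A"></p><img height="10" src="b.jpg">';
console.log(extractSources(html)); // [ 'a.png', 'b.jpg' ]
```

Like the original pattern, this only handles double-quoted src attributes.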

I'm not a big fan of using regex to parse HTML content, so here goes the longer way:

var url = "<img height=\"100\" src=\"data:image/png;base64,testurlhere\" width=\"200\"></img>";
// Let the browser parse the markup inside a detached element.
var tmp = document.createElement('div');
tmp.innerHTML = url;
// Read the attribute from the parsed img element.
var src = tmp.querySelector('img').getAttribute('src');
console.log(src);

Try this:

var match = regexp.exec(url);
var src = match[1];
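
This likely explains the difference from regex101. With the g flag, String.prototype.match returns only the whole matches and drops the capture groups, while RegExp.prototype.exec returns one match at a time with its groups. Here is a minimal sketch using the question's url and regexp; the names fullMatches and first are only illustrative.

```javascript
var url = "<img height=\"100\" src=\"data:image/png;base64,testurlhere\" width=\"200\"></img>";
var regexp = /<img[^>]+src\s*=\s*['"]([^'"]+)['"][^>]*>/g;

// With the g flag, match() returns an array of whole matches only.
var fullMatches = url.match(regexp);
console.log(fullMatches[0]); // the whole opening <img ...> tag

// exec() returns the whole match at index 0 and the capture groups after it.
regexp.lastIndex = 0; // a g-flagged regex is stateful; reset before reusing it
var first = regexp.exec(url);
var src = first ? first[1] : null; // guard against "no match"
console.log(src); // data:image/png;base64,testurlhere
```

Dropping the g flag from regexp would also make url.match(regexp) return the capture group at index 1.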

const src = url.slice(url.indexOf("src")).split('"')[1];

Regex gives me headaches. Boohoo.

Find the index of src in the HTML string (the variable named url in the question), slice the string from there, and then split the result on the " characters. The second item in the resulting array is your src link.
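
Broken into steps, the one-liner does the following. This is only a sketch of the intermediate values for the question's url string; the names start, tail, and parts are illustrative.

```javascript
var url = "<img height=\"100\" src=\"data:image/png;base64,testurlhere\" width=\"200\"></img>";

// 1. Position of the first "src" in the string.
var start = url.indexOf("src");

// 2. Everything from "src" onward: src="data:..." width="200"></img>
var tail = url.slice(start);

// 3. Split on double quotes: ['src=', 'data:...', ' width=', '200', '></img>']
var parts = tail.split('"');

// 4. The second item is the attribute value.
var src = parts[1];
console.log(src); // data:image/png;base64,testurlhere
```

This relies on the attribute being double-quoted and on "src" not appearing earlier in the string. If "src" is missing, indexOf returns -1 and the expression does not fail loudly, so check for -1 first on untrusted input.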
