
Regular Expression to extract HREFs

I'm looking for a regular expression that can extract the href from this:

<a href="/tr/blog.php?post=3593&user=930">

There are hundreds of links on the page, so I need to extract only those that contain

/tr/blog.php

So in the end I should be left with a list of links that start with /tr/blog

Thanks for any help. It's really puzzling me.

This is the RegEx I am currently using, but it matches every href on the page.

/href\s*=\s*\"*[^\">]*/ig;

You could try href=\\"(/tr/blog.php[^"]*)\\" (which captures the link into group 1), but in general you shouldn't use regex to parse HTML.
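As a rough illustration of the capture-group approach, here is a minimal sketch written as a JavaScript regex literal. The function name extractBlogHrefs and the sample string are made up for this example, and the pattern assumes double-quoted href values; it also escapes the dot in blog.php so that it only matches a literal dot.

```javascript
// Hypothetical helper (not from the answer): returns the href values
// that start with /tr/blog.php, assuming double-quoted attributes.
function extractBlogHrefs(html) {
  // Group 1 captures the href value; the g flag is required by matchAll
  const pattern = /href\s*=\s*"(\/tr\/blog\.php[^"]*)"/gi;
  return Array.from(html.matchAll(pattern), (m) => m[1]);
}

// Made-up sample markup for demonstration
const sample =
  '<a href="/tr/blog.php?post=3593&user=930">a</a>' +
  '<a href="/tr/other.php">b</a>';

console.log(extractBlogHrefs(sample));
// logs [ '/tr/blog.php?post=3593&user=930' ]
```

This works on a string of markup, so it will also pick up hrefs inside comments or scripts, which is one reason a DOM-based approach is usually safer.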

This is a bit late, but now that it's the future, you don't even need the regular expression:

document.querySelectorAll("a[href*='/tr/blog.php']") will give you the links that contain that string, or you can find those that begin with that string using document.querySelectorAll("[href^='/tr/blog.php']").
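A small sketch of how the selector could be turned into the list of paths the question asks for. The helper name blogLinkPaths and its root parameter are assumptions for illustration; root is expected to be a Document or Element.

```javascript
// Hypothetical helper (not from the answer): `root` is any Document or Element.
function blogLinkPaths(root) {
  // ^= matches attribute values that begin with the given string
  const links = root.querySelectorAll("a[href^='/tr/blog.php']");
  // getAttribute returns the href as written in the markup;
  // the .href property would return the resolved absolute URL instead
  return Array.from(links, (a) => a.getAttribute("href"));
}

// In a browser: blogLinkPaths(document)
```

Using getAttribute keeps the values relative (for example /tr/blog.php?post=3593&user=930), which matches the "starts with /tr/blog" requirement more directly than the absolute URL would.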

<body> <a href="/tr/blog.php?lol">fslk</a> 

<script>

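    // Collect every anchor whose resolved href contains "tr/blog.php"
    var anchors = document.getElementsByTagName('a'), captured = [];
    var r = /tr\/blog\.php/;

    for (var i = 0, l = anchors.length; i < l; ++i) {
        // Read the href from the current anchor; `this` is not the anchor inside a plain for loop
        if (r.test(anchors[i].href)) {
            captured.push(anchors[i]);
        }
    }

    // Do whatever you want with the captured links
    for (var j = captured.length; j--; ) {
        alert(captured[j].href);
    }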

</script>

</body>
