How can I extract all links from a page, except one, using Perl's WWW::Mechanize?

Question

I'm trying to use WWW::Mechanize to extract some links from an HTML page using the find_all_links() method. It supports matching on these criteria:

text
text_regex
url
url_regex
url_abs
url_abs_regex
...

How can I extract all links except the one that has the text "xyz"?

Answer 1

You can use the 'text_regex' criterion:

$mech->find_all_links(text_regex => qr/^(?!xyz$).*$/);

See perldoc perlre for more on negative look-ahead assertions.
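To see what the pattern accepts and rejects, here is a small sketch that applies the same regex to plain strings, with no WWW::Mechanize involved. The sample link texts are made up for illustration.

```perl
use strict;
use warnings;

# Same pattern as above: match any text that is not exactly "xyz".
my $not_xyz = qr/^(?!xyz$).*$/;

# Hypothetical link texts, used only for illustration.
my @texts = ('Home', 'xyz', 'xyz and more', 'About', '');

# Keep only the texts the pattern matches.
my @kept = grep { $_ =~ $not_xyz } @texts;

print join(', ', map { "'$_'" } @kept), "\n";
# prints: 'Home', 'xyz and more', 'About', ''
```

Only the exact text "xyz" is rejected, because the look-ahead is anchored with $. A text such as "xyz and more" still matches.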

Answer 2

Why not get all the links and then use 'grep' to skip the ones you don't want?
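A minimal sketch of that approach follows. The helper name links_except and the FakeLink stand-in class are hypothetical. FakeLink imitates only the text() and url() accessors of WWW::Mechanize::Link so the example runs without a network connection.

```perl
use strict;
use warnings;

# Keep every link whose text is not exactly $unwanted.
# Links without text (text() returns undef, e.g. image links) are kept.
sub links_except {
    my ($unwanted, @links) = @_;
    return grep { !defined($_->text) || $_->text ne $unwanted } @links;
}

# Stand-in for WWW::Mechanize::Link (an assumption for this sketch).
{
    package FakeLink;
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub text { $_[0]{text} }
    sub url  { $_[0]{url} }
}

# Made-up links for illustration.
my @links = (
    FakeLink->new(text => 'Home',  url => '/'),
    FakeLink->new(text => 'xyz',   url => '/xyz'),
    FakeLink->new(text => undef,   url => '/logo.png'),
    FakeLink->new(text => 'About', url => '/about'),
);

my @wanted = links_except('xyz', @links);
print $_->url, "\n" for @wanted;

# With a real agent, after $mech->get(...), this would be for example:
#   my @wanted = links_except('xyz', $mech->find_all_links());
```

Compared with the regex approach, the filter is ordinary Perl, so it is easy to extend to several unwanted texts or to test the URL instead of the text.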


2 answers

Answer 1
6 accepted 2010-03-26 12:31:40

Answer 2
1 2010-03-26 13:50:48
