Perl regular expression in a Perl / Curl script

Question

I'm not sure how this works or what it means:

my ($value) = ($out =~ /currentvalue[^>]*>([^<]+)/);

So basically, that's part of a cURL/Perl script. It goes to www.example.com and finds
<span id="currentvalue"> GETS THIS VALUE </span>
in the page's HTML.

What exactly does the [^>]*>([^<]+)/) part of the script do? Does it specify that it's looking for span id=".."?

Where can I learn more about the [^>]*>([^<]+)/) functions?

Answer 1

/.../ aka m/.../ is the match operator. It checks whether its operand (on the LHS of =~) matches the regular expression within the literal. Operators are documented in perlop (go down to "m/PATTERN/"). Regular expressions are documented in perlre.
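
To make the role of the parentheses around $value concrete, here is a minimal sketch. The contents of $out are a made-up stand-in for the page that the original script fetches with curl, and $matched is a name introduced only for this illustration. In list context the match operator returns the captured groups, while in scalar context it only reports success or failure:

```perl
use strict;
use warnings;

# Hypothetical sample input; in the original script $out holds the fetched page.
my $out = '<span id="currentvalue"> GETS THIS VALUE </span>';

# List context (note the parentheses around $value): the match returns
# the list of captured groups, so $value receives capture group 1.
my ($value) = ($out =~ /currentvalue[^>]*>([^<]+)/);

# Scalar context: the same match only yields a true/false result.
my $matched = ($out =~ /currentvalue[^>]*>([^<]+)/);

print "value:   '$value'\n";
print "matched: ", ($matched ? "yes" : "no"), "\n";
```

With this sample input the captured value keeps the spaces on either side of the text, because [^<]+ accepts any character other than '<'.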

As for the regular expression used here,

$ perl -MYAPE::Regex::Explain \
   -e'print YAPE::Regex::Explain->new($ARGV[0])->explain' \
      'currentvalue[^>]*>([^<]+)'
The regular expression:

(?-imsx:currentvalue[^>]*>([^<]+))

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  currentvalue             'currentvalue'
----------------------------------------------------------------------
  [^>]*                    any character except: '>' (0 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  >                        '>'
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [^<]+                    any character except: '<' (1 or more
                             times (matching the most amount
                             possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

Answer 2

This is a plain vanilla Perl regexp. See this tutorial

  /              # Start of regexp
  currentvalue   # Matches the string 'currentvalue'
  [^>]*          # Matches 0 or more characters that are not '>'
  >              # Matches '>'
  (              # Starts a capture group; its match is stored in the Perl built-in variable $1
  [^<]+          # Matches 1 or more characters that are not '<'
  )              # End of group $1
  /              # End of regexp
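
As for whether the pattern looks specifically for span id="..": it does not. It only needs the literal text 'currentvalue' followed later by a '>', and it then captures everything up to the next '<'. The sketch below illustrates this with made-up HTML snippets. extract_value is a hypothetical helper, not part of the original script, and the whitespace trimming is an optional extra step:

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): returns the captured
# text with surrounding whitespace trimmed, or undef when nothing matches.
sub extract_value {
    my ($html) = @_;
    my ($value) = $html =~ /currentvalue[^>]*>([^<]+)/;
    return undef unless defined $value;
    $value =~ s/^\s+|\s+$//g;    # strip leading and trailing whitespace
    return $value;
}

# Made-up sample inputs for illustration only.
my @samples = (
    '<span id="currentvalue"> GETS THIS VALUE </span>',
    '<div class="currentvalue big">42</div>',    # not a span, still matches
    '<span id="other">ignored</span>',           # no 'currentvalue', no match
);

for my $html (@samples) {
    my $value = extract_value($html);
    print defined($value) ? "'$value'\n" : "no match\n";
}
```

The second sample shows that any tag containing 'currentvalue' before its closing '>' is accepted. For anything beyond a quick scrape, an HTML parser is usually a more robust choice than a regular expression.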
