
Perl regular expression in Perl/Curl script

I'm not all that sure how this works/what it means...

my ($value) = ($out =~ /currentvalue[^>]*>([^<]+)/);

So basically, that's part of a cURL/Perl script. It goes onto www.example.com and finds
<span id="currentvalue"> GETS THIS VALUE </span>
in the page's HTML.

What exactly does the [^>]*>([^<]+)/) part of the script do? Does it specify that it's looking for span id=".." ?

Where can I learn more about the [^>]*>([^<]+)/) functions?

/.../ aka m/.../ is the match operator. It checks whether its operand (on the LHS of =~) matches the regular expression within the literal. Operators are documented in perlop. (Go down to "m/PATTERN/".) Regular expressions are documented in perlre.
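To make the role of the parentheses around $value concrete, here is a minimal sketch. The contents of $out are a hypothetical stand-in for whatever the script actually fetched. In scalar context the match just reports success or failure; in list context it returns the captured groups, which is how $value gets filled in.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the fetched page.
my $out = '<span id="currentvalue"> GETS THIS VALUE </span>';

# Scalar (boolean) context: the match only reports success or failure.
my $matched = ($out =~ /currentvalue[^>]*>([^<]+)/) ? 1 : 0;

# List context: the match returns the list of captured groups,
# so the parentheses around $value are what make this work.
my ($value) = ($out =~ /currentvalue[^>]*>([^<]+)/);

print "matched: $matched\n";
print "value: [$value]\n";
```

Note that the captured text keeps the spaces on either side, since [^<]+ takes everything up to the next '<'.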

As for the regular expression used here,

$ perl -MYAPE::Regex::Explain \
   -e'print YAPE::Regex::Explain->new($ARGV[0])->explain' \
   'currentvalue[^>]*>([^<]+)'
The regular expression:

(?-imsx:currentvalue[^>]*>([^<]+))

matches as follows:

NODE                     EXPLANATION
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
  currentvalue             'currentvalue'
  [^>]*                    any character except: '>' (0 or more times
                           (matching the most amount possible))
  >                        '>'
  (                        group and capture to \1:
    [^<]+                    any character except: '<' (1 or more
                             times (matching the most amount
                             possible))
  )                        end of \1
)                        end of grouping
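As the breakdown shows, nothing in the pattern mentions span or id; it only looks for the text 'currentvalue' somewhere before a '>'. A small sketch with made-up inputs (none of these come from a real page) illustrates what does and does not match:

```perl
use strict;
use warnings;

my $re = qr/currentvalue[^>]*>([^<]+)/;

# Hypothetical inputs for illustration only.
my @samples = (
    '<span id="currentvalue">42</span>',
    '<div class="currentvalue big">7.5</div>',  # different tag and attribute
    '<span id="currentvalue"></span>',          # empty element: [^<]+ needs at least one character
);

my @results;
for my $html (@samples) {
    # A failed match returns an empty list, leaving $value undefined.
    my ($value) = $html =~ $re;
    push @results, defined $value ? $value : 'no match';
}

print "$_\n" for @results;
```

The first two inputs both yield a value, and the third does not match at all, so it is worth checking that $value is defined before using it.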

This is a plain vanilla Perl regexp. See this tutorial

  /              # Start of regexp
  currentvalue   # Matches the string 'currentvalue'
  [^>]*          # Matches 0 or more characters that are not '>'
  >              # Matches '>'
  (              # Captures the match enclosed in () to the Perl built-in variable $1
  [^<]+          # Matches 1 or more characters that are not '<'
  )              # End of group $1
  /              # End of regexp
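The commented layout above can also be written as real code by adding the /x modifier, which makes Perl ignore whitespace and # comments inside the pattern. This is only a sketch: $out holds a hypothetical sample string, and the whitespace trimming at the end is an optional extra that the original script does not do.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the fetched page.
my $out = '<span id="currentvalue"> GETS THIS VALUE </span>';

my ($value) = $out =~ /
    currentvalue   # the literal string 'currentvalue'
    [^>]*          # zero or more characters that are not '>'
    >              # the '>' that closes the opening tag
    (              # start of capture group 1
      [^<]+        # one or more characters that are not '<'
    )              # end of capture group 1
/x;

# Optional: strip leading and trailing whitespace from the capture.
$value =~ s/^\s+|\s+$//g if defined $value;

print "$value\n";
```

One caveat with /x: the comments must not contain the pattern delimiter (here a forward slash), because Perl would treat it as the end of the regexp.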

