I have a little Perl script that sends an HTTP POST request:
my $request = $ua->post( $url, [ 'country' => 10, 'evalprice' => 0 ] );
my $response = $request->content;
Now I know that in the response there will be this part, which appears only once
: <b>9570 USD
I want to extract only the number 9570 (or whatever it turns out to be), but I don't know how to search for
: <b>
and then just take the part after that and before
USD
I guess regular expressions will help, but I can't figure out how to use them here.
You were on the right track with the regular expression. You only need one expression, and since your string is straightforward, you don't even need a very complicated one.
# $content holds the response body ($response in the question)
$content =~ m/: <b>([.\d]+) USD/;
my $price = $1;
The m// is the matching operator. Together with =~ it tells Perl to match the regular expression against your variable $content. We have a capture group ( () ) that contains the price, and its contents will go into $1. The [.\d] is a character class: the dot is just a literal dot (your price might have cents), and \d means any digit (0-9). The + says there may be many of these characters, but at least one.
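One caveat: $1 is only meaningful when the match succeeded. A minimal sketch of a safer variant follows. It relies on the fact that a match in list context returns its captures. The helper name extract_price is hypothetical and used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the price found in $content,
# or undef if the pattern does not match.
sub extract_price {
    my ($content) = @_;
    # In list context a successful match returns the captured groups;
    # a failed match returns an empty list, so $price stays undef.
    my ($price) = $content =~ m/: <b>([.\d]+) USD/;
    return $price;
}

my $price = extract_price(': <b>9570 USD');
print defined $price ? "price = $price\n" : "no price found\n";
```

Because the capture is assigned directly from the match, a stale $1 left over from an earlier successful match cannot leak into $price.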
Use code like this (removing HTML entities is nice, but optional):
use HTML::Entities;
my $content = ": <b>9570 USD";
my $decoded = decode_entities($content); # decode entities (e.g. &nbsp; becomes a non-breaking space)
my ($price) = ($decoded =~ /<b>(\d+)\s*USD/);
print "price = $price\n";
The safest way to parse HTML is with the help of a proper CPAN module. But a simple alternative (if the response is simple) may be this:
use strict;
use warnings;
my $str = ": <b>9570 USD";
if( $str =~ m/: <b>(\d+) / ) {
    print $1, "\n";
}
I have used a regular expression; when a match is found, the number is in $1.
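To connect this to the POST request from the question, here is a rough sketch that checks the HTTP status before matching. The helper name price_from_response is hypothetical. It assumes the object returned by $ua->post is an HTTP::Response, which provides is_success and decoded_content.

```perl
use strict;
use warnings;

# Hypothetical helper: takes an HTTP::Response-like object (as returned
# by $ua->post) and returns the captured number, or undef on any failure.
sub price_from_response {
    my ($res) = @_;
    return undef unless $res->is_success;    # request failed at the HTTP level
    my $body = $res->decoded_content;        # body with charset decoding applied
    return undef unless defined $body;       # decoding can fail and yield undef
    return $body =~ m/: <b>(\d+) / ? $1 : undef;
}

# Usage, assuming $ua and $url from the question:
#   my $price = price_from_response(
#       $ua->post( $url, [ country => 10, evalprice => 0 ] ) );
#   print defined $price ? "$price\n" : "no price found\n";
```

Returning undef in every failure case lets the caller tell "no price" apart from a price of 0 with a single defined check.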