
Finding a particular value in an HTTP response using Perl

I have a short Perl script that sends an HTTP POST request:

my $request =  $ua->post( $url, [ 'country' => 10, 'evalprice' => 0 ] );
my $response = $request->content;

Now I know that in the response there will be this part, which appears only once

:&nbsp;<b>9570&nbsp;USD

I want to extract only the number 9570 (or whatever it turns out to be). I don't know how to search for

:&nbsp;<b>

and then just take the part after that and before

&nbsp;USD

I guess regular expressions will help, but I can't figure out how to use them here.

You were on the right track with the regular expression. You only need one expression, and since your string is straightforward, you don't even need a very complicated one.

# $content holds the response body (called $response in the question)
$content =~ m/:&nbsp;<b>([.\d]+)&nbsp;USD/;
my $price = $1;

The m// is the match operator. Together with =~ it tells Perl to match the regular expression against your variable $content . We have a capture group ( () ) that contains the price, and its contents will go into $1 . The [.\d] is a character class. The dot is just a literal dot (your price might have cents), and the \d matches any digit ( 0 - 9 ). The + after the class says there may be many of these characters, but at least one.
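One detail worth adding: $1 is only set by a successful match, and after a failed match it keeps whatever an earlier match left there, so it is safer to test the match before reading it. Here is a minimal sketch of the same pattern with that check, written with the /x flag so each part can carry a comment. The sample string (with a decimal price) is made up for illustration and stands in for the real response body.

```perl
use strict;
use warnings;

# Made-up sample input, standing in for the body of the HTTP response.
my $content = 'Price:&nbsp;<b>9570.50&nbsp;USD</b>';

my $price;
if ( $content =~ m/ :&nbsp;<b>   # literal text before the number
                    ( [.\d]+ )   # capture group with one or more digits or dots
                    &nbsp;USD    # literal text after the number
                  /x ) {
    $price = $1;                 # safe to read here because the match succeeded
}

print defined $price ? "price = $price\n" : "no price found\n";
```

With the sample above this prints "price = 9570.50". If the marker text is missing, $price stays undefined instead of silently holding a stale value.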

Use code like this (decoding HTML entities is nice, but optional):

use HTML::Entities;

my $content = ":&nbsp;<b>9570&nbsp;USD";
my $decoded = decode_entities($content); # &nbsp; becomes a non-breaking space (U+00A0)
my ($price) = ($decoded =~ /<b>(\d+)[\s\xA0]*USD/); # \s alone may not match U+00A0
print "price = $price\n";

The safest way to parse HTML is with the help of a proper CPAN module. If the response is simple, though, a plain regular expression is a workable alternative:

use strict;
use warnings;

my $str = ":&nbsp;<b>9570&nbsp;USD";

if( $str =~ m/:&nbsp;<b>(\d+)&nbsp;/ ) {
   print $1, "\n";
}

This uses a regular expression; when a match is found, the number is in $1.
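To make the same idea reusable, the match can be wrapped in a small sub. This is only a sketch: extract_price is a hypothetical helper name, the sample strings are invented, and the pattern assumes the separator around the number is &nbsp;, a raw non-breaking space, or ordinary whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the captured number, or undef when the
# marker text is not present in the given HTML string.
sub extract_price {
    my ($html) = @_;

    # In list context a successful match returns the captured groups,
    # so $price stays undef when nothing matches.
    my ($price) = $html =~ m/:(?:&nbsp;|\xA0|\s)*<b>(\d+(?:\.\d+)?)(?:&nbsp;|\xA0|\s)*USD/;
    return $price;
}

# Invented sample inputs: entity-encoded, plain spaces, and no price at all.
for my $sample ( ':&nbsp;<b>9570&nbsp;USD', ': <b>12.75 USD', 'no price here' ) {
    my $price = extract_price($sample);
    print defined $price ? "found $price\n" : "no match\n";
}
```

In the original script this would be called on the response body, for example extract_price($response), and the caller would check the result with defined before using it.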
