
Extract a substring from a string in Perl

I have a string like below:

downCircuit received;TOKENS START;{"action":'"UPDATE","device_id":"CP0027829","link_index":"101","name":"uplink101","description":"link1-0/0/3","priority":"200","status":"DOWN","wan_status":"DOWN","vlan":"4094","vlan_description":"vlan4094-intf","topic":"uplinks","stream_timestamp":"1547015547","aws_host":"attwifi-poc-central.arubathena.com","aws_timestamp":"1547015547","customer_id":"6666778917"};TOKENS END

I want to extract the value of link_index from it, i.e. the output should be 101 in this case. Can somebody please help me extract 101 from my string?


What you have there is some JSON with extra cruft before and after it. So rather than struggling with regexes, the best idea would be to extract the actual JSON and then use a JSON parser to deal with it. Something like this:

#!/usr/bin/perl

use strict;
use warnings;
use feature 'say';

use JSON;

my $input = 'downCircuit received;TOKENS START;{"action":"UPDATE","device_id":"CP0027829","link_index":"101","name":"uplink101","description":"link1-0/0/3","priority":"200","status":"DOWN","wan_status":"DOWN","vlan":"4094","vlan_description":"vlan4094-intf","topic":"uplinks","stream_timestamp":"1547015547","aws_host":"attwifi-poc-central.arubathena.com","aws_timestamp":"1547015547","customer_id":"6666778917"};TOKENS END';

$input =~ s/.*START;//;
$input =~ s/;TOKENS END//;

my $data = JSON->new->decode($input);

say $data->{link_index};

As expected, this produces the output 101.

Note: I think there's a typo in your question. At least, there's a syntax error in the JSON. I removed a single, unmatched quote character that you have before "UPDATE".
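If the real input can contain that stray quote, decoding it will throw an exception. Here is a minimal sketch of one way to guard against that. It assumes the core JSON::PP module (the JSON module accepts the same decode call), and link_index_from is a hypothetical helper name, not something from the code above. The sample strings are shortened for readability.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP;    # core module since Perl 5.14

# Hypothetical helper: returns the link_index value, or undef when the
# markers are missing or the payload between them is not valid JSON.
sub link_index_from {
    my ($line) = @_;

    # Capture everything between the two markers instead of deleting around it.
    my ($json) = $line =~ /TOKENS START;(.*);TOKENS END/s
        or return;

    # decode() dies on malformed JSON, so trap the exception with eval.
    my $data = eval { JSON::PP->new->decode($json) };
    return unless ref($data) eq 'HASH';

    return $data->{link_index};
}

# Shortened sample inputs: one with valid JSON, one with the stray quote.
my $good = 'downCircuit received;TOKENS START;{"action":"UPDATE","link_index":"101"};TOKENS END';
my $bad  = 'downCircuit received;TOKENS START;{"action":\'"UPDATE","link_index":"101"};TOKENS END';

print defined(link_index_from($good)) ? "parsed\n" : "not valid JSON\n";
print defined(link_index_from($bad))  ? "parsed\n" : "not valid JSON\n";
```

With the corrected string the helper returns 101. With the stray quote still in place it returns undef instead of aborting the script, so the caller can decide whether to fall back to another approach.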

You can use a simple regex like this:

"link_index":"(\d+)"

Then grab the value from capture group 1.

Working demo:

my $str = 'downCircuit received;TOKENS START;{"action":\'"UPDATE","device_id":"CP0027829","link_index":"101","name":"uplink101","description":"link1-0/0/3","priority":"200","status":"DOWN","wan_status":"DOWN","vlan":"4094","vlan_description":"vlan4094-intf","topic":"uplinks","stream_timestamp":"1547015547","aws_host":"attwifi-poc-central.arubathena.com","aws_timestamp":"1547015547","customer_id":"6666778917"};TOKENS END';
my $regex = qr/"link_index":"(\d+)"/mp;

if ( $str =~ /$regex/g ) {
  print "Whole match is ${^MATCH} and its start/end positions can be obtained via \$-[0] and \$+[0]\n";
  print "Capture Group 1 is $1 and its start/end positions can be obtained via \$-[1] and \$+[1]\n";
  # print "Capture Group 2 is $2 ... and so on\n";
}
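If only the captured value is needed, the match can also be evaluated in list context, which returns the capture groups directly. A minimal sketch, where the shortened sample string and the variable name $link_index are illustrative assumptions:

```perl
use strict;
use warnings;

# Shortened version of the string from the question, stray quote included.
my $str = 'TOKENS START;{"action":\'"UPDATE","link_index":"101","name":"uplink101"};TOKENS END';

# In list context a successful match returns its capture groups,
# so the value lands in $link_index without going through $1.
my ($link_index) = $str =~ /"link_index":"(\d+)"/;

# On a failed match the returned list is empty and $link_index stays undef.
print defined($link_index) ? "link_index is $link_index\n" : "no link_index found\n";
```

Since the regex never parses the surrounding JSON, it is unaffected by the stray quote before "UPDATE". The trade-off is that it depends on the exact "key":"value" spacing shown in the input.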

You could use a capture group:

print $1,"\n" if /"link_index":"(\d+)"/

In full context:

$string=q(downCircuit received;TOKENS START;{"action":'"UPDATE","device_id":"CP0027829","link_index":"101","name":"uplink101","description":"link1-0/0/3","priority":"200","status":"DOWN","wan_status":"DOWN","vlan":"4094","vlan_description":"vlan4094-intf","topic":"uplinks","stream_timestamp":"1547015547","aws_host":"attwifi-poc-central.arubathena.com","aws_timestamp":"1547015547","customer_id":"6666778917"};TOKENS END);
print $1,"\n" if $string =~ /"link_index":"(\d+)"/;
