Extract substring from string in Perl

I have a string like the one below:

downCircuit received;TOKENS START;{"action":'"UPDATE","device_id":"CP0027829","link_index":"101","name":"uplink101","description":"link1-0/0/3","priority":"200","status":"DOWN","wan_status":"DOWN","vlan":"4094","vlan_description":"vlan4094-intf","topic":"uplinks","stream_timestamp":"1547015547","aws_host":"attwifi-poc-central.arubathena.com","aws_timestamp":"1547015547","customer_id":"6666778917"};TOKENS END

I want to extract the value of link_index from it, i.e. the output should be 101 in this case. Can somebody please help me extract 101 from my string?

"I have a string like below"

What you have there is some JSON with extra cruft before and after it. So rather than struggling with regexes, the best idea is to extract the actual JSON and then use a JSON parser to deal with it. Something like this:

#!/usr/bin/perl

use strict;
use warnings;
use feature 'say';

use JSON;

my $input = 'downCircuit received;TOKENS START;{"action":"UPDATE","device_id":"CP0027829","link_index":"101","name":"uplink101","description":"link1-0/0/3","priority":"200","status":"DOWN","wan_status":"DOWN","vlan":"4094","vlan_description":"vlan4094-intf","topic":"uplinks","stream_timestamp":"1547015547","aws_host":"attwifi-poc-central.arubathena.com","aws_timestamp":"1547015547","customer_id":"6666778917"};TOKENS END';

$input =~ s/.*START;//;
$input =~ s/;TOKENS END//;

my $data = JSON->new->decode($input);

say $data->{link_index};

As expected, this produces the output 101.

Note: I think there's a typo in your question. At least, there's a syntax error in the JSON. I removed a single, unmatched quote character that you have before "UPDATE".
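The two substitutions above can also be folded into a single capture, with the decode step guarded so that malformed input (such as the stray quote mentioned in the note) does not kill the script. The following is only a sketch under a few assumptions: the helper name field_from_line is hypothetical, the markers are assumed to be exactly "TOKENS START;" and ";TOKENS END", and JSON::PP (a core module since Perl 5.14) stands in for JSON so that nothing needs installing.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP;    # core module; offers the same new/decode interface used above

# Hypothetical helper: returns the value stored under $key in the JSON payload
# found between "TOKENS START;" and ";TOKENS END", or undef if anything fails.
sub field_from_line {
    my ($line, $key) = @_;

    # Capture only the payload between the two markers.
    my ($json) = $line =~ /TOKENS START;(.*);TOKENS END/s
        or return;

    # decode() dies on malformed JSON, so trap the exception with eval.
    my $data = eval { JSON::PP->new->decode($json) }
        or return;

    # Only a JSON object (a hash reference) can be looked up by key.
    return unless ref $data eq 'HASH';

    return $data->{$key};
}

# Shortened sample line, assumed for illustration only.
my $line = 'downCircuit received;TOKENS START;{"link_index":"101","status":"DOWN"};TOKENS END';
my $value = field_from_line($line, 'link_index');
print defined $value ? "$value\n" : "no value found\n";
```

Returning undef rather than dying lets the caller decide how to treat lines that lack the markers or carry broken JSON, which may be convenient when processing a whole log file.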

You can use a simple regex like this:

"link_index":"(\d+)"

And then grab the content from capturing group 1.

Working demo:

my $str = 'downCircuit received;TOKENS START;{"action":\'"UPDATE","device_id":"CP0027829","link_index":"101","name":"uplink101","description":"link1-0/0/3","priority":"200","status":"DOWN","wan_status":"DOWN","vlan":"4094","vlan_description":"vlan4094-intf","topic":"uplinks","stream_timestamp":"1547015547","aws_host":"attwifi-poc-central.arubathena.com","aws_timestamp":"1547015547","customer_id":"6666778917"};TOKENS END';
# /p makes ${^MATCH} available; /m and /g are not needed for a single unanchored match
my $regex = qr/"link_index":"(\d+)"/p;

if ( $str =~ $regex ) {
  print "Whole match is ${^MATCH} and its start/end positions can be obtained via \$-[0] and \$+[0]\n";
  print "Capture Group 1 is $1 and its start/end positions can be obtained via \$-[1] and \$+[1]\n";
}

You could use a capture group, whose content is available afterwards as $1:

print $1,"\n" if /"link_index":"(\d+)"/

In full context:

$string=q(downCircuit received;TOKENS START;{"action":'"UPDATE","device_id":"CP0027829","link_index":"101","name":"uplink101","description":"link1-0/0/3","priority":"200","status":"DOWN","wan_status":"DOWN","vlan":"4094","vlan_description":"vlan4094-intf","topic":"uplinks","stream_timestamp":"1547015547","aws_host":"attwifi-poc-central.arubathena.com","aws_timestamp":"1547015547","customer_id":"6666778917"};TOKENS END);
print $1,"\n" if $string =~ /"link_index":"(\d+)"/;
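The same capture can be taken in list context, which avoids relying on $1 after the match and makes it easy to look up other keys. This is a rough sketch only: the helper name quoted_value and the shortened sample string are assumptions for illustration, and the pattern assumes values are double-quoted with no escaped quotes and no whitespace around the colon. For anything less regular, a JSON parser is the safer route.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the double-quoted value for $key, or undef if absent.
sub quoted_value {
    my ($text, $key) = @_;

    # \Q...\E escapes any regex metacharacters in the key name.
    # In list context the match returns its captures, so $value is undef on failure.
    my ($value) = $text =~ /"\Q$key\E":"([^"]*)"/;
    return $value;
}

# Shortened sample string, assumed for illustration only.
my $string = q(downCircuit received;TOKENS START;{"link_index":"101","status":"DOWN"};TOKENS END);

for my $key (qw(link_index status missing_key)) {
    my $value = quoted_value($string, $key);
    print "$key = ", (defined $value ? $value : '(not found)'), "\n";
}
```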
