perl split multiple commas in line

I am trying to split this line into its key=value pairs.

My input:

user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"

I am using this code block:

while ( my $line = <IN> ) {
    chomp $line;
    print "$line\n";
    my @values = split( /\s+/, $line );

    foreach $data (@values) {
        chomp $data;
        ( $key, $value ) = split( /=/, $data );
        $key =~ s/\s+//g;
        $key =~ s/"//g;
    }
}

I am receiving the output below: the split also breaks on the spaces inside the quoted values. How can I split the keys and values correctly from the above input?

_1;
Linux
x86_64;
rv:23.0)
Gecko/20100101es,OU
(X1

Thanks in advance.

Assuming that " never appears as a valid character inside a value:

my %hash;
while (my $line = <IN>)
{
  $hash{$1} = ($2 // $3) while $line =~ /(\w+)=(?: "(.+?)" | (\S+) )/xg;
}
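The loop above reads from a filehandle IN that the snippet does not open. As a self-contained sketch of the same idea (the `parse_line` helper and the inline sample line are only for illustration, and the `//` operator needs Perl 5.10 or later):

```perl
use strict;
use warnings;
use 5.010;    # the defined-or operator // needs Perl 5.10+

# Hypothetical helper: returns a hash of key => value for one log line.
sub parse_line {
    my ($line) = @_;
    my %hash;
    # With /x the spaces in the pattern are ignored. A quoted value lands
    # in $2, an unquoted one in $3, so take whichever is defined.
    $hash{$1} = ( $2 // $3 )
        while $line =~ /(\w+)=(?: "(.+?)" | (\S+) )/xg;
    return %hash;
}

my $line = 'user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"';
my %hash = parse_line($line);
say "$_ => $hash{$_}" for sort keys %hash;
```

Sorting the keys is optional; it only makes the printed order stable, since Perl hashes have no defined order.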

This solution uses the branch-reset group (?|...), introduced in Perl 5.10, so both alternatives capture into $2. If you don't want to save into a hash, you can replace the statement on the while line with your own code: inside the while, the key is in $1 and the value is in $2.

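#!/usr/bin/env perl

use warnings;
use strict;
use 5.010;    # branch reset (?|...) needs Perl 5.10 or later

while (<DATA>){
  chomp;
  my %header;
  # key in $1; value in $2, quoted or not, thanks to branch reset
  $header{$1} = $2 while (/\G\s*(\S+)=(?|"([^"]*)"|(\S*))/g); #extend here
  # left-justify the key in a 10-character column
  printf "%-10s => %s\n", $_, $header{$_} for keys %header;
}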


__DATA__
user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"

This prints (hash key order may vary between runs):

message    => Authentication success
user_agent => Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0
request_id => bbfd6a1f-90c4-45g52-9e7c-db5
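To act on each pair as it is found instead of storing it, the statement modifier can be turned into a block. A minimal sketch, where `each_pair` and its callback argument are hypothetical stand-ins for whatever you want to do with each key and value:

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical helper: calls $handle_pair->($key, $value) for every pair.
sub each_pair {
    my ( $line, $handle_pair ) = @_;
    while ( $line =~ /\G\s*(\S+)=(?|"([^"]*)"|(\S*))/g ) {
        # Copy the captures into lexicals right away: $1 and $2 are
        # overwritten by the next successful match.
        my ( $key, $value ) = ( $1, $2 );
        $handle_pair->( $key, $value );
    }
    return;
}

each_pair(
    'request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"',
    sub { my ( $k, $v ) = @_; say "$k is [$v]" },
);
```

Pairs are handled in the order they appear on the line, which a hash would not preserve.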

If the quoting gets more complex, you should look at Text::Balanced with its extract_quotelike routine.
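As a rough sketch of what that could look like for values containing backslash-escaped quotes: this version uses `extract_delimited` from the same module rather than `extract_quotelike`, since only double-quoted values are involved. The `parse_escaped` helper and the sample line are assumptions made for illustration.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_delimited);

# Hypothetical helper: like the regex version, but quoted values may
# contain backslash-escaped quotes such as \" .
sub parse_escaped {
    my ($text) = @_;
    my %pairs;
    # Consume the text from the front, one key= at a time.
    while ( $text =~ s/^\s*(\w+)=// ) {
        my $key = $1;
        if ( $text =~ /^"/ ) {
            # List context: (extracted incl. delimiters, remainder, prefix)
            my ( $quoted, $remainder ) = extract_delimited( $text, '"' );
            # Unterminated quote: stop parsing this line.
            last unless defined $quoted && length $quoted;
            $text = $remainder;
            $quoted =~ s/^"|"$//g;      # drop the surrounding quotes
            $quoted =~ s/\\(.)/$1/g;    # unescape \" and \\
            $pairs{$key} = $quoted;
        }
        else {
            $text =~ s/^(\S*)//;        # unquoted value: run of non-spaces
            $pairs{$key} = $1;
        }
    }
    return %pairs;
}

my %pairs = parse_escaped('message="He said \"hi\"" request_id=42');
print "$_ => $pairs{$_}\n" for sort keys %pairs;
```

For input without escaped quotes, the plain regex above stays the simpler choice.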

You can use alternative capture group numbering (see the "Alternative capture group numbering" section of perlretut) to capture values as either quote-enclosed strings or runs of non-space characters.

Then, because the capture groups are arranged in key/value pairs, it's possible to directly initialize your hash like so:

use strict;
use warnings;
use Data::Dump;    # exports dd

while (<DATA>) {
    chomp;
    # list-context /g match returns key, value, key, value, ...
    my %hash = /\G([^=]+)=(?|"([^"]*)"|(\S*))\s*/g;

    dd \%hash;
}

__DATA__
user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0" request_id=bbfd6a1f-90c4-45g52-9e7c-db5 message="Authentication success"

Outputs:

{
  message    => "Authentication success",
  request_id => "bbfd6a1f-90c4-45g52-9e7c-db5",
  user_agent => "Mozilla/5.0 (X11; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0",
}
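To see why the list-context match can feed a hash directly, it helps to look at the flat list it returns. In this sketch (the shortened sample line is made up for illustration), the branch reset `(?|...)` keeps both alternatives in capture group 2, so every match contributes exactly one key and one value:

```perl
use strict;
use warnings;

my $line = 'request_id=42 message="Authentication success"';

# In list context, a /g match returns all captures of all matches.
my @flat = $line =~ /\G([^=]+)=(?|"([^"]*)"|(\S*))\s*/g;
print scalar(@flat), " elements: ", join( ' | ', @flat ), "\n";
# prints "4 elements: request_id | 42 | message | Authentication success"

# Without branch reset there are three groups per match, one of them
# undef, so the list no longer alternates key, value.
my @broken = $line =~ /\G([^=]+)=(?:"([^"]*)"|(\S*))\s*/g;
print scalar(@broken), " elements\n";
# prints "6 elements"
```

Assigning `@broken` to a hash would pair keys with the wrong values, which is why the branch reset matters here.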
