
How can I parse the output of lshw using Perl regexes?

I am trying to parse lshw output into a hash with the code below, which works so far.

use strict;
use warnings;

my %lshw;
while (<>) {
    s/#.*//;                # strip comments
    s/^\s+//;               # strip leading whitespace
    s/\s+$//;               # strip trailing whitespace
    next unless length;     # anything left?
    if (/(?<key>.*?):\s+(?<value>.*)/) {
        $lshw{$+{key}} = $+{value};
    }
}

# remove whitespace from hash keys
for my $key (keys %lshw) {
    my $value = delete $lshw{$key};
    (my $clean = $key) =~ s/\s+//g;
    $lshw{$clean} = $value;
}

my $logname = $lshw{'logicalname'};
print "Logical name\t $logname\n";

But I am struggling when I hit a configuration line like the last one below:

clock: 33Mhz 
width: 32 bits 
capacity: 1Gbit/s 
configuration: autonegotiation=on broadcast=yes driver=igb driverversion=5.3.0-k duplex=full firmware=1.63, 0x800009fa ip=[REMOVED] latency=0 link=yes multicast=yes port=twisted pair speed=1Gbit/s

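I have been looking for an approach, but I have not found a way to split the key/value pairs, because some values contain several words, for example port=twisted pair. The key is always a single word.

Can anyone give me a hint on how to solve this?

(Thanks to simbabque for the strict/warnings hint.)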

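What you need is to capture every character after an equals sign, as long as that character is not followed by the pattern somekeyname=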

#!/usr/bin/env perl

use strict;
use warnings;

my $s = q{configuration: autonegotiation=on broadcast=yes driver=igb driverversion=5.3.0-k duplex=full firmware=1.63, 0x800009fa ip=[REMOVED] latency=0 link=yes multicast=yes port=twisted pair speed=1Gbit/s};

my ($key, $rest) = split /:\s*/, $s, 2;

my %params = ($rest =~ / (\w+) = ((?:. (?! \w+ = ))+) /gx);

use YAML::XS;
print Dump \%params;

Output:

---
autonegotiation: on
broadcast: yes
driver: igb
driverversion: 5.3.0-k
duplex: full
firmware: 1.63, 0x800009fa
ip: '[REMOVED]'
latency: '0'
link: yes
multicast: yes
port: twisted pair
speed: 1Gbit/s

In addition, your initial loop can be improved:

while (<>) {
    next if /^#/; # skip comments
    /\S/ or next; # skip blank lines
    s/^\s+//;
    s/\s+\z//;
    # ...
}
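
Putting the two pieces together, a minimal sketch might look like the following. It assumes that only the configuration line carries nested key=value pairs, and it reads from an in-memory sample rather than real lshw output. The sample values (such as eth0) are made up for illustration, and storing the pairs as a nested hash is one possible design, not something the answer prescribes.

```perl
use strict;
use warnings;

# Sample input (assumption: a trimmed, made-up excerpt modelled on the question)
my $sample = <<'END';
# a comment line
   logical name: eth0
   width: 32 bits

   configuration: driver=igb firmware=1.63, 0x800009fa port=twisted pair speed=1Gbit/s
END

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";

my %lshw;
while (my $line = <$fh>) {
    next if $line =~ /^#/;          # skip comments
    next unless $line =~ /\S/;      # skip blank lines
    $line =~ s/^\s+//;
    $line =~ s/\s+\z//;

    my ($key, $value) = split /:\s*/, $line, 2;
    next unless defined $value;     # not a "key: value" line
    $key =~ s/\s+//g;               # "logical name" becomes "logicalname"

    if ($key eq 'configuration') {
        # Nested pairs: a value ends just before the next "word=" token
        my %config = $value =~ / (\w+) = ((?:. (?! \w+ = ))+) /gx;
        $value = \%config;
    }
    $lshw{$key} = $value;
}
close $fh;

print "Logical name\t$lshw{logicalname}\n";
print "Port\t$lshw{configuration}{port}\n";
```

With this layout, top-level fields stay plain strings and only configuration becomes a hash reference, so callers need to know which keys are nested.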

You just need to split the configuration string like this:

use strict;
use warnings 'all';
use feature 'say';

my $s = 'configuration: autonegotiation=on broadcast=yes driver=igb driverversion=5.3.0-k duplex=full firmware=1.63, 0x800009fa ip=[REMOVED] latency=0 link=yes multicast=yes port=twisted pair speed=1Gbit/s';

say for split /\s+(?=[^\s=]+=)/, $s;

Output:

configuration:
autonegotiation=on
broadcast=yes
driver=igb
driverversion=5.3.0-k
duplex=full
firmware=1.63, 0x800009fa
ip=[REMOVED]
latency=0
link=yes
multicast=yes
port=twisted pair
speed=1Gbit/s

Now you have a list of keys and their values that is correctly divided at each key name. That should be simple to process.
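
One way to process that list, sketched here under the assumption that the leading configuration: label is split off first: break each item at its first = to build a hash. The variable names are illustrative and the input is a shortened version of the string above.

```perl
use strict;
use warnings;

my $s = 'configuration: autonegotiation=on firmware=1.63, 0x800009fa port=twisted pair speed=1Gbit/s';

# Separate the "configuration" label from the key=value text
my ($label, $rest) = split /:\s*/, $s, 2;

# Split before each "key=" token, then split every item at its first "="
my %params = map {
    my ($k, $v) = split /=/, $_, 2;
    ($k => $v);
} split /\s+(?=[^\s=]+=)/, $rest;

print "$_ => $params{$_}\n" for sort keys %params;
```

The limit of 2 on the inner split keeps any further = characters inside the value intact.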

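Borodin's approach is the way to go.

In case you want to parse it with a regexp instead, the following works and separates each key from its value.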

#!/usr/bin/env perl

use warnings FATAL => 'all';
use strict;
my $s = 'configuration: autonegotiation=on broadcast=yes driver=igb driverversion=5.3.0-k duplex=full firmware=1.63, 0x800009fa ip=[REMOVED] latency=0 link=yes multicast=yes port=twisted pair speed=1Gbit/s';

while ($s =~ m/(?<key>[A-Za-z0-9]+)=(?<value>([\/\[\]A-Za-z0-9., -]+)(?= [a-z]+)|([\/\[\]A-Za-z0-9., -]+))/g) {
    print "$+{key} >>  $+{value}\n";
    $s =~ s/$+{key}//;
}

Output:

autonegotiation >>  on
broadcast >>  yes
driver >>  igb
driverversion >>  5.3.0-k
duplex >>  full
firmware >>  1.63, 0x800009fa
ip >>  [REMOVED]
latency >>  0
link >>  yes
multicast >>  yes
port >>  twisted pair
speed >>  1Gbit/s

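Pros

  • Key/value separation

Cons

  • Expensive positive lookahead in the regex

Refactoring proposals

  • Eliminate the complex character class [\/\[\]A-Za-z0-9., -] from the regex
  • Get rid of the substitution of the found pattern inside the loop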
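
The two proposals could be sketched as follows. This is one possible refactoring, not the answer author's code: it replaces the explicit character class with a lazy .+? that stops before the next key= token or at the end of the string, and it drops the substitution so that /g simply continues from the previous match position. It still relies on a lookahead, so it does not address the cost listed under Cons. The %config hash and the shortened input string are assumptions made for the example.

```perl
use strict;
use warnings;

my $s = 'configuration: autonegotiation=on broadcast=yes firmware=1.63, 0x800009fa ip=[REMOVED] port=twisted pair speed=1Gbit/s';

my %config;
# key:   a run of word characters directly before "="
# value: the shortest run of characters that is followed either by
#        whitespace plus the next "key=" token, or by the end of the string
while ($s =~ /(?<key>\w+)=(?<value>.+?)(?=\s+\w+=|\s*\z)/g) {
    $config{ $+{key} } = $+{value};
    print "$+{key} >>  $+{value}\n";
}
```

Because the string is never modified inside the loop, the match position is preserved between iterations and each pair is visited exactly once.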
