Using a regular expression in Perl to list variables from another Perl script

My idea for grabbing all scalars and arrays out of a Perl file went along these lines:

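open(InFile, "SomeScript.pl") or die "Cannot open SomeScript.pl: $!";
@InArray = <InFile>;
@OutArray = ();    # empty list; {} would store a hash reference as the first element
close(InFile);
$ArrayCount = @InArray;
open(OutFile, ">outfile.txt") or die "Cannot open outfile.txt: $!";
for ($x = 0; $x < $ArrayCount; $x++) {    # < rather than <= to stay inside the array
    $Testline = $InArray[$x];             # a single element takes the $ sigil

    # Captures only the first sigil-plus-letters token on the line
    if ($Testline =~ m/((@|\$)[A-Z]+)/i) {
        $Outline = "$1\n";
        push @OutArray, $Outline;
    }
}
print OutFile @OutArray;
close(OutFile);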

...and this works fairly well. The problem is that if multiple variables appear on a line, it only grabs the first one. An example might be:

$FirstVar = $SecondVar + $ThirdVar;

The script would only grab $FirstVar and write it to the file. This might still work, though, because $SecondVar and $ThirdVar have to be initialized somewhere else before that line has any meaning. I guess the exception to the rule would be a line in which multiple variables are initialized at the same time.

Could an example in real Perl code break this script? Also, how do I grab multiple items that match my regular expression's criteria from the same line?

Don't do that

You can't really parse Perl with regexes, so I wouldn't even try.
You can't even parse it properly without actually running it, but you can get close with PPI.

perl-variables.pl

#! /usr/bin/env perl
use strict;
use warnings;
use 5.10.1;

use PPI;
use PPI::Find;

my($filename) = (@ARGV, $0); # checks itself by default

my $Doc = PPI::Document->new($filename);
my $Find = PPI::Find->new( sub{
  return 0 unless $_[0]->isa('PPI::Token::Symbol');
  return 1;
});

$Find->start($Doc);
while( my $symbol = $Find->match ){
  my $raw = $symbol->content;
  my $var = $symbol->symbol;
  if( $raw eq $var ){
    say $var;
  } else {
    say "$var\t($raw)";
  }
}
print "\n";

my @found = $Find->in($Doc);
my %found;
$found{$_}++ for @found;

say for sort keys %found;

Running it against itself produces:

$filename
@ARGV
$0
$Doc
$filename
$Find
@_  ($_)
$Find
$Doc
$symbol
$Find
$raw
$symbol
$var
$symbol
$raw
$var
$var
@found
$Find
$Doc
%found
%found  ($found)
$_
@found
%found

$0
$Doc
$Find
$_
$filename
$found
$raw
$symbol
$var
%found
@ARGV
@found

It looks like this will not capture fully qualified variable names in full ($My::Package::Foo is cut off at $My) and will miss the rare but valid variable names enclosed in braces (${variable}, ${"varname!with#special+chars"}). Your script will also match element accesses of hashes and arrays ($array[4] ==> $array, $hash{$key} ==> $hash) and object method calls ($object->method() ==> $object), which may or may not be what you want.

You will also cut off variable names that contain underscores ($my_var) or digits ($var3), and you could get false positives from comments, quoted strings, pod, etc. (# report bugs to bob@company.org).
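To make these caveats concrete, here is a minimal sketch that runs the question's pattern over a few sample lines. The helper name first_match_letters_only and the sample lines are illustrative assumptions, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the question's pattern (a sigil followed by
# letters only) and return the first match on the line, or undef.
sub first_match_letters_only {
    my ($line) = @_;
    return $line =~ /((@|\$)[A-Z]+)/i ? $1 : undef;
}

# Illustrative sample lines, one per caveat mentioned above.
my @samples = (
    'my $my_var = 1;',                     # underscore in the name
    'my $var3 = 2;',                       # digit in the name
    '# report bugs to bob@company.org',    # e-mail address in a comment
    'print $My::Package::Foo;',            # fully qualified name
);

for my $line (@samples) {
    my $got = first_match_letters_only($line);
    printf "%-36s => %s\n", $line, defined $got ? $got : '(no match)';
}
```

The four samples come back as $my, $var, @company and $My: two truncated names, one false positive from a comment, and one package-qualified name cut off at the first "::".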

Matching multiple occurrences is a matter of using the /g modifier, which returns a list of all matches when used in list context:

@vars = $Testline =~ /[@\$]\w+/gi;
if (@vars > 0) {
  push @OutArray, @vars;
}
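As a small self-contained illustration of the same idea, the sketch below applies a /g match in list context to the example line from the question. The helper name variables_on_line is an assumption made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: return every sigil-plus-word token on one line.
# With /g and no capture groups, a match in list context returns all
# matched substrings rather than only the first.
sub variables_on_line {
    my ($line) = @_;
    return $line =~ /[\$\@]\w+/g;
}

my @vars = variables_on_line('$FirstVar = $SecondVar + $ThirdVar;');
print "$_\n" for @vars;    # prints $FirstVar, $SecondVar, $ThirdVar, one per line
```

Note that the helper has to be called in list context, for example by assigning to an array; in scalar context the same match only reports whether a match was found.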

The simple-minded answer is to use the /g flag on your regexp.

The complex answer is that this sort of code analysis is very difficult for Perl. Look at the PPI module for a better, more full-featured semantic analysis of Perl code.

I can't answer either of your questions directly, but I will offer this: I don't know why you're trying to extract scalars, but the debugger package that comes with Perl has to "know" about all variables, and the last time I looked, it was written in Perl. You may be better off evaluating a Perl script with the debugger package, or with techniques borrowed from it, rather than reinventing the wheel.
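One runtime technique in that spirit is to walk a package's symbol table once the code has been loaded. The sketch below is an assumption about what such an approach could look like, not the debugger's own code: the Demo package and the package_variables helper are hypothetical, and only package (our/global) variables are visible this way, since lexical my variables do not live in the symbol table.

```perl
use strict;
use warnings;

# Hypothetical package with a few package variables to inspect.
package Demo;
our $count = 3;
our @items = (1, 2);
our %table = (a => 1);

package main;

# Hypothetical helper: list the package variables of a loaded package by
# walking its symbol table (stash). Lexical "my" variables are not found.
sub package_variables {
    my ($package) = @_;
    my @found;
    no strict 'refs';    # symbolic references are needed to reach the stash
    for my $name (sort keys %{"${package}::"}) {
        next if $name =~ /::\z/;    # skip nested package stashes
        my $glob = \*{"${package}::$name"};
        # The scalar slot of a glob always exists, so test its value instead.
        push @found, "\$${package}::$name" if defined ${ *{$glob}{SCALAR} };
        push @found, "\@${package}::$name" if defined *{$glob}{ARRAY};
        push @found, "\%${package}::$name" if defined *{$glob}{HASH};
    }
    return @found;
}

print "$_\n" for package_variables('Demo');
```

Unlike the regex approach, this requires actually running (or at least loading) the target code, and a scalar that is declared but still undefined is not reported by this sketch.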

Despite the limitations of the method, here is a slightly simpler version of the script above that reads from stdin.

#!/usr/bin/perl
use strict;
use warnings;
my %vars;

while (<>) {
  $vars{$_}++ for (m'([$@]\w+)'g);
}

my @vars = keys %vars;
print "@vars\n";
