Using a regular expression in Perl to list variables from another Perl script

My thoughts on how to grab all scalars and arrays out of a Perl file went along the lines of:

open (InFile, "SomeScript.pl");
@InArray = <InFile>;
@OutArray = ();
close (InFile);
$ArrayCount = @InArray;
open (OutFile, ">outfile.txt");
for ($x=0; $x<$ArrayCount; $x++){
$Testline = $InArray[$x];

if($Testline =~ m/((@|\$)[A-Z]+)/i){
    $Outline = "$1\n";  
    push @OutArray, $Outline;
}
}
print OutFile @OutArray;
close(OutFile);

...and this works fairly well. The problem is that if multiple variables appear on a line it will only grab the first variable. An example might be:

$FirstVar = $SecondVar + $ThirdVar;

The script would only grab $FirstVar and write it to the file. This might still work, though, because $SecondVar and $ThirdVar have to be initialized somewhere else before the preceding line has any meaning. I guess the exception to the rule would be a line in which multiple variables are initialized at the same time.

Could an example in real Perl code break this script? Also, how can I grab multiple items that match my regular expression's criteria from the same line?

Don't do that

You can't really parse Perl with regexes, so I wouldn't even try.
You can't even properly parse it without actually running it, but you can get close with PPI.

perl-variables.pl

#! /usr/bin/env perl
use strict;
use warnings;
use 5.10.1;

use PPI;
use PPI::Find;

my($filename) = (@ARGV, $0); # checks itself by default

my $Doc = PPI::Document->new($filename);
my $Find = PPI::Find->new( sub{
  return 0 unless $_[0]->isa('PPI::Token::Symbol');
  return 1;
});

$Find->start($Doc);
while( my $symbol = $Find->match ){
  my $raw = $symbol->content;
  my $var = $symbol->symbol;
  if( $raw eq $var ){
    say $var;
  } else {
    say "$var\t($raw)";
  }
}
print "\n";

my @found = $Find->in($Doc);
my %found;
$found{$_}++ for @found;

say for sort keys %found;

Running it against itself produces:

$filename
@ARGV
$0
$Doc
$filename
$Find
@_  ($_)
$Find
$Doc
$symbol
$Find
$raw
$symbol
$var
$symbol
$raw
$var
$var
@found
$Find
$Doc
%found
%found  ($found)
$_
@found
%found

$0
$Doc
$Find
$_
$filename
$found
$raw
$symbol
$var
%found
@ARGV
@found

It looks like this will miss fully qualified variable names ($My::Package::Foo) and the rare but valid variable names enclosed in braces (${variable}, ${"varname!with#special+chars"}). Your script will also match element accesses of hashes and arrays ($array[4] ==> $array, $hash{$key} ==> $hash) and object method calls ($object->method() ==> $object), which may or may not be what you want.
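As a rough sketch of how the pattern could be widened to cover two of those cases, the hypothetical helper below (find_vars is my name, not something from the question) accepts package-qualified names and simple braced names. It is still only a regex: it does not handle ${"..."} with a quoted string, and it will happily match inside comments and strings.

```perl
use strict;
use warnings;

# Hypothetical helper: like a plain sigil-plus-word scan, but also accepts
# package-qualified names ($My::Package::Foo) and simple braced names (${variable}).
sub find_vars {
    my ($line) = @_;
    return $line =~ /[\$\@](?:\{\w+\}|\w+(?:::\w+)*)/g;
}

print "$_\n" for find_vars('$My::Package::Foo = ${variable} . $plain;');
```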

You also mismatch variables with underscores or digits, because [A-Z]+ stops at the first non-letter ($my_var is captured as $my, $var3 as $var), and you could get false positives from comments, quoted strings, pod, etc. (# report bugs to bob@company.org).
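To see those failure modes concretely, here is a small sketch that runs the question's pattern (rewritten with an equivalent character class, and wrapped in a helper I have called first_match) over three sample lines:

```perl
use strict;
use warnings;

# The question's pattern, written with a character class:
# letters only after the sigil, and only the first match on the line.
sub first_match {
    my ($line) = @_;
    return $line =~ /([\@\$][A-Z]+)/i ? $1 : undef;
}

for my $line ('$my_var = 1;', '$var3++;', '# report bugs to bob@company.org') {
    my $hit = first_match($line);
    print "$line  ==>  ", (defined $hit ? $hit : '(no match)'), "\n";
}
```

The first two lines come back truncated ($my and $var), and the comment yields @company even though no array of that name exists.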

Matching multiple expressions is a matter of using the /g modifier, which in list context returns a list of all the matches:

@vars = $Testline =~ /[@\$]\w+/gi;
if (@vars > 0) {
  push @OutArray, @vars;
}
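Applied to the example line from the question, that looks roughly like this (vars_on_line is just an illustrative wrapper):

```perl
use strict;
use warnings;

# Return every sigil-prefixed word on one line of source text.
# In list context, m//g yields all matches instead of just the first.
sub vars_on_line {
    my ($line) = @_;
    my @vars = $line =~ /[\$\@]\w+/g;
    return @vars;
}

# Prints $FirstVar, $SecondVar and $ThirdVar, one per line.
print "$_\n" for vars_on_line('$FirstVar = $SecondVar + $ThirdVar;');
```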

The simple-minded answer is to add the /g flag to your regexp.

The complex answer is that this sort of code analysis is very difficult for Perl. Look at the PPI module for a better, more fully featured analysis of Perl code.

I can't answer either of your questions directly, but I will offer this: I don't know why you're trying to extract scalars, but the debugger package that comes with perl has to "know" about all variables, and the last time I looked it was written in Perl. You may be better off trying to evaluate a perl script using the debugger package or techniques borrowed from that package rather than reinventing the wheel.
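One technique in that spirit is to load the code and walk a package's symbol table instead of scanning the source text. The sketch below is my own illustration, not the debugger's actual code: the Demo package and package_vars helper are hypothetical, and it only sees package variables (our/global), not my lexicals.

```perl
use strict;
use warnings;

# Hypothetical package with a few package variables to discover.
{
    package Demo;
    our $FirstVar = 1;
    our @List     = (1, 2);
    our %Map      = (a => 1);
}

# List the scalars, arrays and hashes found in a package's symbol table.
sub package_vars {
    my ($pkg) = @_;
    no strict 'refs';    # symbol-table access needs symbolic references
    my @names;
    for my $name (sort keys %{"${pkg}::"}) {
        next if $name =~ /::$/;    # skip nested packages
        my $glob = \*{"${pkg}::$name"};
        push @names, "\$$name" if defined ${ *{$glob}{SCALAR} };
        push @names, "\@$name" if *{$glob}{ARRAY};
        push @names, "\%$name" if *{$glob}{HASH};
    }
    return @names;
}

print "$_\n" for package_vars('Demo');
```

Because this inspects the running program rather than its text, it avoids false positives from comments and strings, at the cost of having to compile and run the script being examined.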

Despite the limitations of this method, here is a slightly simpler version of the script above that reads from the files named on the command line, or from stdin if none are given.

#!/usr/bin/perl
use strict;
use warnings;
my %vars;

while (<>) {
  $vars{$_}++ for (m'([$@]\w+)'g);
}

my @vars = keys %vars;
print "@vars\n";
