
Why isn't my Perl script finding bad indentation with my regex match?

My work's coding standard uses this bracket indentation:

some declaration
    {
    stuff = other stuff;
    };

control structure, function, etc()
    {
    more stuff;
    for(some amount of time)
        {
        do something;
        }
    more and more stuff;
    }

I'm writing a Perl script to detect incorrect indentation. Here's what I have in the body of a while(<some-file-handle>) loop:

# $prev holds the previous line in the file
# $current holds the current line in the file
if($prev =~ /^(\t*)[^;]+$/ and $current =~ /^(?<=!$1\t)[\{\}].+$/) {
    print "$file @ line ${.}: Bracket indentation incorrect\n";
}

Here, I'm trying to match:

  • $prev : A line that does not end with a semicolon, followed by...
  • $current : A line that does not have the previous line's number of leading tabs plus one.

This doesn't seem to match anything, at the moment.

The $prev pattern needs some modification.

It should be something like \t*, then .+, then a final character that is not a semicolon.

Also, $current should match:

anything ending in ;, { or } that does not have the previous line's number of leading tabs plus one.
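As a rough sketch of that idea, the two lines can be captured separately and their indentation compared in plain Perl rather than inside a single regex. This version is deliberately narrower than the description above: it only looks at lines that start with an opening brace, because body lines and closing braces sit at the same depth as the brace before them. The name open_brace_misindented is made up for illustration, and tab-only indentation is assumed.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the original post.
# Returns 1 when $current starts with '{' but is NOT indented exactly one
# tab deeper than $prev, where $prev is a "header" line (its last non-blank
# character is not ';', '{' or '}'). Returns 0 in every other case.
sub open_brace_misindented {
    my ($prev, $current) = @_;

    # Leading tabs of a line whose last non-blank character is not ; { or }
    my ($prev_tabs) = $prev =~ /^(\t*).*[^;{}\s]\s*$/ or return 0;

    # Leading whitespace of a line whose first non-blank character is '{'
    my ($cur_indent) = $current =~ /^(\s*)\{/ or return 0;

    # The brace is expected at the header's indentation plus one tab
    return $cur_indent ne "$prev_tabs\t" ? 1 : 0;
}

print open_brace_misindented("for(int i = 0;i<10;i++)\n", "{\n"),   "\n";  # brace not indented
print open_brace_misindented("for(int i = 0;i<10;i++)\n", "\t{\n"), "\n";  # brace one tab deeper
```

The first call reports 1 and the second 0. Comparing the captured strings avoids relying on $1 from one match inside a lookbehind in another.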

EDIT: the Perl code I used to try the $prev pattern

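#!/usr/bin/perl -l
use strict;
use warnings;

# Print every line that matches the $prev pattern from the question
open(my $fp, '<', 'example.cpp') or die "Cannot open example.cpp: $!";

while (<$fp>)
{
  if ($_ =~ /^(\t*)[^;]+$/) {
    print "got the line: $_";
  }
}

close($fp);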

//example.cpp

for(int i = 0;i<10;i++)
{
  //not this;
  //but this
}

//output

got the line: {

got the line:   //but this

got the line: }

It did not detect the line with the for loop. Am I missing something?

And you intend to only count tabs (not spaces) for indentation?

Writing this kind of checker is complicated. Just think about all the possible constructs that use braces but should not change indentation:

s{some}{thing}g

qw{ a b c }

grep { defined } @a

print "This is just a { provided to confuse";

print <<END;
This {
  $is = not $code
}
END
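To make the problem concrete, here is a small sketch of a checker that only counts braces per line. It knows nothing about strings, quote-like operators or here-documents, which is exactly the assumption that breaks. The function name naive_brace_balance is made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical naive check: opening braces minus closing braces on one line,
# with no knowledge of strings, quote-like operators or here-documents.
sub naive_brace_balance {
    my ($line) = @_;
    my $open  = () = $line =~ /\{/g;   # count of '{' characters
    my $close = () = $line =~ /\}/g;   # count of '}' characters
    return $open - $close;
}

print naive_brace_balance('print "This is just a { provided to confuse";'), "\n";
print naive_brace_balance('qw{ a b c }'), "\n";
```

The first line reports 1, so a counter driven by this check would expect a deeper indent on the next line even though no block was opened. The second reports 0, which looks harmless, but the braces there are not a block either.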

But anyway, if the issues above aren't important to you, consider whether the semicolon is important at all in your regex. After all, writing

while($ok)
    {
    sort { some_op($_) }
        grep { check($_) }
        my_func(
            map { $_->[0] } @list
        );
    }

should be possible.

I see a couple of problems:

  1. Your $prev regex only matches lines that have no ; anywhere, so it will never match a line like for(int x = 1; x < 10; x++).
  2. If the indent of the opening { is incorrect, you will not detect that.

Try this instead. It only cares whether the line has a ; or { (followed by any whitespace) at the end.

/^(\s*).*[^{;\s]\s*$/
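A quick way to see the difference is to run both patterns over the same lines. The sketch below wraps each pattern in a small function (both names are made up here). The end-anchored version puts \s inside the negated class, which is an adjustment assumed for this sketch so that a trailing blank or newline cannot stand in for the last real character.

```perl
use strict;
use warnings;

# Pattern from the question: no ';' may appear anywhere after the leading tabs.
sub matches_original {
    my ($line) = @_;
    return $line =~ /^(\t*)[^;]+$/ ? 1 : 0;
}

# End-anchored pattern: only the last non-blank character is inspected.
# The \s inside the class is an addition assumed for this sketch.
sub matches_end_anchored {
    my ($line) = @_;
    return $line =~ /^(\s*).*[^{;\s]\s*$/ ? 1 : 0;
}

for my $line ("for(int x = 1; x < 10; x++)", "\tstuff = other stuff;") {
    printf "original=%d end-anchored=%d  %s\n",
        matches_original($line), matches_end_anchored($line), $line;
}
```

For the for line the original pattern gives 0 and the end-anchored one gives 1. For the statement line both give 0, since it really does end in a semicolon.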

Now change your strategy: if you see a line which does not end in { or ;, increment the indent counter.

If you see a line which ends in }; or }, decrement the indent counter.

Compare every line against this:

/^\t{$counter}[^\s]/ 

so...

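my $counter = 0;    # expected indent depth; initialise once, before the loop

# then, for each line held in $curr:
if ($curr !~ /^\t{$counter}\S/) {
    # error detected
}

if ($curr =~ /\};?\s*$/) {
    # line ends in } or }; so the block is closed
    $counter--;
} elsif ($curr =~ /^(\s*).*[^{;\s]\s*$/) {
    # line ends in neither { nor ; so a block follows
    $counter++;
}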
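Put together, the counter strategy might look something like the sketch below. It is only an illustration under a few assumptions that are not in the original: the function name find_bad_indents is made up, blank lines are skipped, the counter is never allowed to drop below zero, and indentation is tabs only. It also inherits every limitation already mentioned (strings, comments and multi-line statements will confuse it).

```perl
use strict;
use warnings;

# Hypothetical helper: takes a list of lines and returns the 1-based numbers
# of the lines whose leading tabs do not match the running indent counter.
sub find_bad_indents {
    my @lines   = @_;
    my $counter = 0;    # expected number of leading tabs
    my @bad;

    for my $i (0 .. $#lines) {
        my $curr = $lines[$i];
        next if $curr =~ /^\s*$/;    # assumption: blank lines are ignored

        # Exactly $counter tabs must be followed by a non-blank character
        push @bad, $i + 1 if $curr !~ /^\t{$counter}\S/;

        if ($curr =~ /\};?\s*$/) {
            $counter-- if $counter > 0;    # line ends in } or };
        }
        elsif ($curr =~ /[^{;\s]\s*$/) {
            $counter++;                    # header line, block is one tab deeper
        }
    }
    return @bad;
}

my @sample = (
    "for(some amount of time)",
    "{",                  # should be "\t{"
    "\tdo something;",
    "\t}",
);
print join(",", find_bad_indents(@sample)), "\n";
```

This prints 2, the line holding the unindented opening brace. The check runs before the counter is updated, which is what lets the closing brace sit at the same depth as the block body.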

Sorry for not styling my code according to your standards... :)

Have you considered looking at Perltidy?

Perltidy is a Perl script that reformats Perl code into set standards. Granted, what you have isn't part of the Perl standard, but you can probably tweak the curly braces via the configuration file Perltidy uses. If all else fails, you can hack through the code. After all, Perltidy is just a Perl script.

I haven't really used it, but it might be worth looking into. Your problem is trying to locate all the various edge cases, and making sure you're handling them correctly. You can parse 100 programs only to find that the 101st reveals problems in your formatter. Perltidy has been used by thousands of people on millions of lines of code. If there is an issue, it has probably already been found.
