Why does my Perl script remove characters from the file?

I have an issue with a Perl script. It modifies the content of a file, then reopens the file to write the result back, and in the process some characters are lost. All words starting with '%' are deleted from the file. That's pretty annoying because the % expressions are variable placeholders for dialog boxes.

Do you have any idea why? The source file is XML with the default encoding.

Here is the code:

undef $/;
open F, $file or die "cannot open file $file\n";
my $content = <F>;                                           
close F;                                                     
                                                               
$content =~s{status=["'][\w ]*["']\s*}{}gi;
                 
printf $content;

open F, ">$file" or die "cannot reopen $file\n";             
printf F $content;                                           
close F or die "cannot close file $file\n";

You're using printf there, and it treats its first argument as a format string, so anything in your content that looks like a % conversion is interpreted instead of printed. See the printf documentation for details. When I run into this sort of problem, I always check that I'm using the functions correctly. :)

You probably want just print:

 print F $content;
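
To see the difference without touching a file, here is a minimal sketch using sprintf, which formats the same way printf does but returns the string. The sample text is made up for illustration; "%s" stands in for one of the dialog placeholders.

```perl
use strict;
use warnings;

# Hypothetical sample text; "%s" plays the role of a dialog placeholder.
my $content = 'Hello %s, welcome';

my $mangled;
{
    no warnings;                    # silence the missing-argument warning for this demo
    $mangled = sprintf($content);   # $content is used as the format string
}
my $intact = sprintf('%s', $content);   # $content is passed as data

print "as format: $mangled\n";   # the %s is gone: "Hello , welcome"
print "as data:   $intact\n";    # unchanged:      "Hello %s, welcome"
```

With warnings left on, the first call would also report a missing argument, which is a useful hint when chasing this kind of bug.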

In your example, you don't need to read in the entire file since your substitution does not cross lines. Instead of trying to read and write to the same filename all at once, use a temporary file:

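open my($in),  "<", $file       or die "cannot open file $file\n";
open my($out), ">", "$file.bak" or die "cannot open file $file.bak\n";

while( <$in> )
    {
    s{status=["'][\w ]*["']\s*}{}gi;
    print $out $_;   # with a lexical filehandle, $_ must be given explicitly
    }

close $in;
close $out or die "cannot close file $file.bak\n";   # flush before renaming

rename "$file.bak", $file or die "Could not rename file\n";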

This also reduces to this command-line program:

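% perl -pi.bak -e 's{status=["\x27][\w ]*["\x27]\s*}{}gi' file

(The \x27 stands for a single quote, which cannot appear literally inside a single-quoted shell string.)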

Er. You're using printf.

printf interprets "%" as something special.

Use print instead.

If you have to use printf, use

printf "%s", $content;

Important Note:

printf stands for "print formatted", just as it does in C.

fprintf is the equivalent in C for file I/O.

Perl is not C.

And even in C, passing your content as the first (format) argument is a security hole: it is the classic format string vulnerability.

Or even

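perl -i.bak -pe 's{status=["\x27][\w ]*["\x27]\s*}{}gi;' yourfiles

-e says "there's code following for you to run"

-i.bak says "edit the file in place and keep the old one as whatever.bak" (the extension must be attached to -i with no space, and \x27 stands for a single quote, which cannot appear inside a single-quoted shell string)

-p adds a read-print loop around the -e code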
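
As a rough illustration of what that read-print loop does with the substitution, here is a sketch that runs the same regex over in-memory lines instead of a file. The sample lines are invented for the example, and the results are collected in an array rather than printed by -p.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the XML file's contents.
my @lines = (
    qq{<dialog status="draft" id="1">Hello %name%</dialog>\n},
    qq{<dialog id="2" status='done'>Bye</dialog>\n},
);

my @result;
for (@lines) {                       # -p: visit each input line in turn
    my $line = $_;
    $line =~ s{status=["'][\w ]*["']\s*}{}gi;   # the -e code
    push @result, $line;             # -p would print the line at this point
}

print @result;   # plain print leaves the %name% placeholder alone
```

The status attributes disappear while the % placeholder survives, which shows that the substitution was never the cause of the lost characters.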

Perl one-liners are a powerful tool and can save you a lot of drudgery.

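If you want a solution that is aware of the XML nature of the document (i.e., one that deletes only status attributes and does not match text content), you could also use XML::PYX: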

$ pyx doc.xml | perl -ne'print unless /^Astatus/' | pyxw

That's because you used printf instead of print. printf does not print a literal "%" unless you write it as "%%"; otherwise it takes the "%" as the start of a format specifier such as %s or %f. :-)
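
If the text really must go through printf as the format, doubling every percent sign first keeps it intact. A minimal sketch, again with made-up sample text and sprintf standing in for printf:

```perl
use strict;
use warnings;

# Hypothetical sample text containing a bare percent sign and a placeholder.
my $content = 'Progress: 50% of %s';

(my $escaped = $content) =~ s/%/%%/g;   # turn every "%" into "%%"
my $out = sprintf($escaped);            # each "%%" comes back as one literal "%"

print "$out\n";   # same text as $content
```

In practice plain print, or printf with an explicit "%s" format, is simpler than escaping.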
