Compound print statement overwrites part of variable

I'm seeing some very bizarre behavior in a script that I wrote and have used for years but that, for some reason, fails on one particular file.

Recognizing that the script is failing to identify a key that should be in a hash, I added some test print statements to read the keys. My normal strategy involves placing asterisks before and after the variable to detect potential hidden characters. Clearly, the keys are corrupt. Relevant code block:

foreach my $fastaRecord (@GenomeList) {

    my ($ID, $Seq) = split(/\n/, $fastaRecord, 2);

# uncomment next line to strip everything off sequence
# header except trailing numeric identifiers
#    $ID =~ s/.+?(\d+$)/$1/;

    $Seq =~ s/[^A-Za-z-]//g; # remove any kind of new line characters

    $RefSeqLen = length($Seq);

    $GenomeLenHash{$ID} = $RefSeqLen;

    print "$ID\n";

    print "*$ID**\n";

}

This produces the following output:

supercont3
**upercont3
Mitochondrion
**itochondrion
Chr1
**hr1
Chr2
**hr2
Chr3
**hr3
Chr4
**hr4

Normally, I'd suspect that "illegal" newline characters are involved. However, I manually replaced all newlines in the input file to try to solve the problem. What in the input file could be causing the script to behave this way? I could imagine that, despite my efforts, there is still an illegal newline after the ID variable. But then why does neither the first asterisk nor the newline after the double asterisk appear to be printed, and why is the double asterisk printed at the beginning of the line, overwriting the first asterisk as well as the first character of the variable's value?

When you see these sorts of effects, look at the data in a file or in a hexdump. The terminal is going to hide data if it interprets backspaces, carriage returns, and ANSI sequences.

% perl script.pl | hexdump -C

Here's a simple example. I echo a, b, a carriage return, then c. My terminal sees the carriage return and moves the cursor to the beginning of the line. After that, the output continues, so the c masks the a:

% echo $'ab\rc'
cb

With a hex dump, I can see the 0d that represents the carriage return:

% echo $'ab\rc' | hexdump -C
00000000  61 62 0d 63 0a                                    |ab.c.|
00000005
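
The same effect would explain the output in the question. If each header line still ends in a carriage return (an assumption here, consistent with a file that has CRLF line endings), then "*$ID**\n" sends *supercont3, a carriage return, then **, and the terminal draws the ** over the leading *s. Below is a minimal Perl sketch of that idea. The helper names show_hidden and render_cr are made up for illustration, and render_cr only mimics carriage returns, not backspaces or ANSI sequences:

```perl
use strict;
use warnings;

# Hypothetical helper: replace control and non-ASCII characters
# with visible escapes so they cannot hide in terminal output.
sub show_hidden {
    my ($text) = @_;
    $text =~ s/\r/\\r/g;
    $text =~ s/\n/\\n/g;
    $text =~ s/([^\x20-\x7e])/sprintf('\\x%02x', ord $1)/ge;
    return $text;
}

# Hypothetical helper: mimic how a terminal draws one line that
# contains carriage returns. Each chunk after a \r starts again at
# column 0 and overwrites whatever was already drawn there.
sub render_cr {
    my ($line) = @_;
    my $screen = '';
    for my $chunk (split /\r/, $line, -1) {
        my $keep = length($screen) > length($chunk)
                 ? substr($screen, length $chunk)
                 : '';
        $screen = $chunk . $keep;
    }
    return $screen;
}

# Assumed value: a header that kept its trailing carriage return.
my $ID = "supercont3\r";

print show_hidden("*$ID**\n"), "\n";   # prints *supercont3\r**\n
print render_cr("*$ID**"), "\n";       # prints **upercont3
```

Printing keys through something like show_hidden is a rough in-script substitute for piping to hexdump when you only need to spot stray control characters.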

Also, when you try to remove "any sort of newline" from $Seq, you might just remove vertical whitespace:

$target =~ s/\v//g;

You might also use the generalized newline:

$target =~ s/\R//g;
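
In the question's loop, a stray carriage return would end up in $ID rather than $Seq, because split(/\n/, ...) cuts after the \r and leaves it on the header. One possible fix, sketched below, is to split on \R instead, which treats \r\n as a single line ending (\R needs Perl 5.10 or later). The sample records are made up, and CRLF line endings in the problem file are an assumption:

```perl
use strict;
use warnings;

# Made-up sample records with CRLF line endings (an assumption
# about what the problem file contains).
my @GenomeList = ("Chr1\r\nACGT\r\nAC-T\r\n", "Chr2\r\nGGCC\r\n");
my %GenomeLenHash;

foreach my $fastaRecord (@GenomeList) {
    # \R matches \r\n as one unit, so no \r is left on the header.
    my ($ID, $Seq) = split(/\R/, $fastaRecord, 2);

    # Same cleanup as the original script: keep letters and hyphens.
    $Seq =~ s/[^A-Za-z-]//g;

    $GenomeLenHash{$ID} = length($Seq);
}

# The asterisks now sit directly around each key.
print "*$_* => $GenomeLenHash{$_}\n" for sort keys %GenomeLenHash;
# prints:
# *Chr1* => 8
# *Chr2* => 4
```

If the headers might carry other trailing whitespace as well, stripping it with $ID =~ s/\s+\z//; after the split is another option.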
