
Perl: read one char from stdin

How can Perl read input from stdin one char at a time, like

readline -N1

does?

You can do that with the base perl distribution; there is no need to install extra packages:

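use strict;
use POSIX qw(ICANON TCSANOW);

# Turn the tty's canonical (line-editing) mode on or off for a filehandle.
sub IO::Handle::icanon {
        my ($fh, $on) = @_;
        my $ts = POSIX::Termios->new;
        $ts->getattr(fileno($fh)) or die "tcgetattr: $!";
        my $f = $ts->getlflag;
        $ts->setlflag($on ? $f | ICANON : $f & ~ICANON);
        # apply the change immediately
        $ts->setattr(fileno($fh), TCSANOW) or die "tcsetattr: $!";
}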

# usage example
# a key like `Left` or `á` may generate multiple bytes
STDIN->icanon(0);
sysread STDIN, my $c, 256;
STDIN->icanon(1);
# the read key is in $c

Reading just one byte may not be a good idea: a key like Left or F1 sends several bytes, and the rest would be left as garbage to be read later. But you can replace the 256 with 1 if that is really what you want.

<STDIN> reads stdin one byte at a time if the record separator is set to a reference to the number 1. (A byte is a C char, which is not the same as a character: these days, characters outside the US-ASCII charset are typically encoded as several bytes.)

$ echo perl | perl -le '$/ = \1; $a = <STDIN>; print "<$a>"'
<p>

Note that underneath, perl may read (consume) more than one byte from the input. Above, the next <STDIN> within perl would return <e>, but possibly from a large buffer that was read beforehand.

$ echo perl | (perl -le '$/ = \1; $a = <STDIN>; print "<$a>"'; wc -c)
<p>
0

Above, wc received no input because perl had already consumed all of it.

$ echo perl | (PERLIO=raw perl -le '$/ = \1; $a = <STDIN>; print "<$a>"'; wc -c)
<p>
4

This time, wc got 4 bytes (e, r, l, \n) because we told perl to use raw I/O, so the <STDIN> translates to a read(0, buf, 1).

Instead of <STDIN>, you can use perl's read, with the same caveat:

$ echo perl | (perl -le 'read STDIN, $a, 1; print "<$a>"'; wc -c)
<p>
0
$ echo perl | (PERLIO=raw perl -le 'read STDIN, $a, 1; print "<$a>"'; wc -c)
<p>
4

Or use sysread, which is the true wrapper for the raw read():

$ echo perl | (perl -le 'sysread STDIN, $a, 1; print "<$a>"'; wc -c)
<p>
4

To read one character at a time, you need to read one byte at a time until the end of the character.

You can do that for UTF-8 encoded input (in locales using that encoding) in perl with <STDIN> or read (not sysread) and the -C option, including with raw PERLIO:

$ echo été | (PERLIO=raw perl -C -le '$/ = \1; $a = <STDIN>; print "<$a>"'; wc -c)
<é>
4
$ echo été | (PERLIO=raw perl -C -le 'read STDIN, $a, 1; print "<$a>"'; wc -c)
<é>
4

With strace, you'd see that perl makes two read(0, buf, 1) system calls underneath to read that 2-byte é character.

As with ksh93/bash's read -N (or zsh's read -k), you can get surprises if the input is not properly encoded in UTF-8:

$ printf '\375 12345678' | (PERLIO=raw perl -C -le 'read STDIN, $a, 1; print "<$a>"'; wc -c)
<� 1234>
4

\375 (\xFD) would normally be the first byte of the encoding of a 6-byte character in UTF-8¹, so perl reads all 6 bytes here, even though the second to sixth cannot be part of that character as they don't have the 8th bit set.

Note that when stdin is a tty device, read() will not return until the terminal at the other end sends an LF (eol), a CR (which is by default converted to LF), an eof (usually ^D) or an eol2 (usually not defined) character, as configured in the tty line discipline (for example with the stty command). This is because the tty driver implements its own internal line editor, which lets you edit what you type before pressing Enter.

If you want to read the byte(s) sent for each key pressed by the user, you need to disable that line editor (which bash/ksh93's read -N and zsh's read -k do when stdin is a tty). See @guest's answer for details on how to do that.


¹ While Unicode now restricts code points to at most 0x10FFFF, which means UTF-8 encodings are at most 4 bytes long, UTF-8 was originally designed to encode code points up to 0x7FFFFFFF (up to 6-byte encodings), and perl extends it to up to 0x7FFFFFFFFFFFFFFF (13-byte encodings).
