Perl do input one char from stdin
How can Perl do input from stdin, one char at a time, like readline -N1 does?
You can do that with the base perl distribution, no need to install extra packages:
use strict;
use POSIX qw(ICANON);

# Turn canonical (line-editing) mode on or off for a terminal filehandle.
sub IO::Handle::icanon {
    my ($fh, $on) = @_;
    my $ts = POSIX::Termios->new;
    $ts->getattr(fileno $fh) or die "tcgetattr: $!";
    my $f = $ts->getlflag;
    $ts->setlflag($on ? $f | ICANON : $f & ~ICANON);
    $ts->setattr(fileno $fh) or die "tcsetattr: $!";
}

# usage example
# a key like `Left` or `á` may generate multiple bytes
STDIN->icanon(0);
sysread STDIN, my $c, 256;
STDIN->icanon(1);
# the read key is in $c
Reading just one byte may not be a good idea: when a key like Left or F1 is pressed, the remaining bytes of its sequence are left behind as garbage to be read later. But you can replace the 256 with 1 if that is really what you want.
<STDIN> will read stdin one byte at a time if the record separator is set to a reference to the number 1. A byte here is the C char type, which is not the same as a character: these days characters are typically made of several bytes, except for those in the US-ASCII charset.
$ echo perl | perl -le '$/ = \1; $a = <STDIN>; print "<$a>"'
<p>
Note that underneath, it may read (consume) more than one byte from the input. Above, the next <STDIN> within perl would yield <e>, but possibly from some large buffer that was read beforehand.
$ echo perl | (perl -le '$/ = \1; $a = <STDIN>; print "<$a>"'; wc -c)
<p>
0
Above, you'll notice that wc didn't receive any input, as all of it had already been consumed by perl.
$ echo perl | (PERLIO=raw perl -le '$/ = \1; $a = <STDIN>; print "<$a>"'; wc -c)
<p>
4
This time, wc got 4 bytes (e, r, l, \n) as we told perl to use raw I/O, so the <STDIN> translates to a read(0, buf, 1).
Instead of <STDIN>, you can use perl's read, with the same caveat:
$ echo perl | (perl -le 'read STDIN, $a, 1; print "<$a>"'; wc -c)
<p>
0
$ echo perl | (PERLIO=raw perl -le 'read STDIN, $a, 1; print "<$a>"'; wc -c)
<p>
4
Or use sysread, which is the true wrapper for the raw read():
$ echo perl | (perl -le 'sysread STDIN, $a, 1; print "<$a>"'; wc -c)
<p>
4
To read one character at a time, you need to read one byte at a time until the end of the character.
You can do it for UTF-8 encoded input (in locales using that encoding) in perl with <STDIN> or read (not sysread) with the -C option, including with raw PERLIO:
$ echo été | (PERLIO=raw perl -C -le '$/ = \1; $a = <STDIN>; print "<$a>"'; wc -c)
<é>
4
$ echo été | (PERLIO=raw perl -C -le 'read STDIN, $a, 1; print "<$a>"'; wc -c)
<é>
4
With strace, you'd see that perl does two read(0, buf, 1) system calls underneath to read that 2-byte é character.
Like with ksh93 / bash's read -N (or zsh's read -k), you can get surprises if the input is not properly encoded in UTF-8:
$ printf '\375 12345678' | (PERLIO=raw perl -C -le 'read STDIN, $a, 1; print "<$a>"'; wc -c)
<� 1234>
4
\375 (\xFD) would normally be the first byte of the encoding of a 6-byte character in UTF-8¹, so perl reads all 6 bytes here, even though the second to sixth can't possibly be part of that character as they don't have the 8th bit set.
Note that when stdin is a tty device, read() will not return until the terminal at the other end sends a LF (eol), CR (which is by default converted to LF), eof (usually ^D) or eol2 (usually not defined) character, as configured in the tty line discipline (for instance with the stty command). This is because the tty driver implements its own internal line editor, allowing you to edit what you type before pressing Enter.
If you want to read the byte(s) sent for each key pressed by the user there, you'd need to disable that line editor (which bash/ksh93's read -N or zsh's read -k do when stdin is a tty); see @guest's answer for details on how to do that.
¹ While Unicode now restricts code points to at most 0x10FFFF, which means UTF-8 encodings have at most 4 bytes, UTF-8 was originally designed to encode code points up to 0x7FFFFFFF (up to 6-byte encoding), and perl extends it to up to 0x7FFFFFFFFFFFFFFF (13-byte encoding).