How does the shell generate input for awk

Say I have a file1 containing:

1,2,3,4

I can use awk to process that file like this:

awk -v FS="," '{print $1}' file1

I can also invoke awk with a Here String, meaning awk reads from stdin:

awk -v FS="," '{print $1}' <<<"9,10,11,12"

Command 1 yields the result 1 and command 2 yields 9, as expected.

Now say I have a second file, file2:

4,5

If I parse both files with awk sequentially:

awk -v FS="," '{print $1}' file1 file2

I get:

1
4

as expected.

But if I mix reading from stdin with reading from files, the content from stdin gets ignored and only the content of the files gets processed sequentially:

awk -v FS="," '{print $1}' file1 file2 <<<"9,10,11,12"
awk -v FS="," '{print $1}' file1 <<<"9,10,11,12" file2
awk -v FS="," '{print $1}' <<<"9,10,11,12" file1 file2

All three commands yield:

1
4

which means the content from stdin simply gets thrown away. Now what is the shell doing?

Interestingly, if I change command 3 to:

awk -v FS="," '{print $1}' <<<"9,10,11,12",file1,file2

I simply get 9, which makes sense, as file1 and file2 are just two more fields from stdin. But why then is

awk -v FS="," '{print $1}' <<<"9,10,11,12" file1 file2

not expanded to

awk -v FS="," '{print $1}' <<<"9,10,11,12 file1 file2"

which would also yield the result 9?

And why does the content from stdin get ignored? The same question arises for commands 1 and 2. What is the shell doing here?

I tried out the commands on: GNU bash, version 4.2.53(1)-release

Standard input and input from files don't mix together well. This behavior is not exclusive to awk; you will find it in a lot of command line applications. It is logical if you think of it like this:

Files need to be processed one by one. The consuming application does not have control over when the input behind STDIN starts and stops. Look at echo a,b,c | awk -F, '{print $1}' file1 file2 . In what order do the incoming "files" need to be read? If you think about when FNR would need to be reset, or what FILENAME should be, it becomes clear that it is hard to get this right.
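A small experiment may make this concrete. The sketch below uses a scratch directory and a file named stdin.txt, both made up for illustration. It shows three things: a redirection never reaches awk's argument list, awk leaves standard input unread once file operands are present, and POSIX awk accepts - as an operand meaning "read standard input at this position". A plain < redirection is used so the snippet also runs in a POSIX sh; bash treats <<< the same way.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '1,2,3,4\n' > file1
printf '4,5\n' > file2
printf '9,10,11,12\n' > stdin.txt   # hypothetical stand-in for the here-string

# 1. The shell removes the redirection before awk starts, wherever it appears,
#    so awk only sees file1 and file2 as operands.
args=$(awk 'BEGIN { for (i = 1; i < ARGC; i++) print ARGV[i] }' file1 < stdin.txt file2)

# 2. With file operands present, awk never reads standard input.
ignored=$(awk -F, '{ print $1 }' file1 file2 < stdin.txt)

# 3. The operand "-" tells awk where standard input belongs in the sequence.
mixed=$(awk -F, '{ print $1 }' file1 - file2 < stdin.txt)

printf '%s\n' "$args" "$ignored" "$mixed"
```

With the - operand the order is explicit, so awk can treat standard input as just another entry in its list of inputs.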

One trick you can play is to let awk (or any other program) read from a file descriptor generated by the shell:

awk -F, '{print $1}' file1 <(echo 4,5,6) file2

This will do what you expected in the first place.

What happens here is that the <(...) syntax creates a proper file descriptor (say: /proc/self/fd/11 ), and the reading program can treat it just like a file. It is the second argument, so it is the second file. It is then clear what FNR and FILENAME should be.
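To check this on your own machine, a sketch along these lines may help. It assumes bash (process substitution is not part of POSIX sh) and uses a scratch directory invented for the example. The exact FILENAME of the substituted stream varies by system, so it is printed but not relied on.

```shell
# Scratch copies of the two files from the question.
tmp=$(mktemp -d)
printf '1,2,3,4\n' > "$tmp/file1"
printf '4,5\n' > "$tmp/file2"

# NR counts records overall; FNR restarts at 1 for each operand,
# including the one created by <(...).
out=$(awk -F, '{ print NR, FNR, $1 }' "$tmp/file1" <(echo 4,5,6) "$tmp/file2")
printf '%s\n' "$out"

# FILENAME for the middle operand is whatever path the shell substituted.
awk -F, '{ print FILENAME }' "$tmp/file1" <(echo 4,5,6) "$tmp/file2"
```

The first awk call prints one line per input, with the substituted stream in second position, just as a regular file would be.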
