
Using awk to extract specific line from all text files in a directory

I have a folder with 50 text files and I want to extract the first line from each of them at the command line and output this to a result.txt file.

I'm using the following command within the directory that contains the files I'm working with:

for files in *; do awk '{if(NR==1) print NR, $0}' *.txt; done > result.txt

When I run the command, the result.txt file contains 50 lines, but they all come from a single file in the directory rather than one line per file. The command appears to be looping over a single file 50 times rather than over each of the 50 files.

I'd be grateful if someone could help me understand where I'm going wrong with this.

Try this:

for i in *.txt; do head -n 1 "$i"; done > result.txt

OR

for files in *.txt; do awk 'NR==1 {print $0}' "$files"; done > result.txt

Your code has two problems:

  1. You have an outer loop that iterates over *, but your loop body doesn't use $files. That is, you're invoking awk '...' *.txt 50 times. This is why any output from awk is repeated 50 times in result.txt.

  2. Your awk code checks NR (the number of lines read so far), not FNR (the number of lines read within the current file). NR==1 is true only at the beginning of the very first file.

There's another problem: result.txt is created first, so it is included among *.txt. To avoid this, give it a different name (one that doesn't end in .txt) or put it in a different directory.
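
The difference between NR and FNR is easiest to see on two small files. What follows is a minimal sketch, not part of the original answer; the scratch directory and the file names a.txt and b.txt are made up for the demonstration.

```shell
# Scratch directory and file names are hypothetical, for illustration only.
demo_dir=$(mktemp -d)
printf 'a1\na2\n' > "$demo_dir/a.txt"
printf 'b1\nb2\n' > "$demo_dir/b.txt"

# NR keeps counting across files, so NR==1 matches only once overall.
nr_out=$(awk 'NR==1 {print $0}' "$demo_dir"/*.txt)

# FNR restarts at 1 for each file, so FNR==1 matches once per file.
fnr_out=$(awk 'FNR==1 {print $0}' "$demo_dir"/*.txt)

echo "NR==1 matched:";  printf '%s\n' "$nr_out"
echo "FNR==1 matched:"; printf '%s\n' "$fnr_out"

rm -rf "$demo_dir"
```

With these two files, the NR test prints only the first line of a.txt, while the FNR test prints the first line of both files.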

A possible fix:

awk 'FNR==1 {print NR, $0}' *.txt > result

Why not use head? For example with find:

find midir/ -type f -exec head -1 {} \; >> result.txt
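
If only the .txt files are wanted, find can filter by name, and writing the result outside the searched directory keeps it from being picked up as input. The following is a sketch under assumptions: the directory layout and file names are invented, and head -n 1 is used as the POSIX spelling of head -1.

```shell
# Hypothetical scratch layout; replace src_dir with the real directory.
src_dir=$(mktemp -d)
out_file=$(mktemp)          # lives outside src_dir, so find never sees it
printf 'first A\nsecond A\n' > "$src_dir/a.txt"
printf 'first B\nsecond B\n' > "$src_dir/b.txt"
printf 'not wanted\n'        > "$src_dir/notes.log"

# '*.txt' is quoted so the shell hands the pattern to find unexpanded.
find "$src_dir" -type f -name '*.txt' -exec head -n 1 {} \; > "$out_file"

# find's traversal order is unspecified, so sort for a stable listing.
sorted_out=$(sort "$out_file")
printf '%s\n' "$sorted_out"

rm -rf "$src_dir" "$out_file"
```

Note that find also descends into subdirectories, which the shell glob *.txt does not do.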

If you want to keep your approach, pass the loop variable to awk instead of the wildcard:

for files in *; do awk '{if(NR==1) print NR, $0}' "$files"; done > result.txt
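
In the loop form, awk still reads each file to the end even though only line 1 is printed, and * also matches directories and the output file itself. A small variation, offered as a sketch rather than the answerer's method: exit after the first line and skip anything that is not a regular file. The directory and file names are placeholders.

```shell
# Hypothetical scratch directory with two text files and one subdirectory.
work_dir=$(mktemp -d)
printf 'x1\nx2\n' > "$work_dir/x.txt"
printf 'y1\ny2\n' > "$work_dir/y.txt"
mkdir "$work_dir/subdir"    # a non-file entry that * would also match

# The command substitution runs in a subshell, so the cd does not leak out,
# and the captured output never lands inside the directory being globbed.
loop_out=$(
  cd "$work_dir" || exit 1
  for f in *; do
    [ -f "$f" ] || continue                  # skip directories and the like
    awk 'NR==1 {print NR, $0; exit}' "$f"    # stop reading after line 1
  done
)
printf '%s\n' "$loop_out"

rm -rf "$work_dir"
```

Because each awk invocation sees a single file, NR is always 1 here, so the printed number carries no information and can be dropped if it is not needed.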

The technical posts on this site are published under the CC BY-SA 4.0 license. If you reprint them, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 