Using Gawk and Printf in a Bash script
I am trying to split a file into smaller files with gawk and name the smaller files in the order they appear in the original file.
for i in *.txt
do
gawk -v RS="START_of_LINE_to_SEPARATE" 'NF{ print RS$0 > "new_file_"++n".txt"}' "$i"
done
The output gives me: new_file_1.txt new_file_2.txt etc.
I would like the output to be: new_file_0001.txt new_file_0002.txt etc.
You can do:
for i in *.txt; do
printf -v num "%04d" $((++n))
gawk -v num="$num" -v RS="START_of_LINE_to_SEPARATE" 'NF{
print RS$0 > "new_file_" num ".txt"}' "$i"
done
Ignoring the issue of the outer loop and focusing on the awk part of the question, you can use sprintf to produce your filename:
gawk -v RS="START_of_LINE_to_SEPARATE" 'NF{ file = sprintf("new_file_%04d.txt", ++n)
print RS$0 > file }' "$i"
The format specifier %04d means that the number is printed as a decimal integer, padded to a minimum width of 4 with leading zeros.
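To see the padding in isolation, here is a minimal sketch using bash's own printf, which accepts the same %04d specifier. The sample numbers and the variable name padded are made up for illustration; note that the width is a minimum, so a five-digit number is not truncated.

```shell
# Hypothetical sample values, only to show how %04d pads a number.
for n in 7 42 12345; do
    # printf -v stores the formatted result in a variable instead of printing it
    printf -v padded '%04d' "$n"
    echo "new_file_${padded}.txt"
done
# prints:
# new_file_0007.txt
# new_file_0042.txt
# new_file_12345.txt
```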
If you want to go through all the .txt files and keep incrementing the counter, then you can get rid of the loop and pass them all to awk at once by changing "$i" to *.txt.
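A loop-free version might look like the sketch below. The sample files, the short START marker, and the out/ directory are assumptions made up for the demonstration. Writing into a separate directory keeps the generated files from matching *.txt on a second run, and close(file) is an optional addition that avoids holding many files open at once. The sketch prefers gawk but falls back to the system awk, on the assumption that it also accepts a multi-character RS.

```shell
# Sketch: split every .txt file in a single awk run so the counter keeps increasing.
# The sample data, the "START" marker and the out/ directory are hypothetical.
workdir=$(mktemp -d)
cd "$workdir"

printf 'START a\nSTART b\n' > one.txt
printf 'START c\n' > two.txt
mkdir out

# Prefer gawk as in the answer; fall back to the system awk (assumed to support a multi-character RS).
awk_bin=$(command -v gawk || command -v awk)

"$awk_bin" -v RS="START" 'NF {
    file = sprintf("out/new_file_%04d.txt", ++n)   # n is shared across all input files
    print RS $0 > file
    close(file)   # each record goes to its own file, so it can be closed right away
}' *.txt

ls out
```

With the two sample files above, the three records end up in out/new_file_0001.txt through out/new_file_0003.txt, numbered across both inputs.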