Cat several thousand files

I have about 60,000 files in a folder that need to be combined into 3 separate files. How would I cat them so that each output file contains the contents of ~20,000 of these files?

I know it would be something like a loop:

for i in {1..20000}
do
cat file-$i > new_file_part_1
done

Doing:

cat file-$i > new_file_part_1

will truncate new_file_part_1 every time the loop iterates, so only the last file's contents survive. You want to append to the file instead:

cat file-$i >> new_file_part_1

The other answers close and reopen the output file on every iteration. I would prefer:

for i in {1..20000}
do
    cat file-$i
done > new_file_part_1

so the output of all the cat runs is redirected into one file that is opened only once.
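To cover all three output files with the same idea, the loop can be wrapped in a small function and called once per range. This is only a sketch: the name concat_range is made up for illustration, and it assumes the inputs are named file-1, file-2, ... in the current directory.

```shell
#!/bin/bash
# concat_range FIRST LAST OUT   (hypothetical helper name)
# Concatenates file-FIRST .. file-LAST into OUT.
# The redirection sits on the loop, so OUT is opened only once.
concat_range() {
    local first=$1 last=$2 out=$3
    local i=$first
    while [ "$i" -le "$last" ]; do
        cat "file-$i"
        i=$((i + 1))
    done > "$out"
}

# Usage for the 60,000 files in the question:
#   concat_range 1     20000 new_file_part_1
#   concat_range 20001 40000 new_file_part_2
#   concat_range 40001 60000 new_file_part_3
```

This still starts one cat process per input file; only the repeated opening of the output file is avoided.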

Assuming it doesn't matter which input file goes to which output file:

for i in {1..60000}
do
  cat "file$i" >> "out$((i % 3))"
done

This script uses the modulo operator % to divide the input into 3 bins; it will generate 3 output files:

  • out0 contains file3, file6, file9, ...
  • out1 contains file1, file4, file7, ...
  • out2 contains file2, file5, file8, ...
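
If contiguous blocks are wanted instead of interleaved ones, integer division can replace the modulo. The following is a sketch under the same naming assumption (file1, file2, ...); bin_contiguous is a hypothetical helper name. With 20000 files per bin, files 1 to 20000 land in out0, 20001 to 40000 in out1, and 40001 to 60000 in out2.

```shell
#!/bin/bash
# bin_contiguous TOTAL PER_BIN   (hypothetical helper name)
# Appends file1 .. fileTOTAL to out0, out1, ... in blocks of PER_BIN files.
# Because it appends with >>, remove any old out* files before re-running.
bin_contiguous() {
    local total=$1 per_bin=$2
    local i=1
    while [ "$i" -le "$total" ]; do
        # (i - 1) / per_bin is integer division: 0 for the first block,
        # 1 for the second block, and so on.
        cat "file$i" >> "out$(( (i - 1) / per_bin ))"
        i=$((i + 1))
    done
}

# Usage for the question's numbers:
#   bin_contiguous 60000 20000
```
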
#!/bin/bash

cat file-{1..20000} > new_file_part_1

This launches cat only once and opens and closes the output file only once. No loop is required, since cat can accept all 20000 arguments.
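
Extending this to all three parts from the question might look like the sketch below. It assumes the files are named file-1 through file-60000, that Bash is the shell (brace expansion is not available in plain sh), and that 20000 file names fit on one command line; the function name make_three_parts is made up for illustration.

```shell
#!/bin/bash
# make_three_parts is a hypothetical helper name.
# Each line starts cat once and opens its output file once.
make_three_parts() {
    cat file-{1..20000}     > new_file_part_1
    cat file-{20001..40000} > new_file_part_2
    cat file-{40001..60000} > new_file_part_3
}

# Usage: cd into the folder that holds the files, then run
#   make_three_parts
```
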

An astute observer noted that on some systems, the 20000 arguments may exceed the system's ARG_MAX limit. In such a case, xargs can be used, with the penalty that cat will be launched more than once (but still far fewer than 20000 times).

echo file-{1..20000} | xargs cat > new_file_part_1

This works because, in Bash, echo is a shell built-in and as such is not subject to ARG_MAX.
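
A variation on the same idea, sketched here as an assumption rather than as part of the original answer: printf is also a Bash built-in, and emitting NUL-terminated names for xargs -0 keeps the pipeline safe if file names ever contain spaces. Note that -0 is an extension found in GNU and BSD xargs, and concat_with_xargs is a hypothetical helper name.

```shell
#!/bin/bash
# concat_with_xargs FIRST LAST OUT   (hypothetical helper name)
# Streams the names file-FIRST .. file-LAST to xargs, which batches them
# into as few cat invocations as the system's limits allow.
concat_with_xargs() {
    local first=$1 last=$2 out=$3
    local i=$first
    while [ "$i" -le "$last" ]; do
        # printf is a shell built-in, so no process is started per name.
        printf '%s\0' "file-$i"
        i=$((i + 1))
    done | xargs -0 cat > "$out"
}

# Usage:
#   concat_with_xargs 1 20000 new_file_part_1
# The system's limit can be inspected with: getconf ARG_MAX
```

xargs runs its cat batches one after another in input order, so the contents end up in the output file in the same order as the names.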
