Combining multiple groups of text files with bash
I have a folder of text files that are labelled something like this:
0filename1
1filename1
2filename1
....
0filename2
1filename2
2filename2
....
and so on. What I want to do is take all the files that end in filename1 and combine them into a single file named filename1, and similarly for filename2 and every other group. Normally I would do something like this:
cat [0123456789]*filename1 > filename1
and just repeat the command for every different file name I have. However, I want to automate this. The exact form of the file names changes regularly, so it's not as simple as writing a script that runs the above command for filename1, filename2, etc. The length of the file names does stay constant though, so I suspect the right way to automate this is a script that takes every file sharing the same last n characters of its name and concatenates them into a file named after those n characters. I'm not sure how to do this though. Any suggestions?
Sounds pretty simple: you just need to filter the file names to get the 'base' strings.
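n=9  # number of trailing characters shared by each group, e.g. 9 for "filename1"
for base in $( ls -d [0-9]* | rev | cut -c 1-"$n" | rev | sort -u ); do
    cat [0-9]*"$base" > "$base"
done
where n is the number of trailing characters you intend to keep, i.e. <consistent length of filenames> minus <number of leading characters that vary>. The names are reversed with rev so that cut can select the last n characters rather than the first n. Note that the loop variable is written base, not $base, in the for statement.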
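As a hedged alternative that avoids parsing ls output, the same grouping by the last n characters can be sketched with Bash substring expansion. The function name combine_by_suffix and its two arguments are illustrative assumptions, not part of the answer above. The sketch assumes Bash 4 or later (for associative arrays), that every input file name starts with a digit, and that no output name starts with a digit.

```shell
#!/usr/bin/env bash
# Sketch: concatenate files that share the same last n characters.
# combine_by_suffix is a hypothetical helper; usage: combine_by_suffix DIR N
combine_by_suffix() {
    local dir=$1 n=$2
    (
        # Work in a subshell so the caller's directory and options are untouched.
        cd "$dir" || exit 1
        shopt -s nullglob
        declare -A seen=()
        for f in [0-9]*; do
            # Skip names too short to hold a prefix plus n characters.
            (( ${#f} > n )) || continue
            base=${f: -$n}              # last n characters of the name
            # Handle each distinct suffix only once.
            [[ ${seen[$base]+x} ]] && continue
            seen[$base]=1
            # The glob expands in the shell's sorted (lexical) order.
            cat -- [0-9]*"$base" > "$base"
        done
    )
}
```

For the layout in the question this would be called as combine_by_suffix . 9. Because the glob sorts lexically, 10filename1 would be concatenated before 1filename1; if numeric order matters, the parts need an explicit numeric sort.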
A slightly more complex solution that handles file names containing whitespace, multi-digit numbers, and variable file name lengths:
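#!/usr/bin/env bash
# extglob enables the +([0-9]) pattern; nullglob makes unmatched globs expand to nothing.
shopt -s extglob nullglob
# Every file whose name starts with one or more digits followed by at least one more character.
files=(+([0-9])?*)
# Nothing to combine: exit with an error status.
(( ${#files[@]} )) || exit 1
# Loop over the unique names left after stripping the numeric prefix (NUL-delimited).
while IFS= read -rd '' filename; do
    # Concatenate the parts of this group in numeric order into the prefix-free name.
    printf '%s\0' +([0-9])"$filename" | sort -zn | xargs -0 cat > "$filename"
done < <(printf '%s\0' "${files[@]##+([0-9])}" | sort -zu)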
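The prefix stripping that the script above relies on can be tried in isolation. The following is a minimal sketch; strip_numeric_prefix is a hypothetical helper name introduced only for illustration, and the sample names are made up.

```shell
#!/usr/bin/env bash
shopt -s extglob   # required for the +([0-9]) pattern

# Print the argument with its longest leading run of digits removed.
strip_numeric_prefix() {
    printf '%s\n' "${1##+([0-9])}"
}

for name in 0filename1 10filename1 "2my file"; do
    printf '%s -> %s\n' "$name" "$(strip_numeric_prefix "$name")"
done
# Prints:
# 0filename1 -> filename1
# 10filename1 -> filename1
# 2my file -> my file
```

The ## operator removes the longest matching prefix, which is why multi-digit numbers such as 10 are stripped completely.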
#!/bin/bash
str="filename"
for i in {1..2}
do
cat {?,??}"${str}${i}" > "${str}${i}"
done
The script uses Bash brace expansion {..} and the single-character wildcard ? to expand the available file names. If you have 0filename1 to 9filename1, a single ? is enough; use ?? for 10filename1 through 99filename1.
Example:
$ cat 0filename1
011
$ cat 1filename1
111
$ cat 2filename1
211
$ cat 10filename1
1011
$ cat 0filename2
022
$ cat 1filename2
122
$ cat 2filename2
222
$ cat 10filename2
1022
Output of the above script would be:
$ cat filename1
011
111
211
1011
$ cat filename2
022
122
222
1022
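To check this behaviour without touching real data, the example can be recreated in a scratch directory. This is only a sketch: the workdir variable and the use of mktemp are assumptions added for the demonstration, and the file contents are generated to match the example above.

```shell
#!/usr/bin/env bash
# Recreate the sample files in a scratch directory and run the loop on them.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

for i in 1 2; do
    for prefix in 0 1 2 10; do
        # Contents follow the example: the prefix, then the group digit twice.
        printf '%s%s%s\n' "$prefix" "$i" "$i" > "${prefix}filename${i}"
    done
done

str="filename"
for i in {1..2}; do
    # ? matches the one-digit prefixes, ?? the two-digit ones.
    cat {?,??}"${str}${i}" > "${str}${i}"
done

cat filename1
```

Because the ? pattern is expanded before the ?? pattern, the single-digit parts come first and 10filename1 comes last, which gives the numeric order shown in the output above.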