Split files by line content
I have a file with the following content:
aaaa
bbbb
cccc
1111
qqqq
1111
aaaa
dddd
I want to split it into multiple small files, using 1111 as the separator. The method I tried is as follows:
#!/bin/bash
# Write each line to N.txt, where N counts the separator lines seen so far.
i=0
while IFS= read -r line
do
    if [[ $line =~ '1111' ]]; then
        ((i++))
    else
        echo "$line" >> "$i.txt"
    fi
done < data.txt
This splits the data into several files, as follows:
0.txt
aaaa
bbbb
cccc
1.txt
qqqq
2.txt
aaaa
dddd
But I would like a more concise method. What should I do?
There is a utility built just for this. Try:
csplit -f '' -b'%d.txt' --suppress-matched data.txt /1111/ '{*}'
How it works:
-f '' -b'%d.txt'
These two options tell csplit to name the output files with a bare number followed by .txt: -f '' sets an empty prefix, and -b'%d.txt' sets the suffix format.
--suppress-matched
This tells csplit to omit the divider lines from the output.
data.txt
This is the file to divide up.
/1111/
This is the regex pattern to use as a divider.
{*}
This tells csplit to repeat the split as many times as it finds a divider line.
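To try the command end to end, a sketch along the following lines may help. It assumes GNU csplit (--suppress-matched and the '{*}' repeat count are GNU extensions). The scratch directory and the -s flag, which silences the byte counts csplit normally prints, are additions for illustration and not part of the answer above.

```shell
#!/bin/bash
# Sketch: reproduce the csplit answer on the sample data from the question.
# Assumes GNU csplit; the scratch directory and -s are illustrative additions.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Recreate the sample input, one value per line.
printf '%s\n' aaaa bbbb cccc 1111 qqqq 1111 aaaa dddd > data.txt

# Split at every line matching 1111 and drop the matching lines themselves.
csplit -s -f '' -b '%d.txt' --suppress-matched data.txt /1111/ '{*}'

# Show each piece under its file name.
for f in 0.txt 1.txt 2.txt; do
    echo "== $f"
    cat "$f"
done
```

If the input could begin with a separator line or contain two in a row, csplit would create empty pieces; GNU csplit's -z (--elide-empty-files) option removes them.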
Does this work for you?
awk 'BEGIN{num=0} /^1111/{num++} !/^1111/{print $0 >> (num".txt")}' wantianye
I named the input file after your username, and it does what you ask with your sample data. The same command, annotated:
awk 'BEGIN{num=0}                      # initialise num to 0
/^1111/{num++}                         # if the line begins with 1111, increment num
!/^1111/{print $0 >> (num".txt")}      # if the line does NOT begin with 1111, append it to <num>.txt
' wantianye
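The same idea can be written a little more compactly with next, as in the sketch below. It is a variation, not the command from the answer: the close() call is an addition intended to keep the number of open files bounded when the input has many sections, and data.txt is the file name from the question.

```shell
#!/bin/bash
# Sketch: awk variant using next and close(); run in a scratch directory.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Recreate the sample input, one value per line.
printf '%s\n' aaaa bbbb cccc 1111 qqqq 1111 aaaa dddd > data.txt

awk '
    BEGIN { num = 0 }
    # Separator line: close the current piece and move on to the next number.
    /^1111/ { close(num ".txt"); num++; next }
    # Any other line is written to the current piece.
    { print > (num ".txt") }
' data.txt

# Show each piece under its file name.
for f in 0.txt 1.txt 2.txt; do
    echo "== $f"
    cat "$f"
done
```

Unlike >>, the > redirection truncates each output file the first time it is opened, so rerunning the script on the same directory does not append to pieces left over from an earlier run.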