
Split files by line content

I have a file with the following content:

aaaa
bbbb
cccc
1111
qqqq
1111
aaaa
dddd

I want to split it into multiple small files, using 1111 as the separator. The method I tried is as follows:

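#!/bin/bash
i=0
# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r line
do
        if [[ $line =~ '1111'  ]];then
                ((i++))        # separator line: move on to the next output file
        else
                printf '%s\n' "$line" >> "$i.txt"
        fi
done < data.txt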

This splits the input into several files, as follows:

0.txt
aaaa
bbbb
cccc

1.txt
qqqq

2.txt
aaaa
dddd

But I would like a more concise method. What should I do?

There is a utility built just for this. Try:

csplit -f '' -b'%d.txt' --suppress-matched data.txt /1111/ '{*}'

How it works: 这个怎么运作:

  • -f '' -b'%d.txt'

    These two options control the output file names: -f '' sets an empty prefix, and -b'%d.txt' sets the suffix to the piece number (unpadded) followed by .txt, so the files are named 0.txt, 1.txt, and so on.

  • --suppress-matched

    This tells csplit to omit the divider lines from the output files.

  • data.txt

    This is the file to divide up.

  • /1111/

    This is the regex pattern to use as a divider. It matches any line that contains 1111.

  • {*}

    This tells csplit to repeat the preceding pattern as many times as possible, so the file is split at every divider line it finds.
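
To try the command without touching existing files, here is a minimal sketch that recreates the sample data in a scratch directory. It assumes GNU csplit, since --suppress-matched and {*} are GNU extensions. The scratch directory, the -s flag (which silences the byte counts csplit normally prints) and the anchored pattern /^1111$/ are my additions, not part of the command above; the anchors restrict the divider to lines that are exactly 1111.

```shell
#!/bin/bash
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample input from the question.
printf '%s\n' aaaa bbbb cccc 1111 qqqq 1111 aaaa dddd > data.txt

# Split at every line that is exactly 1111, dropping the divider lines.
csplit -s -f '' -b '%d.txt' --suppress-matched data.txt '/^1111$/' '{*}'

# Show each piece together with its file name.
for f in [0-9]*.txt; do
    echo "== $f"
    cat "$f"
done
```

With the sample data this should leave 0.txt, 1.txt and 2.txt next to data.txt, matching the files listed in the question.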

Does this work for you?

awk 'BEGIN{num=0} /^1111/{num++} !/^1111/{print $0 >> (num".txt")}' wantianye

I named the input file after your username, and it does what you ask with your sample data. The same command with comments:

awk 'BEGIN{num=0}                    # initialise num to 0
/^1111/{num++}                       # if the line begins with 1111, increment num
!/^1111/{print $0 >> (num".txt")}    # otherwise, append the line to the file <num>.txt
' wantianye
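
If the input contains many separators, keeping every output file open can run into the open-file limit of some awk implementations. Here is a minimal sketch of a variant that closes each file before starting the next one. The variable name out, the scratch directory and the stricter /^1111$/ pattern are assumptions of mine for illustration. It also writes with > rather than >>, so re-running it overwrites the pieces instead of appending to them.

```shell
#!/bin/bash
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample input from the question.
printf '%s\n' aaaa bbbb cccc 1111 qqqq 1111 aaaa dddd > data.txt

awk '
BEGIN    { num = 0; out = num ".txt" }                    # first output file is 0.txt
/^1111$/ { close(out); num++; out = num ".txt"; next }    # separator: close current file, start the next
         { print > out }                                  # any other line goes to the current file
' data.txt
```

Because each file is closed as soon as its section ends, at most one output file is open at any time.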
