Split every second occurrence of delimiter

I am trying to use awk to split a file every second occurrence of a delimiter, but I always end up with an empty file at the beginning and I can't understand why.

The data I need to break down in multiple files has a format similar to this:

----------
aaa
bbb
----------
ccc
ddd
----------
eee
fff
----------
ggg  

The first resulting file should contain:

----------
aaa
bbb
----------
ccc
ddd

The delimiter is always the same (ten minus signs).
This is what I am trying for now:

awk -v RS='[-]{10}' '{i++} {file = sprintf("temp-%s", int(i/2)); print >> file;}'

However, the first file I get (temp-0) always contains a single empty line and nothing else.
The source file does not start with an empty line, nor does it contain any (they were removed beforehand).

Can anybody please help?

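I wouldn't play with RS for this problem. The empty temp-0 appears because your input starts with the delimiter: whatever precedes the first record separator is the first record, which here is empty, and with i=1, int(i/2) is 0. Instead, you can count the ---------- lines to decide whether the file index has to be incremented. Give this line a try: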

awk '/^--*$/{c++;f+=c%2?1:0}{print > ("temp-" f)}' file

Note that the line above only illustrates how to process the lines and the file index. If your file is huge, you need to close() each output file once you are done with it and use >> if you redirect to it again; otherwise you will get errors like "too many open files".
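
As a rough sketch of the close() advice, the one-liner could be expanded as shown here. The names input.txt and out, and the temp-0 fallback for lines that come before the first delimiter, are assumptions made for this illustration and are not part of the original answer. Because each output file is written in one contiguous run and never reopened, a plain > is enough in this variant; >> would only matter if a closed file had to be written to again.

```shell
# Scratch directory so the sample files do not clutter the current one.
cd "$(mktemp -d)" || exit 1

# Sample input in the same shape as the data in the question.
cat > input.txt <<'EOF'
----------
aaa
bbb
----------
ccc
ddd
----------
eee
fff
----------
ggg
EOF

awk '
BEGIN { out = "temp-0" }      # only used if lines precede the first delimiter
/^--*$/ {
    c++                       # count delimiter lines
    if (c % 2) {              # 1st, 3rd, 5th ... delimiter starts a new file
        close(out)            # release the previous output file
        out = "temp-" (++f)
    }
}
{ print > out }               # the delimiter line itself is written too
' input.txt
```

With the sample input, temp-1 should hold the first two blocks and temp-2 the remaining ones; temp-0 is never created because awk only opens an output file on the first print to it.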
