
Split a file into two but preserve lines

I am struggling with splitting a text file into two.

For example, this splits the file into two, but if the number of lines is odd it creates a third file:

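import subprocess

count_lines = 0
with open(infile) as f:  # infile is the path to the input file
    for line in f:
        count_lines += 1
lines_per_file = int(count_lines / 2)  # rounds down, so an odd count leaves one line over
subprocess.call(['split', '-l', str(lines_per_file), '--numeric-suffixes', infile, chunk_destination])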

While this splits the file into two, it can cut a line in the middle:

subprocess.call(['split', '-n', '1/2', '--numeric-suffixes', infile, chunk_destination])

Is there a relatively simple way, in Python or Bash, to split a file into two without breaking lines, so that when the number of lines is odd the extra line goes into one of the two files instead of a third?

Actually, newer versions of split have an option to preserve lines:

CHUNKS may be:
  N       split into N files based on size of input
  K/N     output Kth of N to stdout
  l/N     split into N files without splitting lines/records
  l/K/N   output Kth of N to stdout without splitting lines/records
  r/N     like 'l' but use round robin distribution
  r/K/N   likewise but only output Kth of N to stdout

So this works:

subprocess.call(['split', '-n', 'l/2', '--numeric-suffixes', infile, chunk_destination])
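
Note that the l/N mode balances the chunks by size in bytes, so the two files may not contain the same number of lines. If an even split by line count matters, or split is not available, a pure-Python version is one option. The following is only a minimal sketch: the function name split_in_two is made up for illustration, the 00/01 output names are chosen to mimic what --numeric-suffixes produces, and it assumes the file fits in memory.

```python
from pathlib import Path


def split_in_two(infile, chunk_destination):
    """Split infile by line count into <chunk_destination>00 and <chunk_destination>01.

    Hypothetical helper, not part of any library. Reads the whole file into memory.
    """
    with open(infile, "rb") as f:
        lines = f.readlines()  # binary mode keeps the original line endings

    # Round up so that an odd line count puts the extra line into the first file.
    first_half = (len(lines) + 1) // 2

    out_paths = [Path(f"{chunk_destination}00"), Path(f"{chunk_destination}01")]
    out_paths[0].write_bytes(b"".join(lines[:first_half]))
    out_paths[1].write_bytes(b"".join(lines[first_half:]))
    return out_paths
```

Calling split_in_two(infile, chunk_destination) on a five-line file should leave three lines in the first output file and two in the second; lines are never broken because the split point is chosen on line boundaries.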
