How can I split one text file into multiple *.txt files?
I have a text file, file.txt (12 MB), containing:
something1
something2
something3
something4
(...)
Is there a way to split file.txt into 12 *.txt files, say file2.txt, file3.txt, file4.txt, etc.?
You can use the GNU coreutils utility split:
split -b 1M -d file.txt file
Note that both M and MB are accepted, but they mean different sizes: MB is 1000 * 1000 bytes, while M is 1024 * 1024 bytes.
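A quick sanity check of that difference (a minimal sketch using a throwaway dummy file; the file name and the 3,000,000-byte size are just for illustration):

```shell
# Work in a scratch directory with a 3,000,000-byte dummy file
cd "$(mktemp -d)"
head -c 3000000 /dev/zero > file.bin

# 1M  = 1024 * 1024 = 1048576 bytes per chunk
split -b 1M  -d file.bin mib_
# 1MB = 1000 * 1000 = 1000000 bytes per chunk
split -b 1MB -d file.bin mb_

# Compare the size of the first chunk from each run
wc -c mib_00 mb_00
```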
If you want to split by lines, you can use the -l parameter.
UPDATE
lines=$(( $(wc -l < file.txt) / 12 )) ; split -l "$lines" -d file.txt file
Another solution, as suggested by Kirill, is the following:
split -n l/12 file.txt
Note that is the letter l, not the digit one. split -n has a few options: N, k/N, l/N, l/k/N, r/N, and r/k/N.
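A small-scale illustration of the l/N form (a sketch; seq and the 24-line file are just placeholders for the real input):

```shell
cd "$(mktemp -d)"
seq 24 > file.txt

# l/12: twelve chunks, never splitting a line in the middle
# (lowercase letter l, not the digit 1)
split -n l/12 file.txt

# Default output names are xaa, xab, ..., xal
ls x* | wc -l
```

Even if a chunk boundary would land mid-line, l/N moves it to a line boundary, so exactly 12 files are produced and every input line stays whole.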
$ split -l 100 input_file output_file
where -l specifies the number of lines in each file. This will create output files of at most 100 lines each.
CS Pei's answer won't produce .txt files as the OP wants. Use:
split -b=1M -d file.txt file --additional-suffix=.txt
# Read all lines into an array, then distribute them across 12 files
# as evenly as possible
readarray -t lines < file.txt
count=${#lines[@]}
for i in "${!lines[@]}"; do
    # Map line i (0-based) to a file index in 1..12
    index=$(( (i * 12 - 1) / count + 1 ))
    echo "${lines[i]}" >> "file${index}.txt"
done
awk '{
    # Buffer every input line
    a[NR] = $0
}
END {
    for (i = 1; i in a; ++i) {
        # Map line i to a file number in 1..12, then drop the decimals
        x = (i * 12 - 1) / NR + 1
        sub(/\..*$/, "", x)
        print a[i] > ("file" x ".txt")
    }
}' file.txt
Unlike split, this one makes sure the line counts are as even as possible.
Regardless of what was said in previous answers, on my Ubuntu 16.04 (Xenial Xerus) I had to do:
split -b 10M -d system.log system_split.log
Please note the space between -b and the value.
Try something like this:
awk -v c=1 '{print > (c ".txt")} NR % 1000000 == 0 {++c}' Datafile.txt
for filename in *.txt; do mv "$filename" "Prefix_$filename"; done;
I agree with @CS Pei, however this didn't work for me:
split -b=1M -d file.txt file
...as the = after -b threw it off. Instead, I simply deleted it, left no space between it and the value, and used lowercase "m":
split -b1m -d file.txt file
And to append ".txt", we use what @schoon said:
split -b=1m -d file.txt file --additional-suffix=.txt
I had a 188.5 MB txt file and I used this command [but with -b5m for 5.2 MB files], and it returned 35 split files, all of which were txt files of 5.2 MB except the last, which was 5.0 MB. Now, since I wanted my lines to stay whole, I wanted to split the main file every 1 million lines, but the split command didn't allow me to do even -100000, let alone -1000000, so splitting on a large number of lines with that syntax will not work.
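For what it's worth, that limitation applies to the legacy "split -1000000" spelling; the -l form accepts large counts on current GNU split. A scaled-down sketch (the 2,500-line file, 1000-line chunks, and part_ prefix are illustrative, and --additional-suffix needs a reasonably recent coreutils):

```shell
cd "$(mktemp -d)"
seq 2500 > big.txt

# -l works with any line count; scaled down here to 1000 lines per chunk
split -l 1000 -d big.txt part_ --additional-suffix=.txt

# Expect part_00.txt and part_01.txt with 1000 lines, part_02.txt with the rest
wc -l part_*.txt
```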
On my Linux system (Red Hat Enterprise 6.9), the split command does not have the command-line options for either -n or --additional-suffix.
Instead, I've used this:
split -d -l NUM_LINES really_big_file.txt split_files.txt.
where -d adds a numeric suffix to the end of split_files.txt. and -l specifies the number of lines per file.
For example, suppose I have a really big file like this:
$ ls -laF
total 1391952
drwxr-xr-x 2 user.name group 40 Sep 14 15:43 ./
drwxr-xr-x 3 user.name group 4096 Sep 14 15:39 ../
-rw-r--r-- 1 user.name group 1425352817 Sep 14 14:01 really_big_file.txt
This file has 100,000 lines, and I want to split it into files with at most 30,000 lines. This command will run the split and append an integer at the end of the output file pattern split_files.txt.
$ split -d -l 30000 really_big_file.txt split_files.txt.
The resulting files are split correctly, with at most 30,000 lines per file.
$ ls -laF
total 2783904
drwxr-xr-x 2 user.name group 156 Sep 14 15:43 ./
drwxr-xr-x 3 user.name group 4096 Sep 14 15:39 ../
-rw-r--r-- 1 user.name group 1425352817 Sep 14 14:01 really_big_file.txt
-rw-r--r-- 1 user.name group 428604626 Sep 14 15:43 split_files.txt.00
-rw-r--r-- 1 user.name group 427152423 Sep 14 15:43 split_files.txt.01
-rw-r--r-- 1 user.name group 427141443 Sep 14 15:43 split_files.txt.02
-rw-r--r-- 1 user.name group 142454325 Sep 14 15:43 split_files.txt.03
$ wc -l *.txt*
100000 really_big_file.txt
30000 split_files.txt.00
30000 split_files.txt.01
30000 split_files.txt.02
10000 split_files.txt.03
200000 total
If each part should have the same number of lines, for example 22, here is my solution:
split --numeric-suffixes=2 --additional-suffix=.txt -l 22 file.txt file
And you obtain file02.txt with the first 22 lines, file03.txt with the next 22 lines, etc. (the numeric suffix is two digits wide by default; add -a 1 to get file2.txt, file3.txt, ..., as long as no more than eight parts are produced).
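A scaled-down check of this approach (a sketch: the 66-line input is illustrative, and -a 1 is an addition that keeps the suffix to one digit so the names match the question):

```shell
cd "$(mktemp -d)"
seq 66 > file.txt

# Start numbering at 2 and keep a one-digit suffix: file2.txt, file3.txt, ...
split --numeric-suffixes=2 -a 1 --additional-suffix=.txt -l 22 file.txt file

wc -l file2.txt file3.txt file4.txt
```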
Thanks to @hamruta-takawale, @dror-s, and @stackoverflowuser2010.
My search for how to do this led me here, so I'm posting this here for others too:
To get all of the contents of the file, split is the right answer! But for those looking to extract just a piece of a file, as a sample of it, use head or tail:
# extract just the **first** 100000 lines of /var/log/syslog into
# ~/syslog_sample.txt
head -n 100000 /var/log/syslog > ~/syslog_sample.txt
# extract just the **last** 100000 lines of /var/log/syslog into
# ~/syslog_sample.txt
tail -n 100000 /var/log/syslog > ~/syslog_sample.txt