How to process a file with fixed-width columns in Linux

Question

I would like to process the file below:

01234000000000000000000+000000000000000000+
02586000000000000000000+000000000000000000-
12345000000000000000000+000000000000000000-
12122000000000000000000+000000000000000000+

I want to convert the file above to:

01234,000000000000000000+,000000000000000000+
02586,000000000000000000+,000000000000000000-
12345,000000000000000000+,000000000000000000-
12122,000000000000000000+,000000000000000000+

The input file has fixed-width columns of 5, 19, and 19 characters respectively.

I would like to solve this using a Linux command.

I tried the command below, but it does not work :(

awk 'BEGIN{FIELDWIDTHS="5 19 19";OFS=",";}{$1="$1,$2,$3"}' data.txt

Running the command above on Ubuntu 14.04 LTS desktop produced no output (blank).

Answer 1

Your attempt was quite close, but you forgot to {print}:

awk 'BEGIN{FIELDWIDTHS="5 19 19";OFS=","}{$1=$1}1' file

{$1=$1} assigns the first field to itself, which is enough to make awk "touch" each record, so the record is rebuilt with the fields joined by OFS. I've used the shorthand 1, which is the shortest true condition; since no action follows it, the default action {print} runs.
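To see the effect of {$1=$1} in isolation, here is a small sketch that uses ordinary whitespace-separated input instead of FIELDWIDTHS, so it should behave the same in any awk. The sample line and the variable names unchanged and rebuilt are made up for illustration:

```shell
# Without touching a field, awk prints the record as read and OFS is not applied.
unchanged=$(printf 'a b c\n' | awk 'BEGIN{OFS=","} 1')

# Assigning $1 to itself forces awk to rebuild $0, joining the fields with OFS.
rebuilt=$(printf 'a b c\n' | awk 'BEGIN{OFS=","} {$1=$1} 1')

printf '%s\n%s\n' "$unchanged" "$rebuilt"
```

The first result keeps the spaces and the second has commas, which is why the self-assignment is needed before printing.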

Note that FIELDWIDTHS is a GNU awk extension, so if you're using a different version, you will have to use a different approach. For example:

awk 'BEGIN{OFS=","}{print substr($0,1,5),substr($0,6,19),substr($0,25)}' file
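The substr approach can be generalized to any list of widths. This is only a sketch, assuming a POSIX awk; the function name fixed_to_csv and its widths argument are hypothetical names, not something from the answer above:

```shell
# Hypothetical helper: split fixed-width lines from stdin into comma-separated fields.
# Usage: fixed_to_csv "5 19 19" < file
fixed_to_csv() {
    awk -v widths="$1" '
    BEGIN { n = split(widths, w, " ") }   # w[1..n] holds the column widths
    {
        pos = 1; out = ""
        for (i = 1; i <= n; i++) {
            # Append the next column, preceded by a comma except for the first one.
            out = out (i > 1 ? "," : "") substr($0, pos, w[i])
            pos += w[i]
        }
        print out
    }'
}

# Demo on the first line of the question's input.
printf '%s\n' '01234000000000000000000+000000000000000000+' | fixed_to_csv "5 19 19"
```

Unlike the one-liner, this version cuts the last column to its stated width instead of taking the rest of the line.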

Answer 2

$ sed -r 's/(.{5})(.{19})/\1,\2,/' file
01234,000000000000000000+,000000000000000000+
02586,000000000000000000+,000000000000000000-
12345,000000000000000000+,000000000000000000-
12122,000000000000000000+,000000000000000000+

Answer 3

That would be very easy:

sed -n 's/\(.\{5\}\)\(.\{19\}\)\(.\{19\}\)/\1,\2,\3/p' your_file

It captures each line in groups of 5, 19, and 19 characters, then prints the groups with a comma between them.

$ echo 01234000000000000000000+000000000000000000+ | sed -n 's/\(.\{5\}\)\(.\{19\}\)\(.\{19\}\)/\1,\2,\3/p'
01234,000000000000000000+,000000000000000000+
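One thing to keep in mind: because of -n and the p flag, only lines where the substitution succeeds are printed, so a line that is too short to match would be dropped rather than passed through. A small sketch of that difference, using made-up widths of 2 and 3 and sample strings that are not from the question:

```shell
# Two input lines: one long enough to match, one too short.
input=$(printf '%s\n' 'abcdefgh' 'abc')

# Without -n: every line is printed, whether or not it was changed.
all_lines=$(printf '%s\n' "$input" | sed 's/\(.\{2\}\)\(.\{3\}\)/\1,\2,/')

# With -n and the p flag: only lines where the substitution succeeded are printed.
matched_only=$(printf '%s\n' "$input" | sed -n 's/\(.\{2\}\)\(.\{3\}\)/\1,\2,/p')

printf 'without -n:\n%s\n' "$all_lines"
printf 'with -n and p:\n%s\n' "$matched_only"
```

Whether dropping malformed lines is what you want depends on the data; without -n they pass through unchanged instead.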

Answer 4

Perl to the rescue:

perl -pe 'for $p (5, 25) { substr $_, $p, 0, "," }' data.txt
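To spell out the offsets: the four-argument substr with a length of 0 inserts a string at a position without replacing anything. The first comma goes at offset 5; the second goes at 25 rather than 24 because the first comma has already shifted the rest of the line by one character. A minimal sketch with shorter, made-up widths (2 and 3), wrapped in a hypothetical function named insert_commas_demo:

```shell
# Hypothetical demo: widths 2 and 3, so the commas go at offsets 2 and 2+3+1 = 6.
insert_commas_demo() {
    perl -pe 'for $p (2, 6) { substr $_, $p, 0, "," }'
}
```

Feeding it the line abcdef, for example with printf 'abcdef\n' | insert_commas_demo, gives ab,cde,f.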

Answer 5

This is a suitable task for cut as well:

$ cut --output-delimiter=',' -c1-5,6-24,25- data.txt
01234,000000000000000000+,000000000000000000+
02586,000000000000000000+,000000000000000000-
12345,000000000000000000+,000000000000000000-
12122,000000000000000000+,000000000000000000+

--output-delimiter=',' specifies the output field separator
-c selects the specified character(s)
1-5 first field
6-24 second field
25- rest of the line
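The ranges follow from the column widths by a running sum. As a sketch, a small POSIX shell function can build the -c list from the widths; the name widths_to_ranges is hypothetical and not part of cut:

```shell
# Hypothetical helper: turn a list of column widths into a cut -c range list.
# The last range is left open-ended, as in the answer above.
widths_to_ranges() {
    start=1
    ranges=""
    while [ "$#" -gt 1 ]; do
        end=$((start + $1 - 1))          # last character of this column
        ranges="${ranges}${start}-${end},"
        start=$((end + 1))               # first character of the next column
        shift
    done
    printf '%s\n' "${ranges}${start}-"
}

widths_to_ranges 5 19 19
```

For the widths 5, 19, 19 this prints 1-5,6-24,25-, which can then be passed to the command above as -c"$(widths_to_ranges 5 19 19)".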

Answer 6

awk '{sub(/^...../,"&,"); sub(/[+]/,"+,")}1' file

01234,000000000000000000+,000000000000000000+
02586,000000000000000000+,000000000000000000-
12345,000000000000000000+,000000000000000000-
12122,000000000000000000+,000000000000000000+

Answer 1: score 3, accepted, 2015-03-18 17:01:29
Answer 2: score 3, 2015-03-18 17:17:25
Answer 3: score 2, 2015-03-18 16:59:13
Answer 4: score 2, 2015-03-18 17:05:11
Answer 5: score 1, 2017-10-08 07:47:22
Answer 6: score 0, 2017-05-05 18:25:33
