Arrange columns using awk or sed?

Question

I have a file with around 500 rows and 480K columns, and I need to move columns 2, 3 and 4 to the end. The file is comma-separated. Is there a quick way to rearrange it using awk or sed?

Answer 1

You can try the following solution:

perl -F"," -lane 'BEGIN { $" = "," } print "$F[0],@F[4..$#F],@F[1..3]"' input.file

Answer 2

You can copy the columns easily; moving them will take too long for 480K columns.

$ awk 'BEGIN{FS=OFS=","} {print $0,$2,$3,$4}' input.file > output.file

What kind of data format is this?

Answer 3

Testing with 5 fields:

$ cat foo
1,2,3,4,5
a,b,c,d,e
$ cat program.awk
{
    $6=$2 OFS $3 OFS $4 OFS $1  # append fields 2-4, then field 1, to the end
    sub(/^([^,]*,){4}/,"")      # remove the first 4 columns (fields of any length)
    $1=$5 OFS $1                # prepend current $5 (was $1) to $1
    NF=4                        # reduce NF to drop the duplicated last field
} 1                             # print

Run it:

$ awk -f program.awk FS=, OFS=, foo
1,5,2,3,4
a,e,b,c,d

So theoretically this should work:

{
    $480001=$2 OFS $3 OFS $4 OFS $1
    sub(/^([^,]*,){4}/,"")
    $1=$480000 OFS $1
    NF=479999
} 1

EDIT: It did work.
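
The hardcoded field numbers above can be avoided by treating each record as a plain string and locating the first and fourth commas with index(). The following is only a sketch under assumptions: the function name move234 is hypothetical, and the input is assumed to be plain comma-separated text with no quoted fields or embedded commas. It has not been timed against the 480K-column file.

```shell
# Hypothetical helper: move fields 2-4 of each comma-separated line to the end.
# Assumes no quoted fields and no embedded commas.
move234() {
    awk '{
        line = $0
        p1 = index(line, ",")            # position of the 1st comma
        pos = p1
        ok = (p1 > 0)
        # Advance pos to the 4th comma, one comma per iteration.
        for (i = 1; ok && i <= 3; i++) {
            n = index(substr(line, pos + 1), ",")
            if (n == 0) ok = 0; else pos += n
        }
        if (!ok) { print line; next }    # fewer than 5 fields: print unchanged
        # Output: field 1, then fields 5..end, then fields 2-4.
        print substr(line, 1, p1) substr(line, pos + 1) "," substr(line, p1 + 1, pos - p1 - 1)
    }' "$@"
}
```

Usage would be `move234 input.file > output.file`. Because FS is left at its default, awk is not asked to split each line on commas, and the script never assigns to a field, so the record is not rebuilt.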

Answer 4

Perhaps perl:

perl -F, -lane 'print join(",", @F[0,4..$#F,1,2,3])' file

or

perl -F, -lane '@x = splice @F, 1, 3; print join(",", @F, @x)' file

Another approach: regular expressions

perl -lpe 's/^([^,]+)(,[^,]+,[^,]+,[^,]+)(.*)/$1$3$2/' file

Timing it with a 500-line file, each line containing 480,000 fields:

$ time perl -F, -lane 'print join(",", @F[0,4..$#F,1,2,3])' file.csv > file2.csv
40.13user 1.11system 0:43.92elapsed 93%CPU (0avgtext+0avgdata 67960maxresident)k
0inputs+3172752outputs (0major+16088minor)pagefaults 0swaps

$ time perl -F, -lane '@x = splice @F, 1, 3; print join(",", @F, @x)' file.csv > file2.csv
34.82user 1.18system 0:38.47elapsed 93%CPU (0avgtext+0avgdata 52900maxresident)k
0inputs+3172752outputs (0major+12301minor)pagefaults 0swaps

And pure text manipulation is the winner:

$ time perl -lpe 's/^([^,]+)(,[^,]+,[^,]+,[^,]+)(.*)/$1$3$2/' file.csv > file2.csv
4.63user 1.36system 0:20.81elapsed 28%CPU (0avgtext+0avgdata 20612maxresident)k
0inputs+3172752outputs (0major+149866minor)pagefaults 0swaps
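
Since the question asked about sed, the perl substitution above can be sketched in sed as well. This assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed do) and plain comma-separated input without quoted fields. The function name move234_sed is hypothetical, and this variant was not part of the timings above, so how it compares on 480,000-field lines is untested.

```shell
# Hypothetical helper: the same text-only rearrangement, written for sed.
# \1 = field 1, \2 = ",f2,f3,f4", \4 = "," plus everything from field 5 on.
# Lines with fewer than 5 fields do not match and pass through unchanged.
move234_sed() {
    sed -E 's/^([^,]*)((,[^,]*){3})(,.*)$/\1\4\2/' "$@"
}
```

Usage would be `move234_sed file.csv > file2.csv`. Unlike the perl pattern, `[^,]*` also accepts empty fields.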

Answer 5

Another technique, just bash:

while IFS=, read -r a b c d e; do
    echo "$a,$e,$b,$c,$d"
done < file


5 answers

Answer 1
2 accepted 2016-10-18 21:09:28

Answer 2
1 2016-10-18 19:06:47

Answer 3
1 2016-10-18 20:00:20

Answer 4
1 2016-10-18 20:49:46

Answer 5
1 2016-10-18 20:58:07
