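Arrange columns using awk or sed?
I have a file with about 500 rows and 480K columns, and I need to move columns 2, 3, and 4 to the end. The file is comma-separated. Is there a faster way to rearrange it using awk or sed?
You could try the following solution: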
perl -F"," -lane 'BEGIN { $" = "," } print "$F[0],@F[4..$#F],@F[1..3]"' input.file
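Copying the columns is easy; with 480K columns, actually moving them would take too long. The following appends copies of columns 2, 3, and 4 to the end of each line and leaves the originals in place: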
$ awk 'BEGIN{FS=OFS=","} {print $0,$2,$3,$4}' input.file > output.file
What kind of data format is this?
Testing with 5 fields:
$ cat foo
1,2,3,4,5
a,b,c,d,e
$ cat program.awk
{
$6=$2 OFS $3 OFS $4 OFS $1 # copy fields to the end and $1 too
sub(/^([^,]*,){4}/,"") # remove the first 4 columns (fields of any length)
$1=$5 OFS $1 # concatenate current $5 (was $1) to $1
NF=4 # reduce NF
} 1 # print
Run:
$ awk -f program.awk FS=, OFS=, foo
1,5,2,3,4
a,e,b,c,d
So in theory this should work:
{
$480001=$2 OFS $3 OFS $4 OFS $1
sub(/^([^,]*,){4}/,"")
$1=$480000 OFS $1
NF=479999
} 1
Edit: it does work.
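The program above hard-codes the field count (480001, 480000, 479999). A minimal sketch of the same steps with the count taken from NF on each line follows; the function name move_cols is hypothetical, every line is assumed to have at least four fields, and the interval expression is written out in full on the assumption that some awk implementations do not support {4}.

```shell
# Hypothetical helper: move columns 2-4 to the end of each comma-separated line.
# Same steps as program.awk, but the field count is read from NF per line.
move_cols() {
  awk 'BEGIN { FS = OFS = "," }
  {
    n = NF                                  # original field count
    $(n + 1) = $2 OFS $3 OFS $4 OFS $1      # append copies of columns 2-4 and column 1
    sub(/^[^,]*,[^,]*,[^,]*,[^,]*,/, "")    # drop the first 4 columns
    $1 = $n OFS $1                          # put the old column 1 back in front
    NF = n - 1                              # drop the leftover copy of the old column 1
  } 1' "$@"
}

# Small demo on the same data as foo
printf '1,2,3,4,5\na,b,c,d,e\n' | move_cols
```

On the five-field sample this prints the same two lines as the run above. With a file argument it would be called as move_cols input.file > output.file.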
Perhaps Perl:
perl -F, -lane 'print join(",", @F[0,4..$#F,1,2,3])' file
Or:
perl -F, -lane '@x = splice @F, 1, 3; print join(",", @F, @x)' file
Another approach: a regular expression
perl -lpe 's/^([^,]+)(,[^,]+,[^,]+,[^,]+)(.*)/$1$3$2/' file
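The question also asked about sed. A minimal sketch of the same substitution in sed follows, assuming a sed that accepts -E for extended regular expressions (GNU and BSD sed do). The function name move_cols_sed is hypothetical, and [^,]* is used instead of [^,]+ so that empty fields are matched as well. It has not been timed.

```shell
# Hypothetical helper: the same capture-and-reorder substitution, written for sed.
# \1 = column 1, \2 = columns 2-4 (with leading commas), \3 = the rest of the line.
move_cols_sed() {
  sed -E 's/^([^,]*)(,[^,]*,[^,]*,[^,]*)(.*)$/\1\3\2/' "$@"
}

# Small demo on inline data
printf '1,2,3,4,5\na,b,c,d,e\n' | move_cols_sed
```

Lines with fewer than four fields do not match the pattern and pass through unchanged.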
Timing each one with a 500-line file in which every line contains 480,000 fields:
$ time perl -F, -lane 'print join(",", @F[0,4..$#F,1,2,3])' file.csv > file2.csv
40.13user 1.11system 0:43.92elapsed 93%CPU (0avgtext+0avgdata 67960maxresident)k
0inputs+3172752outputs (0major+16088minor)pagefaults 0swaps
$ time perl -F, -lane '@x = splice @F, 1, 3; print join(",", @F, @x)' file.csv > file2.csv
34.82user 1.18system 0:38.47elapsed 93%CPU (0avgtext+0avgdata 52900maxresident)k
0inputs+3172752outputs (0major+12301minor)pagefaults 0swaps
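Plain text manipulation is the winner (about 4.6 s of user CPU time, versus 35 to 40 s for the field-splitting versions):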
$ time perl -lpe 's/^([^,]+)(,[^,]+,[^,]+,[^,]+)(.*)/$1$3$2/' file.csv > file2.csv
4.63user 1.36system 0:20.81elapsed 28%CPU (0avgtext+0avgdata 20612maxresident)k
0inputs+3172752outputs (0major+149866minor)pagefaults 0swaps
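To try timings like these on your own machine, a file of the same shape can be generated. The sketch below is only an assumption about the data: the function name make_csv is hypothetical, and the field contents (small integers) are made up, so absolute times will differ from the ones shown.

```shell
# Hypothetical helper: write ROWS lines of COLS comma-separated integers to stdout.
# Usage: make_csv ROWS COLS
make_csv() {
  awk -v rows="$1" -v cols="$2" 'BEGIN {
    for (r = 1; r <= rows; r++) {
      for (c = 1; c < cols; c++) printf "%d,", r * c   # all fields but the last
      printf "%d\n", r * cols                           # last field ends the line
    }
  }'
}

# Small demo: 2 rows of 5 fields
make_csv 2 5
# For a file shaped like the timed one: make_csv 500 480000 > file.csv
```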
Another technique, using only bash:
while IFS=, read -r a b c d e; do
printf '%s\n' "$a,$e,$b,$c,$d"
done < file