awk vs nawk vs mawk processing heavy files

Question

I'm dealing with a few really large files which make my MacBook Pro throttle. I was thinking about using a faster implementation of awk. I have heard mawk is much faster. Can I just install mawk, change awk to mawk in my commands, and use it? Will this simply speed up processing?

Answer 1

First, if you can, set LC_ALL=C, so that awk treats the input as plain bytes rather than multibyte characters, and see if this provides enough of a boost:

$ LC_ALL=C awk 'foo'
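As a minimal sketch of how the locale setting might be applied to a real job, the helper below counts matching lines with awk forced into the C locale. The function name `count_matches` and its arguments are hypothetical, chosen only for illustration; whether the C locale gives a worthwhile speedup depends on the awk implementation and the data.

```shell
# count_matches PATTERN FILE  (hypothetical helper, not from the original answer)
# Counts lines in FILE matching the regular expression PATTERN, with awk
# running in the C locale so that input is handled as plain bytes.
count_matches() {
    LC_ALL=C awk -v pat="$1" '
        $0 ~ pat { n++ }        # count every line that matches
        END { print n + 0 }     # "+ 0" prints 0 instead of an empty string
    ' "$2"
}
```

In bash or zsh, prefixing a call with `time` (for example `time count_matches foo big.txt`, where `big.txt` is a placeholder) and comparing against the same awk program run without `LC_ALL=C` shows whether the setting helps on your machine.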

mawk is quite fast, but I have found that it does not necessarily run awk scripts as expected -- I always need to double-check that it is doing the right thing.
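One way to do that double-check is to run the same script through both implementations on a sample of the data and compare checksums of the output. This is only a sketch: `compare_awk` is a hypothetical helper, and it assumes both interpreters are installed and on your PATH.

```shell
# compare_awk IMPL_A IMPL_B SCRIPT FILE  (hypothetical helper)
# Runs the awk program SCRIPT on FILE with two awk implementations and
# reports whether their outputs are byte-for-byte identical (via cksum).
compare_awk() {
    sum_a=$("$1" "$3" "$4" | cksum)
    sum_b=$("$2" "$3" "$4" | cksum)
    if [ "$sum_a" = "$sum_b" ]; then
        echo same
    else
        echo different
        return 1
    fi
}
```

For example, `compare_awk awk mawk '{ print $2 }' sample.txt` (with `sample.txt` standing in for a small slice of the real file) prints `same` when the two interpreters agree.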

gawk seems to me to have increased its speed in the past few years -- ymmv.

Answer 2

mawk 1.9.9.6 (mawk-2 beta) is by far the fastest one.

With it, I got URI quote-plus encoding running much faster than even the built-in module in python3. Nowadays it takes my 2018 Mac about 13.9 seconds to traverse a 12.3-million-row, 1.82 GB text file and count exactly every byte, plus every UTF-8 code point (all 1.2x billion of them), despite mawk itself not being Unicode-aware.

Even gnu-awk in Unicode-aware mode or the macOS built-in wc -lm doesn't go as fast.
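The answer does not show its script, so the following is only a guess at how a byte-oriented awk could count UTF-8 code points: in UTF-8, every byte in the range 0x80-0xBF is a continuation byte, so the number of code points equals the total bytes minus the continuation bytes. The function name is hypothetical, and the sketch assumes the awk in use accepts octal escapes inside a bracket expression and that every line ends with a newline.

```shell
# count_bytes_and_codepoints FILE  (hypothetical helper)
# Prints "<bytes> <codepoints>" for a UTF-8 text file using byte-mode awk.
count_bytes_and_codepoints() {
    LC_ALL=C awk '
        {
            bytes += length($0) + 1            # + 1 for the newline awk stripped
            cont  += gsub(/[\200-\277]/, "")   # continuation bytes 0x80-0xBF
        }
        END { printf "%.0f %.0f\n", bytes, bytes - cont }
    ' "$1"
}
```

Swapping `awk` for `mawk` or `gawk` inside the function is one way to compare implementations on the same file; the totals can be cross-checked against `wc -c` and `wc -m`.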

2 answers

solution1
0 ACCEPTED 2015-11-22 00:43:16

solution2
0 2021-02-02 23:51:43
