How do you filter out all unique lines in a file?

Question

Is there a way to filter out all unique lines in a file with command-line tools, without sorting the lines? Essentially, I'd like to do this:

sort -u myFile

without the performance hit of sorting.

Answer 1

Remove duplicated lines:

awk '!a[$0]++' file

This is a famous awk one-liner, and there are many explanations of it online. Here is one:

This one-liner is very idiomatic. It registers the lines seen in the associative array "a" (arrays are always associative in Awk) and at the same time tests whether it has seen the line before. If it has seen the line before, then a[line] > 0 and !a[line] == 0. Any expression that evaluates to false is a no-op, and any expression that evaluates to true is equivalent to "{ print }".
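To make the mechanics easier to follow, here is a rough expanded sketch of the same idea. The function name `dedup`, the array name `seen`, and the sample input are made up for illustration; they are not part of the original answer. The key detail is that `++` is a post-increment, so the test looks at the count from before the current line was registered, which is 0 the first time a line appears.

```shell
# Expanded form of: awk '!a[$0]++'
# Prints each line only the first time it appears, keeping input order.
# "dedup" and "seen" are hypothetical names chosen for this sketch.
dedup() {
  awk '{
    if (seen[$0] == 0) {   # an unset entry compares equal to 0: first sighting
      print                # keep the first occurrence
    }
    seen[$0]++             # count the line so later copies are skipped
  }' "$@"
}

# Made-up sample input with duplicates of "a" and "b".
result=$(printf 'b\na\nb\nc\na\n' | dedup)
printf '%s\n' "$result"
```

Unlike sort -u, this keeps lines in their original order and needs only a single pass over the input, but the array holds every distinct line, so memory use grows with the number of distinct lines in the file.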

solution1
18 ACCEPTED 2013-04-03 20:33:04
