Bash: How to count the number of occurrences of a string within a file?

Question

I have a file that looks something like this:

dog
cat
dog
dog
fish
cat

I'd like to write some kind of code in Bash to make the file formatted like:

dog:1
cat:1
dog:2
dog:3
fish:1
cat:2

Any idea on how to do this? The file is very large (> 30K lines), so the code should be somewhat fast.

I am thinking some kind of loop...

Like this:

while IFS= read -r line; do
     echo "$line" >> temp.txt                  # lines seen so far
     val=$(grep -cxF -- "$line" temp.txt)      # count exact matches of this line so far
     echo "$val" >> temp2.txt
done < file.txt

And then paste -d ':' file.txt temp2.txt

However, I am concerned that this would be really slow, as you're going line-by-line. What do other people think?

Answer 1

You may use this simple awk to do this job for you:

awk '{print $0 ":" ++freq[$0]}' file

dog:1
cat:1
dog:2
dog:3
fish:1
cat:2
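
A minimal sketch of why the one-liner works: freq is an awk associative array keyed by the whole line ($0), and the pre-increment bumps the counter before it is printed, so the first occurrence comes out as 1. The wrapper name number_occurrences and the temporary sample file are assumptions made for illustration and are not part of the answer.

```shell
#!/usr/bin/env bash
# number_occurrences is a hypothetical wrapper name used only for this sketch.
number_occurrences() {
    # freq[$0] is unset (treated as 0) the first time a line is seen;
    # ++freq[$0] increments it first, then the new value is printed.
    awk '{ print $0 ":" ++freq[$0] }' "$1"
}

# Build a small sample file matching the one in the question.
sample=$(mktemp)
printf '%s\n' dog cat dog dog fish cat > "$sample"
number_occurrences "$sample"
rm -f "$sample"
```

Because awk reads the file once and keeps one counter per distinct line, the work grows roughly linearly with the number of lines, which should be comfortable for a 30K-line file.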

Answer 2

Here's what I came up with:

declare -A arr; while read -r line; do ((arr[$line]++)); echo "$line:${arr[$line]}" >> output_file; done < input_file

First, declare an associative array arr. Then read every line in a while loop and increment the value stored under the key equal to that line. Then echo the line followed by its current value in the array, appending the result to output_file.
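
The steps described above can also be sketched as a small function, which may be easier to read than the one-liner. The function name count_running, the variable names, and the choice to write to standard output are assumptions made for illustration. Associative arrays require Bash 4 or later.

```shell
#!/usr/bin/env bash
# count_running is a hypothetical name; requires Bash 4+ (associative arrays).
count_running() {
    local -A seen=()     # key: line text, value: occurrences so far
    local line count
    while IFS= read -r line; do
        # Bash rejects an empty string as an associative-array key,
        # so blank lines are passed through uncounted in this sketch.
        if [[ -z $line ]]; then
            printf '\n'
            continue
        fi
        count=$(( ${seen[$line]:-0} + 1 ))   # previous count (0 if unseen) plus one
        seen[$line]=$count
        printf '%s:%d\n' "$line" "$count"
    done < "$1"
}

# Example usage (input_file and output_file as named in the answer):
#   count_running input_file > output_file
```

Redirecting once after the function call, rather than appending inside the loop, avoids reopening the output file for every line.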

Answer 3

Awk and sed are very powerful, but they are not Bash. Here is a pure Bash variant:

mapfile -t raw < file   # read file, one array element per line
declare -A index        # init associative array

for item in "${raw[@]}"; do ((index[$item]++)); done            # 1st loop: count items
for item in "${raw[@]}"; do echo "$item:${index[$item]}"; done  # 2nd loop: print each item with its total count (not a running count)

3 answers

solution1
5 ACCEPTED 2020-01-16 19:13:44

solution2
0 2020-01-16 19:39:22

solution3
0 2020-01-17 06:56:47
