Extract unique words from a file and print the numbers associated with each word on the same line, tab-separated, in Linux

My file has the form shown below:

1: test
18: test
29: test
25: crazy
30: crazy

I want to ignore case and get the unique words in the file, each followed by the numbers associated with it.

The desired output is:

test: 1 18 29
crazy: 25 30

Could someone explain how this can be done in Linux/Bash?

You can use awk's associative arrays to achieve that:

  • convert the 2nd field to lowercase (or uppercase) so that the grouping ignores case
  • build an associative array in awk
  • use the result of the first step as the key and append the 1st field to the value stored under that key
  • after awk has processed all lines, loop through the array and print each key with its values

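The command below prints the desired output, although awk does not guarantee the order in which the lines appear. The separator -F': *' splits on the colon plus any spaces after it, so the key carries no leading blank, and tolower() makes the grouping case-insensitive.

awk -F': *' 'NF>1{k=tolower($2); a[k]=(k in a ? a[k] " " : "") $1} END{for(k in a) print k ": " a[k]}' input_file.txt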
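
If the words must come out in the order they first occur in the file, as in the desired output above, one option is to record that order in a second array. The following is only a sketch under that assumption; the function name group_by_word and the array names nums and order are made up for illustration and are not part of the original answer.

```shell
# group_by_word is a hypothetical helper name used only for this sketch.
# It reads "number: word" lines from the file given as $1 and prints
# "word: n1 n2 ..." with words in the order they first appear.
group_by_word() {
  awk -F': *' '
    NF >= 2 {
      key = tolower($2)              # fold case so "Test" and "test" merge
      if (!(key in nums)) {
        order[++count] = key         # remember first-seen order
        nums[key] = $1
      } else {
        nums[key] = nums[key] " " $1 # append the number to the existing list
      }
    }
    END {
      for (i = 1; i <= count; i++)
        print order[i] ": " nums[order[i]]
    }
  ' "$1"
}
```

Calling group_by_word input_file.txt on the sample input should list test before crazy. To get tab-separated numbers instead of spaces, replace the " " in the append step with "\t".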
