Shell: how do I extract the first occurrence of each line's first word?

Question

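I have a log file that records certain actions, where the first word of each line is the ID of the action. I want to extract the first occurrence of each ID, so that I can show the first action for every ID.

I'm not sure I'm being clear, so suppose I have a file that monitors what a group of people do and is updated every time someone does something: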

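Alice ate an apple
Eve fell asleep
Bob watched TV
Bob sat on a chair
Alice went to the kitchen
Dave drank coffee
Carol bought a car
Eve fed the cat
Eve took out the trash
Dave took a shower
Bob washed the dishes
Alice read a book
Carol played the piano
...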

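Suppose I want to see what each person's first action was, so the desired output would be:

Alice ate an apple
Eve fell asleep
Bob watched TV
Dave drank coffee
Carol bought a car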

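I tried a few combinations of uniq and grep, but there is a catch: to use the uniq command I first need to sort the lines, which defeats the purpose of getting the first occurrence (in this example, "Eve fed the cat" would come before "Eve fell asleep").

Is there a better way to achieve this?

Thanks to everyone for taking the time to read this.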

Answer 1

This is simple with awk:

$ awk '++arr[$1]==1' file

Prints:

Alice ate an apple 
Eve fell asleep 
Bob watched TV 
Dave drank coffee 
Carol bought a car

It works like this:

awk '++arr[$1]==1' file
       ^^^              arr is an associative array (key/value pairs)
          ^^^^          keyed by $1 (the first field); a new key starts with the value 0
     ^^                 prefix ++ adds 1 before the value is returned
              ^^        is equal to
                ^       1, meaning this is the first time the key is seen
    ^            ^      if the whole expression is true (first field seen for the first time), print the line

You can do this with other tools (Bash, Ruby, Perl, Python, etc.), but almost all easy solutions will use that tool's version of an associative array to count the number of times a key has been seen.
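For comparison, here is a minimal sketch of the same idea in Bash itself, assuming Bash 4 or later (needed for declare -A). The function name first_per_key is made up for illustration, and unlike the awk version this sketch skips blank lines.

```shell
#!/usr/bin/env bash
# first_per_key (hypothetical name): read lines on stdin and print only the
# first line seen for each distinct first word. Assumes Bash 4+ (declare -A).
first_per_key() {
    local -A seen=()            # associative array: first word -> 1 once seen
    local line key
    while IFS= read -r line || [[ -n $line ]]; do
        read -r key _ <<< "$line"        # key = first whitespace-separated word
        [[ -z $key ]] && continue        # skip blank lines (awk would not)
        if [[ -z ${seen[$key]+x} ]]; then
            seen[$key]=1                 # remember the key
            printf '%s\n' "$line"        # first occurrence: print the line
        fi
    done
}
```

Running first_per_key < file on the sample log should give the same five lines as the awk one-liner. For large files awk will usually be much faster than a Bash read loop.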

Answer 2

Idiomatic awk:

awk '!seen[$1]++' file

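This uses an associative array and the postfix increment operator to add the first field, $1, to an array named "seen". The first time a key is encountered its value is zero, so negating it yields true exactly once, the first time that key is seen, and awk prints the line.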
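To see the postfix increment at work, the following sketch prints, for each line, the value that seen[$1]++ returns and the decision that follows from it. The helper name trace_seen and the variable before are made up for illustration; they are not part of the one-liner.

```shell
# trace_seen (hypothetical helper): for each input line, show the value that
# seen[$1]++ returns (the count *before* incrementing) and what the
# one-liner would do with the line.
trace_seen() {
    awk '{
        before = seen[$1]++          # postfix ++: yields the old value, then adds 1
        print $1, before, (before ? "skip" : "print")
    }' "$@"
}
```

For example, feeding it the three lines "Eve fell asleep", "Bob watched TV" and "Eve fed the cat" prints "Eve 0 print", "Bob 0 print" and "Eve 1 skip": the old value is 0 only on the first sighting, which is exactly when !seen[$1]++ is true.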

Answer 1
4  accepted  2021-04-20 12:28:30

Answer 2
0  2021-04-20 13:21:07
