
Find items common between two Bash arrays

I have the shell script below, in which I have two arrays, number1 and number2. I also have a variable, range, which holds a list of numbers.

I need to find out which numbers in the number1 array are also present in the range variable, and likewise for the number2 array. Below is my shell script, and it works fine.

number1=(1220 1374 415 1097 1219 557 401 1230 1363 1116 1109 1244 571 1347 1404)
number2=(411 1101 273 1217 547 1370 286 1224 1362 1091 567 561 1348 1247 1106 304 435 317)
range=90,197,521,540,552,554,562,569:570,573,576,579,583,594,597,601,608:609,611,628,637:638,640:641,644:648
range_f=" "$(eval echo $(echo $range | perl -pe 's/(\d+):(\d+)/{$1..$2}/g;s/,/ /g;'))" "
echo "$range_f"

for item in "${number1[@]}"; do
  if [[ $range_f =~ " $item " ]]; then
    new_number1+=("$item")
  fi
done
echo "new list: ${new_number1[@]}"

for item in "${number2[@]}"; do
  if [[ $range_f =~ " $item " ]]; then
    new_number2+=("$item")
  fi
done
echo "new list: ${new_number2[@]}"

Is there a better way to write this? Right now I have two for loops that iterate over the arrays to build new_number1 and new_number2.

Note: A value like 644:648 is shorthand for the range that starts at 644 and ends at 648.

You can use comm with process substitution instead of looping:

mapfile -t new_number1 < <(comm -12 <(printf '%s\n' "${number1[@]}" | sort) <(printf '%s\n' $range_f | sort))
mapfile -t new_number2 < <(comm -12 <(printf '%s\n' "${number2[@]}" | sort) <(printf '%s\n' $range_f | sort))
  • mapfile -t name reads from the nested process substitution into the named array
  • each printf ... | sort pair provides a sorted input stream for comm
  • comm -12 emits the items common to the two streams

Aside from codeforester's answer, I can think of two other ways of doing this:

  1. Load the values of $range as keys of an associative array. The values will be 1. Loop through each member of ${number1[@]} and ${number2[@]}, testing them against the keys in the associative array.
  2. Use codeforester's printf ... | sort trick, but pipe both the list and the range through sort | uniq -c, then grep for the duplicates.
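A minimal sketch of the first option, assuming Bash 4 or later (needed for declare -A). The names in_range, parts, sample and common are made up for illustration, and sample is hypothetical data chosen so that the result is not empty. The a:b spans are expanded with an arithmetic loop rather than eval:

```shell
#!/usr/bin/env bash
# Sketch only: associative arrays (declare -A) need Bash 4 or later.
# "sample" is hypothetical data, not one of the arrays from the question.
sample=(557 570 401 645 90 1230)
range=90,197,521,540,552,554,562,569:570,573,576,579,583,594,597,601,608:609,611,628,637:638,640:641,644:648

# Record every number in the range as a key; "a:b" spans are expanded
# with an arithmetic loop, so no eval is involved.
declare -A in_range=()
IFS=, read -r -a parts <<< "$range"
for part in "${parts[@]}"; do
  if [[ $part == *:* ]]; then
    for ((n = ${part%%:*}; n <= ${part##*:}; n++)); do
      in_range[$n]=1
    done
  else
    in_range[$part]=1
  fi
done

# Keep only the items that exist as keys; the input order is preserved.
common=()
for item in "${sample[@]}"; do
  if [[ -n ${in_range[$item]:-} ]]; then
    common+=("$item")
  fi
done
echo "new list: ${common[*]}"
```

Each lookup is a single key test, so the range is parsed once and can be reused for number1, number2, or any other list.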

I'm not sure if either one of these is an actual improvement on your code. ... I would create a 'find duplicates' shell function, but otherwise your code looks solid.
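One possible shape for such a function, as a rough sketch only. The name find_common and its two-string interface are assumptions, and it uses uniq -d in place of the uniq -c plus grep step from option 2. The sample data and the shortened range_f are hypothetical:

```shell
#!/usr/bin/env bash
# find_common is a hypothetical helper name, not something from the question.
# Usage: find_common "list one" "list two"  (whitespace-separated items)
find_common() {
  local -a a b
  read -r -a a <<< "$1"
  read -r -a b <<< "$2"
  # Nothing can be common if either list is empty.
  if (( ${#a[@]} == 0 || ${#b[@]} == 0 )); then
    return 0
  fi
  # sort -u removes repeats inside each list first, so any line that
  # uniq -d still sees twice must be present in both lists.
  {
    printf '%s\n' "${a[@]}" | sort -u
    printf '%s\n' "${b[@]}" | sort -u
  } | sort | uniq -d
}

# Hypothetical sample data; range_f stands in for the expanded range string.
sample=(557 570 401 645 90 1230)
range_f=" 90 569 570 644 645 646 "
mapfile -t common < <(find_common "${sample[*]}" "$range_f")
echo "new list: ${common[*]}"
```

As with the comm version, the result comes back in sort order rather than in the order of the original array.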


 