Bash script to compare 2 files with different length strings
I have two files, and I am trying to compare the strings in them line by line. File1 contains only 6-character string prefixes, while File2 contains 12-character strings. How can I loop through File2 to find the strings that start with one of the 6-character prefixes from File1 and output those to a file?
File1:
002379
005964
File2:
002379ED6212
003354EB4591
004679BB2185
005964AB3379
005964DB5496
awk may be able to accomplish this:
awk 'NR == FNR {a[$0]; next};substr($0, 1, 6) in a' File1 File2
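To see how the lookup in that one-liner works, here is a minimal sketch that spreads it over several commented lines and redirects the result to a file, as the question asks. The sample data is copied from the question. The temporary directory, the array name prefixes, and the output name matches.txt are only illustrative choices.

```shell
# Sample data from the question, written to a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 002379 005964 > "$workdir/File1"
printf '%s\n' 002379ED6212 003354EB4591 004679BB2185 \
              005964AB3379 005964DB5496 > "$workdir/File2"

awk '
    # First file only (NR == FNR): remember each prefix as an array key.
    NR == FNR { prefixes[$0]; next }
    # Second file: print the line if its first 6 characters are a stored key.
    (substr($0, 1, 6) in prefixes)
' "$workdir/File1" "$workdir/File2" > "$workdir/matches.txt"

cat "$workdir/matches.txt"
# 002379ED6212
# 005964AB3379
# 005964DB5496
```

Because the membership test is a single array lookup per line, this form does not need to loop over every prefix for every record. One caveat: if File1 is empty, NR == FNR is also true while reading File2, so the trick assumes a non-empty first file.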
This awk one-liner does what you want:
awk 'NR==FNR{a[$0];next}{for(i in a)if(substr($0,1,6)==i)print}' file1 file2
NR==FNR is only true for the first file. Each line of file1 is stored as a key in the array a, and next skips the other block. For each record in the second file, the loop goes through each of the keys in a and compares it with the first 6 characters of the record. If they are the same, the record is printed.
Output:
002379ED6212
005964AB3379
005964DB5496
grep -f <(sed 's/^/^/' file1) file2
It would be nice to just use grep -f to find all the lines in file2 that match a regex in file1, but you want to anchor the regexes in file1 to the beginning of the line. So use the command above, which preprocesses the strings by adding an anchor.
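As a rough illustration of the preprocessing step, the sketch below prints the pattern list that grep actually receives and then runs the same command on the sample data from the question. The temporary directory and the variable names patterns and matches are assumptions made for this example. It needs Bash because of the <( ) process substitution.

```shell
#!/usr/bin/env bash
# Sample data from the question, written to a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 002379 005964 > "$workdir/file1"
printf '%s\n' 002379ED6212 003354EB4591 004679BB2185 \
              005964AB3379 005964DB5496 > "$workdir/file2"

# sed inserts a literal ^ at the start of every line, turning each
# prefix into a regex anchored to the beginning of the line.
patterns=$(sed 's/^/^/' "$workdir/file1")
printf '%s\n' "$patterns"
# ^002379
# ^005964

# grep -f reads those anchored patterns from the process substitution.
matches=$(grep -f <(sed 's/^/^/' "$workdir/file1") "$workdir/file2")
printf '%s\n' "$matches"
# 002379ED6212
# 005964AB3379
# 005964DB5496
```

Note that each line of file1 is treated as a regular expression, so this works cleanly for plain alphanumeric prefixes like the ones here. Prefixes containing characters such as . or * would be interpreted as regex syntax.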
For a pure-Bash solution, assuming you're using Bash v4.x, you can first populate an associative array whose keys are the lines of File1:
declare -A prefixes
while IFS= read -r prefix ; do
    prefixes[$prefix]=1
done < File1
# Now ${prefixes[002379]} is 1, and ${prefixes[005964]} is 1, but
# ${prefixes[anything-else]} is unset.
And then check the first six characters of each line of File2 to see if it's in this associative array:
while IFS= read -r word ; do
    prefix="${word:0:6}"
    if [[ "${prefixes[$prefix]}" ]] ; then
        echo "$word"
    fi
done < File2
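Putting the two loops together, a self-contained version might look like the sketch below. The function name filter_by_prefix, its optional length argument, and the output file matches.txt are hypothetical names introduced for this example, not part of the original answer. It assumes Bash 4.0 or later for declare -A style associative arrays, and it skips empty lines because an empty key is not a valid associative-array subscript.

```shell
#!/usr/bin/env bash
# Requires Bash 4.0+ (associative arrays).

# filter_by_prefix PREFIX_FILE DATA_FILE [PREFIX_LENGTH]
# Hypothetical helper: prints the lines of DATA_FILE whose first
# PREFIX_LENGTH characters (default 6) appear as a line in PREFIX_FILE.
filter_by_prefix() {
    local prefix_file=$1 data_file=$2 len=${3:-6}
    local -A prefixes=()
    local prefix word key

    # Load every non-empty line of the prefix file as an array key.
    while IFS= read -r prefix; do
        if [[ -n $prefix ]]; then
            prefixes[$prefix]=1
        fi
    done < "$prefix_file"

    # Print each non-empty data line whose leading characters are a known key.
    while IFS= read -r word; do
        [[ -n $word ]] || continue
        key=${word:0:len}
        if [[ -n ${prefixes[$key]:-} ]]; then
            printf '%s\n' "$word"
        fi
    done < "$data_file"
}

# Sample data from the question, written to a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 002379 005964 > "$workdir/File1"
printf '%s\n' 002379ED6212 003354EB4591 004679BB2185 \
              005964AB3379 005964DB5496 > "$workdir/File2"

filter_by_prefix "$workdir/File1" "$workdir/File2" > "$workdir/matches.txt"
cat "$workdir/matches.txt"
# 002379ED6212
# 005964AB3379
# 005964DB5496
```

Using IFS= read -r keeps leading whitespace and backslashes in each line intact, which plain read would alter.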