Match the third field of a CSV against a pattern file in GNU Linux (AWK / SED / GREP)

Question

I need to print every line of a CSV file whose third field matches a pattern in a pattern file.

I have tried grep with no luck, because it matches the pattern in any field, not only the third.

grep -f FILE2 FILE1 > OUTPUT

FILE1

dasdas,0,00567,1,lkjiou,85249
sadsad,1,52874,0,lkjiou,00567
asdasd,0,85249,1,lkjiou,52874
dasdas,1,48555,0,gfdkjh,06793
sadsad,0,98745,1,gfdkjh,45346
asdasd,1,56321,0,gfdkjh,47832

FILE2

00567
98745

RIGHT OUTPUT

dasdas,0,00567,1,lkjiou,85249
sadsad,0,98745,1,gfdkjh,45346

WRONG OUTPUT

dasdas,0,00567,1,lkjiou,85249
sadsad,1,52874,0,lkjiou,00567   <---- I don't want this to appear
sadsad,0,98745,1,gfdkjh,45346

I have already searched everywhere and tried different approaches.

EDIT: thanks to Wintermute, I managed to write something like this:

csvquote file1.csv > file1.quoted.csv
awk -F '"' 'FNR == NR { patterns[$0] = 1; next } patterns[$6]' file2.csv file1.quoted.csv | csvquote -u > result.csv

csvquote helps with parsing CSV files in AWK.

Thank you very much everybody, great community!

Answer 1

With awk:

awk -F, 'FNR == NR { patterns[$0] = 1; next } patterns[$3]' file2 file1

This works as follows:

FNR == NR {           # when processing the first file (the pattern file)
  patterns[$0] = 1    # remember the patterns
  next                # and do nothing else
}
patterns[$3]          # after that, select lines whose third field
                      # has been seen in the patterns.
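
To try this on the sample data from the question, here is a minimal sketch. It assumes the pattern file holds the two values 00567 and 98745 (inferred from the expected output), and it works in a scratch directory created with mktemp. It also uses `$3 in patterns` rather than `patterns[$3]`; that is a small variation of mine, not part of the answer, which avoids adding an array entry for every third field that is looked up.

```shell
# Scratch directory so the example does not touch real files.
workdir=$(mktemp -d)

# Sample CSV from the question.
cat > "$workdir/file1" <<'EOF'
dasdas,0,00567,1,lkjiou,85249
sadsad,1,52874,0,lkjiou,00567
asdasd,0,85249,1,lkjiou,52874
dasdas,1,48555,0,gfdkjh,06793
sadsad,0,98745,1,gfdkjh,45346
asdasd,1,56321,0,gfdkjh,47832
EOF

# Assumed pattern file: one literal value per line.
printf '%s\n' 00567 98745 > "$workdir/file2"

# First file: remember each pattern as an array key.
# Second file: print lines whose third field is one of those keys.
result=$(awk -F, 'FNR == NR { patterns[$0]; next } $3 in patterns' \
    "$workdir/file2" "$workdir/file1")

printf '%s\n' "$result"
```

The second sample line, which carries 00567 in its sixth field, is not printed, because only `$3` is compared and the comparison is an exact key lookup rather than a regex match.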

Answer 2

Using grep and sed:

grep -f <( sed -e 's/^\|$/,/g' file2) file1
dasdas,0,00567,1,lkjiou,85249
sadsad,0,98745,1,gfdkjh,45346

Explanation:

We add a comma at the beginning and at the end of every pattern in file2, without changing the file itself, and then we just grep as you were already doing. Note that this matches the pattern in any field that has a comma on both sides, not specifically the third one.
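
To see what grep actually receives, the sketch below writes the wrapped patterns to a file and prints them. It is only an illustration under some assumptions: the pattern values are inferred from the expected output, the third CSV line is a made-up row added to show the limitation, and `s/.*/,&,/` is used as a portable stand-in for the GNU-specific `s/^\|$/,/g`.

```shell
workdir=$(mktemp -d)

# Assumed pattern file.
printf '%s\n' 00567 98745 > "$workdir/file2"

# Two rows from the question plus one hypothetical row
# that has 00567 in its FOURTH field.
cat > "$workdir/file1" <<'EOF'
dasdas,0,00567,1,lkjiou,85249
sadsad,1,52874,0,lkjiou,00567
extra,0,11111,00567,zzzzz,22222
EOF

# Wrap every pattern in commas: 00567 becomes ,00567,
sed 's/.*/,&,/' "$workdir/file2" > "$workdir/wrapped"
cat "$workdir/wrapped"

# grep with the wrapped patterns.
loose=$(grep -f "$workdir/wrapped" "$workdir/file1")
printf '%s\n' "$loose"
```

The row with 00567 in the last field is rejected because no comma follows it, but the hypothetical row with 00567 in the fourth field is still selected. If only the third field should count, the pattern has to be anchored to the start of the line.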

Answer 3

This can be a start:

for i in $(cat FILE2); do cat FILE1 | cut -d',' -f3 | grep "$i"; done
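
As written, the loop prints only the third field, because cut discards the rest of the line before grep sees it. One possible way to finish the idea, sketched here under the assumption that FILE2 contains the two values inferred from the expected output, is to keep the loop but let awk compare the third field and print the whole line. The variable name `p` and the scratch directory are my own choices for illustration.

```shell
workdir=$(mktemp -d)

cat > "$workdir/FILE1" <<'EOF'
dasdas,0,00567,1,lkjiou,85249
sadsad,1,52874,0,lkjiou,00567
asdasd,0,85249,1,lkjiou,52874
dasdas,1,48555,0,gfdkjh,06793
sadsad,0,98745,1,gfdkjh,45346
asdasd,1,56321,0,gfdkjh,47832
EOF

# Assumed pattern file.
printf '%s\n' 00567 98745 > "$workdir/FILE2"

# One pass over FILE1 per pattern; print full lines whose third field
# equals the pattern. Appending "" forces a string comparison, so that
# 00567 is not treated as the number 567.
result=$(
    while IFS= read -r pattern; do
        awk -F, -v p="$pattern" '($3 "") == p' "$workdir/FILE1"
    done < "$workdir/FILE2"
)

printf '%s\n' "$result"
```

This reads FILE1 once per pattern, so the output is grouped by pattern rather than kept in file order, and it gets slow with a long pattern file.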

Answer 4

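sed 's#.*#/^[^,]*,[^,]*,&,/p#' FILE2 >/tmp/File2.sed && sed -n -f /tmp/File2.sed FILE1; rm /tmp/File2.sed

This is harder to do in plain sed than in awk, but it should work if awk is not available. Each line of FILE2 becomes a sed command that prints the lines whose third field equals that pattern, and -n suppresses all other output.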

The same with grep -E (useful on huge files):

sed 's#.*#^[^,]*,[^,]*,&,#' FILE2 >/tmp/File2.egrep && grep -E -f /tmp/File2.egrep FILE1; rm /tmp/File2.egrep
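
The generated pattern file is easier to follow when printed. The sketch below runs the same transformation on the sample data, assuming FILE2 holds 00567 and 98745 (inferred from the expected output) and using a mktemp directory instead of fixed names under /tmp.

```shell
workdir=$(mktemp -d)

cat > "$workdir/FILE1" <<'EOF'
dasdas,0,00567,1,lkjiou,85249
sadsad,1,52874,0,lkjiou,00567
asdasd,0,85249,1,lkjiou,52874
dasdas,1,48555,0,gfdkjh,06793
sadsad,0,98745,1,gfdkjh,45346
asdasd,1,56321,0,gfdkjh,47832
EOF

# Assumed pattern file.
printf '%s\n' 00567 98745 > "$workdir/FILE2"

# Turn each value into a regex that skips two fields from the start
# of the line and then requires the value as the complete third field.
sed 's#.*#^[^,]*,[^,]*,&,#' "$workdir/FILE2" > "$workdir/anchored"
cat "$workdir/anchored"

result=$(grep -E -f "$workdir/anchored" "$workdir/FILE1")
printf '%s\n' "$result"

rm -r "$workdir"
```

Because the values are pasted into a regex, a pattern containing characters such as `.` or `*` would be interpreted rather than matched literally, and the trailing comma means this only works when the third field is not the last one.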

4 answers

Answer 1
5 2015-02-04 10:02:16

Answer 2
1 2015-02-04 10:27:35

Answer 3
0 2015-02-04 09:59:07

Answer 4
0 2015-02-04 10:13:14
