How to find entire sentences in a csv file and replace them with sentences from another file using bash?
So I have two files, file1 and file2:
file1:
my name is xyz.
my name is abc.
I am a doctor.
I am an engineer.
I like dogs.
I like cats.
I want to replace some of these sentences with shorter sentences, so I have created another file named file2.csv.
file2.csv:
"my name is xyz.","name xyz"
"my name is abc.","name abc"
"I am a doctor.","doctor"
"I like dogs.","dogs"
I have used sed so far, and if I enter all these lines individually in the sed command they work perfectly. However, the contents of file1 and file2.csv may change according to my needs, and I want a solution that doesn't require changing the script or the code: something like creating a two-dimensional array, checking whether the value in the first column of file2.csv exists in file1, and then replacing it with the corresponding entry in the second column of file2.csv.
So after I run the shell script, file1 should look like this:
name xyz.
name abc.
doctor.
I am an engineer.
dogs.
I like cats.
Note that the contents of file1 and file2.csv can change or new entries can be added, and hence using something like
sed -i 's/I like dogs/dogs/' file1.csv
is not feasible.
With bash and sed:
sed -f <(sed 's|","|/|; s|"|/|g; s|^|s|' file2.csv) file1
Output:
name xyz
name abc
doctor
I am an engineer.
dogs
I like cats.
The dot may be a problem because it is a special character in regex: it matches any single character, so a pattern such as "my name is xyz." would also match "my name is xyzz".
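One way to sidestep the dot problem is to escape the regex metacharacters before the sentences reach sed. The following is only a minimal sketch, assuming every line of file2.csv has exactly the form "sentence","replacement" with no embedded double quotes; escape_pattern, escape_replacement and build_sed_script are hypothetical helper names introduced here for illustration.

```shell
#!/usr/bin/env bash
# Sketch: generate a sed script from the CSV with metacharacters escaped,
# so that every sentence is matched literally.
# Assumption: each CSV line looks like "sentence","replacement"
# and contains no embedded double quotes.

# Hypothetical helper: escape BRE metacharacters and the "/" delimiter.
escape_pattern() {
  printf '%s\n' "$1" | sed 's|[][\.*^$/]|\\&|g'
}

# Hypothetical helper: escape what is special on the right side of s///.
escape_replacement() {
  printf '%s\n' "$1" | sed 's|[\&/]|\\&|g'
}

# Hypothetical helper: print one "s/from/to/" command per CSV line.
build_sed_script() {
  local line from to
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue   # skip empty lines
    line=${line#\"}              # drop the leading quote
    line=${line%\"}              # drop the trailing quote
    from=${line%%\",\"*}         # text before the first ","
    to=${line#*\",\"}            # text after the first ","
    printf 's/%s/%s/\n' "$(escape_pattern "$from")" "$(escape_replacement "$to")"
  done < "$1"
}

# Usage (bash): sed -f <(build_sed_script file2.csv) file1
```

With these helpers, the generated command for the first CSV line becomes s/my name is xyz\./name xyz/, so a line such as "my name is xyzz" should be left untouched.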
Using awk:
awk -F'"(,")?' '
NR==FNR { r[$2] = $3; next }
{ for (n in r) gsub(n, r[n]) } 1' file2.csv file1
-F'"(,")?' is the field separator; it matches either " or ",", so that we don't need to remove double quotes from the fields.
NR==FNR { r[$2] = $3; next } populates an array with the content of file2.csv, using the full sentence as key and the replacement string as value.
{ for (n in r) gsub(n, r[n]) } 1 searches for each full sentence in each input record and replaces it with the replacement string.
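Note that gsub treats its first argument as a regular expression, so a dot in a sentence matches any character and an & in the replacement refers to the matched text. If literal matching is wanted, something along these lines might work. It is a sketch of the same idea using index() and substr() instead of gsub; replace_literal is a hypothetical function name, and the same CSV layout as above is assumed.

```shell
#!/usr/bin/env bash
# Sketch: same idea as the awk command above, but sentences are matched
# literally with index() rather than being used as regular expressions.
# replace_literal is a hypothetical name; $1 is the CSV, $2 the text file.
replace_literal() {
  awk -F'"(,")?' '
    # First file: map each full sentence to its replacement.
    NR == FNR { if ($2 != "") r[$2] = $3; next }
    {
      for (n in r) {
        out = ""
        rest = $0
        # Replace every literal occurrence of n in the current line.
        while ((i = index(rest, n)) > 0) {
          out = out substr(rest, 1, i - 1) r[n]
          rest = substr(rest, i + length(n))
        }
        $0 = out rest
      }
      print
    }
  ' "$1" "$2"
}

# Usage: replace_literal file2.csv file1
```

The order in which for (n in r) visits the sentences is unspecified in awk, so this sketch assumes that no replacement text contains another sentence from the CSV.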
A concise ruby script:
ruby -rcsv -e '
sentences = CSV.read(ARGV.shift).to_h
File.foreach(ARGV.shift, chomp: true) {|line| puts sentences[line] || line}
' file2.csv file1
Using a Perl one-liner:
$ cat file1
my name is xyz.
my name is abc.
I am a doctor.
I am an engineer.
I like dogs.
I like cats.
$ cat file2.csv
"my name is xyz.","name xyz"
"my name is abc.","name abc"
"I am a doctor.","doctor"
"I like dogs.","dogs"
$ perl -ne ' BEGIN {%kvp=map{chomp;s/\"//g;split "," } qx(cat file2.csv)} { chomp;print $kvp{$_}?"$kvp{$_}.\n":"$_\n"; } ' file1
name xyz.
name abc.
doctor.
I am an engineer.
dogs.
I like cats.
$