How to find entire sentences in a CSV file and replace them with sentences from another file using bash?

I have two files, file1 and file2:

file1:
my name is xyz.
my name is abc.
I am a doctor.
I am an engineer.
I like dogs.
I like cats.

I want to replace some of these sentences with shorter ones, so I created another file named file2.csv.

file2.csv:
"my name is xyz.","name xyz"
"my name is abc.","name abc"
"I am a doctor.","doctor"
"I like dogs.","dogs"

So far I have used sed, and it works perfectly if I enter each of these lines individually in the sed command. However, the contents of file1 and file2.csv may change according to my needs, and I want a solution that doesn't require changing the script or the code. I am thinking of something like creating a two-dimensional array, checking whether each value in the first column of file2.csv exists in file1, and then replacing it with the corresponding entry in the second column of file2.csv.

After I run the shell script, file1 should look like this:

name xyz.
name abc.
doctor.
I am an engineer.
dogs.
I like cats.

Note that the contents of file1 and file2.csv can change and new entries can be added, so using something like

sed -i 's/I like dogs/dogs/' file1.csv

is not feasible.

With bash and sed:

sed -f <(sed 's|","|/|; s|"|/|g; s|^|s|' file2.csv) file1

Output:

name xyz
name abc
doctor
I am an engineer.
dogs
I like cats.

The dot may be a problem because it is a special character in regular expressions: it matches any single character, so a pattern such as "I like dogs." would also match "I like dogs!".
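One way to sidestep the dot problem is to escape regex metacharacters before generating the sed script. The following is a minimal sketch, not part of the original answer. It assumes every line of file2.csv has the simple form "pattern","replacement" with no embedded double quotes, and the function names escape_pattern, escape_replacement and build_sed_script are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from the original answer).

# Escape characters that are special in a sed basic regex, plus the "/" delimiter.
escape_pattern() {
  printf '%s' "$1" | sed 's|[][\\.*^$/]|\\&|g'
}

# Escape characters that are special on the replacement side of s///.
escape_replacement() {
  printf '%s' "$1" | sed 's|[\\&/]|\\&|g'
}

# Emit one "s/pattern/replacement/" command per CSV line.
# Assumes each line looks exactly like: "pattern","replacement"
build_sed_script() {
  local line pattern replacement
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue          # skip empty lines
    line=${line#\"}                     # drop the leading quote
    line=${line%\"}                     # drop the trailing quote
    pattern=${line%%\",\"*}             # text before the "," separator
    replacement=${line#*\",\"}          # text after the "," separator
    printf 's/%s/%s/\n' \
      "$(escape_pattern "$pattern")" \
      "$(escape_replacement "$replacement")"
  done < "$1"
}
```

It can be used the same way as the command above, for example `sed -f <(build_sed_script file2.csv) file1`. With the escaped script, "I like dogs." is replaced but a line such as "I like dogs!" is left alone.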

Using awk:

awk -F'"(,")?' '
  NR==FNR { r[$2] = $3; next }
  { for (n in r) gsub(n, r[n]) } 1' file2.csv file1
  • -F'"(,")?' sets the field separator to match either " or ",", so the double quotes do not need to be removed from the fields.
  • NR==FNR { r[$2] = $3; next } populates an array with the contents of file2.csv, using the full sentence as the key and the replacement string as the value.
  • { for (n in r) gsub(n, r[n]) } 1 searches each input record for every full sentence and replaces it with the replacement string.
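Note that gsub() treats each key n as a regular expression, so the dot in a sentence matches any character, and an & in a replacement string would be expanded to the matched text. If literal matching is wanted, one possible variant (a sketch only, with the hypothetical wrapper name replace_literal) uses index() and substr() instead of gsub(). It keeps the same field separator and the same assumption about the file2.csv layout.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: replace_literal CSV_FILE TEXT_FILE
# Replaces each sentence from column 1 of the CSV with column 2,
# comparing plain strings rather than regular expressions.
replace_literal() {
  awk -F'"(,")?' '
    # First file: map full sentence -> replacement string.
    NR == FNR { r[$2] = $3; next }
    {
      line = $0
      for (n in r) {
        out = ""
        # Copy text up to each literal occurrence of n, then the replacement.
        while ((i = index(line, n)) > 0) {
          out = out substr(line, 1, i - 1) r[n]
          line = substr(line, i + length(n))
        }
        line = out line
      }
      print line
    }' "$1" "$2"
}
```

Called as `replace_literal file2.csv file1`, it prints the rewritten lines to standard output. The order in which awk visits the keys is unspecified, so the result is only well defined when no replacement string contains another key.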

A concise Ruby script:

ruby -rcsv -e '
    sentences = CSV.read(ARGV.shift).to_h
    File.foreach(ARGV.shift, chomp: true) {|line| puts sentences[line] || line}
' file2.csv file1

Using a Perl one-liner:

$ cat file1
my name is xyz.
my name is abc.
I am a doctor.
I am an engineer.
I like dogs.
I like cats.

$ cat file2.csv
"my name is xyz.","name xyz"
"my name is abc.","name abc"
"I am a doctor.","doctor"
"I like dogs.","dogs"

$ perl -ne ' BEGIN {%kvp=map{chomp;s/\"//g;split "," } qx(cat file2.csv)} { chomp;print $kvp{$_}?"$kvp{$_}.\n":"$_\n"; } ' file1
name xyz.
name abc.
doctor.
I am an engineer.
dogs.
I like cats.

$


