
Issue in generating a new Linux file

I am trying to perform a VLOOKUP-style lookup from file 1 to file 2 using Linux commands. I get the right data, but in the wrong format.

To make it clearer:

File 1:

http://www.amazon.com/dp/B00006IBAX test1_test  test3   test2   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4

File 2:

http://www.amazon.com/dp/B00006IBAX

Desired output:

  http://www.amazon.com/dp/B00006IBAX   test1_test  test3   test2   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4

Code I used:

FNR==NR{s=$1; sub(".*"$2,"");a[s]=$0; next} a[$1]{OFS = "\t"; FS = "\t"; print $0 a[$1]}

Output I got:

http://www.amazon.com/dp/B00006IBAX_test    test3   test2   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4

The columns come out misaligned, so I am not able to process the file. When the lookup succeeds, I want the row from file 1 reproduced exactly for the matching key in file 2. Please help me with this.

There is no need for awk; the join command does this:

join -t$'\t' file1 file2

So, given your original inputs, you should now see:

http://www.amazon.com/dp/B00006IBAX test1_test  test3   test2   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4   test4
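
To try this end to end, here is a minimal sketch that rebuilds shortened versions of the two files in a scratch directory and runs the join. The sample rows (including the extra example.com row that has no match) and the variable names are assumptions for illustration only. Note that join expects both inputs to be sorted on the join field, so the sketch sorts them first. It also builds the tab with printf so that it does not depend on the shell supporting $'\t'.

```shell
# Work in a scratch directory so no existing file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

tab=$(printf '\t')   # a literal tab character, built portably

# file1: tab-separated lookup table, key in column 1 (shortened sample data).
printf 'http://www.example.com/dp/NOMATCH\tother\tdata\there\n' > file1
printf 'http://www.amazon.com/dp/B00006IBAX\ttest1_test\ttest3\ttest2\n' >> file1

# file2: the keys to look up.
printf 'http://www.amazon.com/dp/B00006IBAX\n' > file2

# join expects both inputs sorted on the join field (column 1 here).
LC_ALL=C sort -t "$tab" -k 1,1 -o file1 file1
LC_ALL=C sort -t "$tab" -k 1,1 -o file2 file2

# Keep only the file1 rows whose first field also appears in file2.
result=$(LC_ALL=C join -t "$tab" file1 file2)
printf '%s\n' "$result"
```

Only the amazon.com row is printed, with its tabs intact; the unmatched row is dropped.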

How this works

Excerpt from man join:

For each pair of input lines with identical join fields, write a line to standard output. The default join field is the first, delimited by whitespace.

  • -t specifies the delimiter. Judging from your awk code and text files, you are working with tab-delimited files.
  • Since we don't want the default whitespace delimiter, we need to specify tab. There is a catch, though: if we write -t '\t', join seems to treat \t as two literal characters, \ and t, and gives an error.
  • One way to specify a tab is to insert a literal one: type -t ', then Ctrl-V, then Tab, then '.
  • Or, what I feel is simpler and what we have done here, use a C-style escaped tab: -t$'\t'
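
If sorting the inputs for join is inconvenient, the awk approach from the question can also be made to work. The following is only a sketch under assumptions: it takes file2 to hold just the key in its first column, as in the question, and it uses shortened sample rows and hypothetical names (row, awk_result). Instead of trimming the line with sub(), it stores each whole file1 line under its first tab-separated field and prints that stored line when a key from file2 matches.

```shell
# Work in a scratch directory so no existing file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Shortened, unsorted sample data (illustrative only).
printf 'http://www.example.com/dp/NOMATCH\tother\tdata\n' > file1
printf 'http://www.amazon.com/dp/B00006IBAX\ttest1_test\ttest3\ttest2\n' >> file1
printf 'http://www.amazon.com/dp/B00006IBAX\n' > file2

# -F '\t' splits fields on tabs only.
# First pass (file1): remember each full line under its key.
# Second pass (file2): print the remembered line when the key is known.
awk_result=$(awk -F '\t' '
    NR == FNR { row[$1] = $0; next }
    $1 in row { print row[$1] }
' file1 file2)
printf '%s\n' "$awk_result"
```

Because the stored line is printed unchanged, the tabs from file 1 survive as they are, and no sorting is needed.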
