How to intersect multiple files on several columns

Question

I have spent a lot of time on this, so any help would be appreciated. I have two files, shown below. What I want to do is search for every item of f1_col1 and f1_col2 separately inside f2_col3. If an item exists, save it and add the f2_col1 value from its matching row in f2 to a new column in the new df.

f1 (two columns):

f1_col1,f1_col2
kctd,Malat1
Gas5,Snhg6

f2 (three columns):

f2_col1,f2_col2,f2_col3
chr7,snRNA,Gas5
chr1,protein_coding,Malat1
chr2,TEC,Snhg6
chr1,TEC,kctd

So, based on the two files above, the desired output should be:

new_df:

f1_col1,f1_col2,f2_col1,f2_col1
kctd,Malat1,chr1,chr1
Gas5,Snhg6,chr7,chr2

Note: f2_col2 is not important.

I do not have a strong programming background and found this very difficult. I have checked multiple sources but have not been able to develop a solution. Any help is appreciated. Thanks

Answer 1

Based on one possible interpretation of your requirements and the single sunny-day example you provided, in which every key field matches on every line, this MAY be what you're trying to do:

$ cat tst.awk
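BEGIN { FS=OFS="," }
# First file on the command line (f2): build a lookup from f2_col3 to f2_col1.
NR==FNR {
    if ( FNR == 1 ) {
        hdr = $1        # remember the f2_col1 header name
    }
    map[$3] = $1
    next
}
# Second file (f1): append the looked-up value for each of its two columns.
{ print $0, ( FNR>1 ? map[$1] OFS map[$2] : hdr OFS hdr ) }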

$ awk -f tst.awk f2 f1
f1_col1,f1_col2,f2_col1,f2_col1
kctd,Malat1,chr1,chr1
Gas5,Snhg6,chr7,chr2
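The script above was only shown against input where every key matches. If some f1 entries might be missing from f2_col3, a variant along the following lines could make the gaps visible instead of printing empty fields. This is only a sketch under assumptions that are not in the question: the "NA" placeholder, the temporary directory, and the extra row with the made-up key NoSuchGene are all hypothetical.

```shell
#!/bin/sh
# Sample data from the question, written to a scratch directory.
# The last f1 row uses "NoSuchGene", an invented key with no match in f2.
tmp=$(mktemp -d)
cat > "$tmp/f2" <<'EOF'
f2_col1,f2_col2,f2_col3
chr7,snRNA,Gas5
chr1,protein_coding,Malat1
chr2,TEC,Snhg6
chr1,TEC,kctd
EOF
cat > "$tmp/f1" <<'EOF'
f1_col1,f1_col2
kctd,Malat1
Gas5,Snhg6
Gas5,NoSuchGene
EOF

result=$(awk '
BEGIN { FS = OFS = "," }
NR == FNR {                      # first file (f2): build the lookup table
    if (FNR == 1) hdr = $1       # remember the f2_col1 header name
    else map[$3] = $1            # f2_col3 value -> f2_col1 value
    next
}
FNR == 1 { print $0, hdr, hdr; next }   # second file (f1): header line
{
    # "NA" is an assumed placeholder for keys that are absent from f2_col3
    a = ($1 in map) ? map[$1] : "NA"
    b = ($2 in map) ? map[$2] : "NA"
    print $0, a, b
}' "$tmp/f2" "$tmp/f1")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Testing with `($1 in map)` rather than reading `map[$1]` directly avoids creating empty array entries for unmatched keys, so a missing key and a key mapped to an empty string stay distinguishable.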

Answer 1: score 4, accepted, 2021-04-06 22:48:10
