How to combine all files in a directory, adding each file's name as a new column in the final merged file

Question

I have a directory with files that look like this:

CCG02-215-WGS.format.flt.txt
CCG05-707-WGS.format.flt.txt
CCG06-203-WGS.format.flt.txt
CCG04-967-WGS.format.flt.txt
CCG05-710-WGS.format.flt.txt
CCG06-215-WGS.format.flt.txt

The contents of each file look like this:

1   9061390 14  93246140
1   58631131    2   31823410
1   108952511   3   110694548
1   168056494   19  23850376
etc...

The ideal output would be a file, let's call it all-samples.format.flt.txt, containing the concatenation of all files plus an additional column showing which sample/file each row came from (with some minor formatting to remove the .format.flt.txt suffix):

1   9061390 14  93246140    CCG02-215-WGS
...
1   58631131    2   31823410    CCG05-707-WGS
...
1   108952511   3   110694548   CCG06-203-WGS
...
1   168056494   19  23850376    CCG04-967-WGS

Currently, I have the following code, which works for individual files.

awk 'BEGIN{OFS="\t"; split(ARGV[1],f,".")}{print $1,$2,$3,$4,f[1]}' CCG05-707-WGS.format.flt.txt

#OUTPUT

1   58631131    2   31823410    CCG05-707-WGS
...

However, when I try to apply it to all files using the star, it adds the first filename it finds as the last column for the rows of every file.

awk 'BEGIN{OFS="\t"; split(ARGV[1],f,".")}{print $1,$2,$3,$4,f[1]}' *

#OUTPUT, last column should be as seen in previous code block

1   9061390 14  93246140    CCG02-215-WGS
...
1   58631131    2   31823410    CCG02-215-WGS
...
1   108952511   3   110694548   CCG02-215-WGS
...
1   168056494   19  23850376    CCG02-215-WGS

I feel like the solution may just lie in adding an additional parameter to awk... but I'm not sure where to start.

Thanks!

UPDATE

Using the OOTB awk variable FILENAME solved the issue, plus some elegant formatting logic for the file names.

Thanks, @RavinderSingh13!

awk 'BEGIN{OFS="\t"} FNR==1{file=FILENAME;sub(/\..*/,"",file)} {print $0,file}' *.txt

Answer 1

You may use:

In any version of awk:

awk -v OFS='\t' 'FNR==1{split(FILENAME, a, /\./)} {print $0, a[1]}' *.txt

Or in gnu-awk:

awk -v OFS='\t' 'BEGINFILE{split(FILENAME, a, /\./)} {print $0, a[1]}' *.txt
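
To see why the FNR==1 form works where the BEGIN block in the question did not, here is a minimal sketch. BEGIN runs once, before any input is read, so ARGV[1] is split only once. FNR resets to 1 at the start of each input file, so the split on FILENAME is redone per file. The scratch directory, the two one-line sample files, and the output names begin.out and fnr.out are made up for illustration.

```shell
#!/bin/sh
# Hypothetical demo data: two tiny files in a scratch directory.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '1\t9061390\t14\t93246140\n' > CCG02-215-WGS.format.flt.txt
printf '1\t58631131\t2\t31823410\n' > CCG05-707-WGS.format.flt.txt

# BEGIN runs once, before any input is read: every row gets the first name.
awk 'BEGIN{OFS="\t"; split(ARGV[1],f,".")}{print $0,f[1]}' \
  CCG*.format.flt.txt > begin.out

# FNR resets to 1 at the start of each input file: the name is refreshed per file.
awk -v OFS='\t' 'FNR==1{split(FILENAME, a, /\./)} {print $0, a[1]}' \
  CCG*.format.flt.txt > fnr.out

# Compare the appended column of both runs.
cut -f5 begin.out
cut -f5 fnr.out
```

In begin.out both rows carry CCG02-215-WGS, while in fnr.out each row carries the name of the file it came from.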

Answer 2

With your shown samples, please try the following awk code. We need to use awk's built-in (OOTB) FILENAME variable here. Whenever the first line of any txt file is read (all txt files are passed to this program), everything from the first . to the end of the value is removed; the main block then prints the current line followed by file (the file's name, trimmed as required).

awk '
BEGIN { OFS="\t" }
FNR==1{
  file=FILENAME
  sub(/\..*/,"",file)
}
{
  print $0,file
}
' *.txt

Or, in one-liner form, try the following awk code:

awk 'BEGIN{OFS="\t"} FNR==1{file=FILENAME;sub(/\..*/,"",file)} {print $0,file}' *.txt
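
One thing to keep in mind: sub(/\..*/,"",file) cuts at the first dot, which is fine for the names shown in the question. If a sample name could itself contain a dot, a hedged variant is to strip only the literal .format.flt.txt suffix. The sketch below also writes the result to the all-samples.format.flt.txt file named in the question. The scratch directory and sample files (including the dotted name CCG06.v2-203-WGS) are invented for illustration, and the CCG*.format.flt.txt glob is an assumption, chosen so that the output file, which also ends in .txt, is not picked up as input on a rerun.

```shell
#!/bin/sh
# Hypothetical demo data; the second name contains an extra dot on purpose.
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1
printf '1\t9061390\t14\t93246140\n' > CCG02-215-WGS.format.flt.txt
printf '1\t108952511\t3\t110694548\n' > CCG06.v2-203-WGS.format.flt.txt

awk -v OFS='\t' '
FNR==1 {
  sample = FILENAME
  sub(/\.format\.flt\.txt$/, "", sample)   # drop only the known suffix
}
{ print $0, sample }
' CCG*.format.flt.txt > all-samples.format.flt.txt

cat all-samples.format.flt.txt
```

With the first-dot approach the second sample would be labelled CCG06; anchoring the pattern to the end of the name keeps CCG06.v2-203-WGS intact.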
