How to combine all files in a directory, adding their individual file names as a new column in final merged file

I have a directory with files that look like this:

CCG02-215-WGS.format.flt.txt
CCG05-707-WGS.format.flt.txt
CCG06-203-WGS.format.flt.txt
CCG04-967-WGS.format.flt.txt
CCG05-710-WGS.format.flt.txt
CCG06-215-WGS.format.flt.txt

The contents of each file look like this:

1   9061390 14  93246140
1   58631131    2   31823410
1   108952511   3   110694548
1   168056494   19  23850376
etc...

The ideal output would be a single file, let's call it all-samples.format.flt.txt, containing the concatenation of all files plus an additional column showing which sample/file each row came from (with some minor formatting to remove the .format.flt.txt suffix):

1   9061390 14  93246140    CCG02-215-WGS
...
1   58631131    2   31823410    CCG05-707-WGS
...
1   108952511   3   110694548   CCG06-203-WGS
...
1   168056494   19  23850376    CCG04-967-WGS

Currently, I have the following code, which works for individual files.

awk 'BEGIN{OFS="\t"; split(ARGV[1],f,".")}{print $1,$2,$3,$4,f[1]}' CCG05-707-WGS.format.flt.txt

#OUTPUT

1   58631131    2   31823410    CCG05-707-WGS
...

However, when I try to apply it to all files using the star (*), it adds the first file name it finds to the rows of every file as the 5th column.

awk 'BEGIN{OFS="\t"; split(ARGV[1],f,".")}{print $1,$2,$3,$4,f[1]}' *

#OUTPUT, 5th column should be as seen in previous code block

1   9061390 14  93246140    CCG02-215-WGS
...
1   58631131    2   31823410    CCG02-215-WGS
...
1   108952511   3   110694548   CCG02-215-WGS
...
1   168056494   19  23850376    CCG02-215-WGS

I feel like the solution may just lie in adding an additional parameter to awk, but I'm not sure where to start.

Thanks!

UPDATE

Using awk's built-in FILENAME variable solved the issue, along with some elegant formatting logic for the file names.

Thanks, @RavinderSingh13!

awk 'BEGIN{OFS="\t"} FNR==1{file=FILENAME;sub(/\..*/,"",file)} {print $0,file}' *.txt

You may use:

Any version of awk:

awk -v OFS='\t' 'FNR==1{split(FILENAME, a, /\./)} {print $0, a[1]}' *.txt

Or in gnu-awk:

awk -v OFS='\t' 'BEGINFILE{split(FILENAME, a, /\./)} {print $0, a[1]}' *.txt
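For context, the original attempt tags every row with the first file name because a BEGIN block runs exactly once, before any input is read, so ARGV[1] is always the first file on the command line. FILENAME, by contrast, is updated as awk moves from one file to the next, and FNR resets to 1 at the start of each file. A minimal sketch of the difference follows; the two demo files (a.format.flt.txt, b.format.flt.txt) and the scratch directory are made up for illustration and are not the asker's data:

```shell
# Hypothetical demo files in a scratch directory (not the asker's data).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '1\t100\t2\t200\n' > a.format.flt.txt
printf '1\t300\t4\t400\n' > b.format.flt.txt

# BEGIN runs once, so ARGV[1] only ever holds the first file name.
begin_out=$(awk 'BEGIN{OFS="\t"; split(ARGV[1],f,".")} {print $0,f[1]}' *.txt)

# FNR==1 is true on the first line of each file, so the name is refreshed per file.
fnr_out=$(awk -v OFS='\t' 'FNR==1{split(FILENAME,a,/\./)} {print $0,a[1]}' *.txt)

printf '%s\n' "$begin_out"   # both rows end in "a"
printf '%s\n' "$fnr_out"     # first row ends in "a", second in "b"
```

The demo changes into the scratch directory first so that FILENAME holds a bare file name; with a path prefix that itself contains a dot, splitting on the first "." would cut in the wrong place.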

With your shown samples, please try the following awk code. We need to use awk's built-in FILENAME variable here. On the first line of each txt file (all txt files are passed to the program), the file name is copied into the variable file and everything from the first . to the end of the value is removed. The main block then prints the current line followed by file (the file's name, trimmed as required).

awk '
BEGIN { OFS="\t" }
FNR==1{
  file=FILENAME
  sub(/\..*/,"",file)
}
{
  print $0,file
}
' *.txt

Or, in one-liner form, try the following awk code:

awk 'BEGIN{OFS="\t"} FNR==1{file=FILENAME;sub(/\..*/,"",file)} {print $0,file}' *.txt
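To get the all-samples.format.flt.txt file the question asks for, the output can be redirected. One thing to watch: if the merged file is written into the same directory, a later run with *.txt would also match the merged file itself. The sketch below sidesteps that with a narrower glob, assuming every sample file ends in -WGS.format.flt.txt as in the listing above. The scratch directory and the merge helper are hypothetical names used only for this demo; the two data rows are copied from the question:

```shell
# Scratch directory holding two sample files (rows taken from the question).
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1
printf '1\t9061390\t14\t93246140\n' > CCG02-215-WGS.format.flt.txt
printf '1\t58631131\t2\t31823410\n' > CCG05-707-WGS.format.flt.txt

# Hypothetical helper: the narrower glob matches only per-sample files,
# so an existing all-samples.format.flt.txt is never read back as input.
merge() {
  awk 'BEGIN{OFS="\t"} FNR==1{file=FILENAME; sub(/\..*/,"",file)} {print $0,file}' \
    *-WGS.format.flt.txt > all-samples.format.flt.txt
}

merge
merge   # a second run overwrites the file with the same two rows
cat all-samples.format.flt.txt
```

Writing the merged file to a different directory, or giving it an extension the glob does not match, would work just as well.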
