Processing multiple files and appending them in Linux/Unix

Question

I have over 100 files, each with 5-8 tab-separated columns. I need to extract the first three columns from each file, add a fourth column containing some predefined text, and append the results into a single file.

Let's say I have 3 files: file001.txt, file002.txt, and file003.txt.

file001.txt:

chr1 1 2 15
chr2 3 4 17

file002.txt:

chr1 1 2 15
chr2 3 4 17

file003.txt:

chr1 1 2 15
chr2 3 4 17

combined_file.txt:

chr1 1 2 f1
chr2 3 4 f1
chr1 1 2 f2
chr2 3 4 f2
chr1 1 2 f3
chr2 3 4 f3

For simplicity I kept the file contents the same. My script is as follows:

#!/bin/bash
for i in {1..3}; do
j=$(printf '%03d' $i)
awk 'BEGIN { OFS="\t"}; {print $1,$2,$3}' file${j}.txt | awk -v k="$j" 'BEGIN {print $0"\t$k”}' | cat >> combined_file.txt
done

But the script gives the following errors:

awk: non-terminated string $k”}... at source line 1
 context is
     <<<
awk: giving up
 source line number 2
awk: non-terminated string $k”}... at source line 1
 context is
     <<<
awk: giving up
 source line number 2

Can someone help me figure it out?

Answer 1

You don't need two different awk scripts. And you don't use $ to refer to variables in awk; that's used to refer to input fields (i.e. $k means "access the field whose number is in the variable k").

for i in {1..3}; do
    j=$(printf '%03d' $i)
    awk -v k="$j" -v OFS='\t' '{print $1, $2, $3, k}' file$j.txt
done > combined_file.txt
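
To make the difference between k and $k concrete, here is a minimal sketch. The one-line input and the value k=2 are made up for illustration and are not part of the original question.

```shell
# Hypothetical one-line input with three tab-separated fields.
# k is an awk variable set with -v; $k means "field number k".
out=$(printf 'a\tb\tc\n' | awk -v k=2 '{ print k, $k }')
echo "$out"   # k is the value 2; $k is field 2, i.e. "b"
```

With k set to 2, k prints the value itself while $k prints the second field, which is why a zero-padded file number like 001 should be printed as k rather than $k.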

Answer 2

As pointed out in the comments, your problem is that you're trying to use odd characters (such as the curly ” in your script) as if they were double quotes. Once you fix that, though, you don't need a loop or any of that other complexity. All you need is:

$ awk 'BEGIN{FS=OFS="\t"} {$NF="f"ARGIND} 1' file*
chr1    1       2       f1
chr2    3       4       f1
chr1    1       2       f2
chr2    3       4       f2
chr1    1       2       f3
chr2    3       4       f3

The above uses GNU awk for ARGIND.
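
If GNU awk is not available, or the files have more than four columns, something along these lines may work instead. Assigning to $NF overwrites the last field, so the command above reproduces the expected output only for four-column files like the samples. The sketch below counts files with FNR == 1, which is standard awk, and prints only the first three fields. The temporary directory and the five-column sample are assumptions made for the demonstration.

```shell
# Create two small hypothetical sample files in a temporary directory.
tmp=$(mktemp -d)
printf 'chr1\t1\t2\t15\t9\nchr2\t3\t4\t17\t9\n' > "$tmp/file001.txt"
printf 'chr1\t1\t2\t15\nchr2\t3\t4\t17\n' > "$tmp/file002.txt"

# FNR resets to 1 at the start of each input file, so n counts the files.
# Only fields 1-3 are printed, so any extra columns are dropped.
combined=$(awk 'BEGIN { FS = OFS = "\t" }
                FNR == 1 { n++ }
                { print $1, $2, $3, "f" n }' "$tmp"/file*.txt)
printf '%s\n' "$combined"

# Clean up the temporary files.
rm -r "$tmp"
```

One caveat: an empty file never reaches FNR == 1, so it would not advance the counter, and the labels of later files would shift down by one.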

Answer 1: score 3, accepted, 2016-07-08 22:12:04
Answer 2: score 1, 2016-07-09 07:34:48
