Using awk on multiple input files

Question

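I have been working on a bash script, and at one point in it I need awk to process two CSV files at once in order to produce several output files. In short, there is a main file whose records must be distributed across a number of output files; the names of those files and the number of records each one receives come from a second file. The first n records go to the first output file, records n+1 through n+k go to the second, and so on.

To make this clearer, here is an example of what the main record file looks like: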

x11,x21
x12,x22
x13,x23
x14,x24
x15,x25
x16,x26
x17,x27
x18,x28
x19,x29

And here is what the other file looks like:

out_file_name_1,2
out_file_name_2,3
out_file_name_3,4

The first output file, named out_file_name_1, should then look like this:

x11,x21
x12,x22

The second output file, named out_file_name_2, should look like this:

x13,x23
x14,x24
x15,x25

And the last one should look like this:

x16,x26
x17,x27
x18,x28
x19,x29

I hope that is clear enough.

Answer 1

Since you asked for it, here is a solution in awk, but tripleee's answer is clearly the better approach.

$ cat oak.awk
BEGIN { FS = ","; fidx = 1 }

# Processing files.txt, init parallel arrays with filename and number of records
# to print to each one.
NR == FNR {
    file[NR] = $1
    records[NR] = $2
    next
}

# Processing main.txt. Print record to current file. Decrement number of records to print,
# advancing to the next file when number of records to print reaches 0
fidx in file && records[fidx] > 0 {
    print > (file[fidx])
    # Close each output file once it is full so open file handles do not pile up
    if (! --records[fidx]) { close(file[fidx]); ++fidx }
    next
}

# If we get here, either we ran out of files before reading all the records
# or a file was specified to contain zero records
{ print "Error: Insufficient number of files or file with non-positive number of records" > "/dev/stderr"
  exit 1 }


$ cat files.txt
out_file_name_1,2
out_file_name_2,3
out_file_name_3,4

$ cat main.txt
x11,x21
x12,x22
x13,x23
x14,x24
x15,x25
x16,x26
x17,x27
x18,x28
x19,x29

$ awk -f oak.awk files.txt main.txt

$ cat out_file_name_1
x11,x21
x12,x22

$ cat out_file_name_2
x13,x23
x14,x24
x15,x25

$ cat out_file_name_3
x16,x26
x17,x27
x18,x28
x19,x29
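
The script relies on the NR == FNR pattern to tell the two inputs apart. As a minimal sketch of just that idiom (the scratch files first.txt and second.txt are hypothetical and exist only for this illustration), NR counts records across all inputs while FNR restarts at 1 for each file, so the condition holds only while the first file is being read:

```shell
# Hypothetical scratch files, created only for this illustration.
tmp=$(mktemp -d)
printf 'a,1\nb,2\n' > "$tmp/first.txt"
printf 'x\ny\nz\n' > "$tmp/second.txt"

# NR == FNR is true only for records of the first file named on the
# command line; "next" keeps those records away from the second rule.
result=$(awk 'NR == FNR { print "first:  " $0; next }
              { print "second: " $0 }' "$tmp/first.txt" "$tmp/second.txt")
printf '%s\n' "$result"
rm -rf "$tmp"
```

One caveat with this idiom: if the first file is empty, NR == FNR is also true while the second file is read, so the records of main.txt would be mistaken for file names.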

Answer 2

I wouldn't use Awk for this.

while IFS=, read -r -u 3 filename lines; do
    head -n "$lines" >"$filename"
done 3<other.csv <main.csv

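I believe read -u, which reads from a specific file descriptor, is not entirely portable, but your question is tagged bash, so I assume that is not a problem here.

Demo: http://ideone.com/6FisHT

If you get empty files after the first one (head may read ahead and consume more of main.csv than the lines it prints), try replacing head with an inner loop of read statements.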

while IFS=, read -r -u 3 filename lines; do
    for i in $(seq 1 "$lines"); do
        IFS= read -r line
        printf '%s\n' "$line"
    done >"$filename"
done 3<other.csv <main.csv
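
On the portability point about read -u: a possible POSIX sh variant, offered only as a sketch, redirects file descriptor 3 onto read's standard input with <&3 and counts lines with shell arithmetic instead of seq. The temporary directory and the shortened sample data are assumptions made so the example is self-contained.

```shell
# Hypothetical scratch directory with shortened sample data.
tmp=$(mktemp -d)
printf 'out_file_name_1,2\nout_file_name_2,3\n' > "$tmp/other.csv"
printf 'x11,x21\nx12,x22\nx13,x23\nx14,x24\nx15,x25\n' > "$tmp/main.csv"

(
    cd "$tmp" || exit 1
    # "read ... <&3" takes the name/count pairs from fd 3 (other.csv);
    # the inner read still consumes standard input (main.csv).
    while IFS=, read -r filename lines <&3; do
        i=0
        while [ "$i" -lt "$lines" ] && IFS= read -r line; do
            printf '%s\n' "$line"
            i=$((i + 1))
        done > "$filename"
    done 3<other.csv <main.csv
)

first=$(cat "$tmp/out_file_name_1")
second=$(cat "$tmp/out_file_name_2")
printf '%s\n---\n%s\n' "$first" "$second"
rm -rf "$tmp"
```

Because every line is consumed by read itself, nothing is buffered ahead, so the later output files are not left empty; the trade-off is that a shell loop is slower than head on large inputs.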
