Recursive function with awk

Given a CSV file describing parent-child relationships by first name and last name, one child,parent pair per line:

$ cat /var/tmp/hier
F2 L2,F1 L1
F3 L3,F1 L1
F4 L4,F2 L2
F5 L5,F2 L2
F6 L6,F3 L3

I want to print:

F1 L1
    F2 L2
        F4 L4
        F5 L5
    F3 L3
        F6 L6

I wrote a script like the one below:

#!/bin/bash
print_node() {
        echo "awk -F, '\$2=="\"$@\"" {print \$1}' /var/tmp/hier"
        for node in `eval "awk -F, '\$2=="\"$@\"" {print \$1}'     /var/tmp/hier"`
        do
                echo -e "\t"$node
                print_node "$node"
        done
}
print_node "$1"

Running the script:

$ ./print_tree.sh "F1 L1"
awk -F, '$2=="F1 L1" {print $1}' /var/tmp/hier
awk: syntax error near line 1
awk: bailing out near line 1

It seems that the awk command is malformed, but if I run the command shown in the debug output by hand, it works:

$ awk -F, '$2=="F1 L1" {print $1}' /var/tmp/hier
F2 L2
F3 L3

Could anyone point out what is causing this error?

=======================
@moderators: I added the perl and python tags because I'd welcome solutions in Perl or Python. No offence intended.

With GNU awk, which supports true multi-dimensional arrays (arrays of arrays):

$ cat tst.awk
BEGIN { FS="," }
function descend(node) {
    printf "%*s%s\n", indent, "", node
    if ( isarray(map[node]) ) {
        indent += 3
        for (child in map[node]) {
            descend(child)
        }
        indent -= 3
    }
    return
}
NR==1 { root = $2 }
{ map[$2][$1] }
END { descend(root) }

$ awk -f tst.awk file
F1 L1
   F2 L2
      F4 L4
      F5 L5
   F3 L3
      F6 L6

I would personally reach for Perl here; you could also use Python (or any other similar-level language that happens to be available, like Ruby or Tcl, but Perl and Python are almost universally preinstalled). I would use one of them because they have built-in nested data structures, which make it easy to cache the tree in navigable form instead of re-parsing the parent links every time you want to fetch a node's children. (GNU awk has arrays of arrays, but BSD awk doesn't.)

Anyway, here's one Perl solution:

#!/usr/bin/env perl
use strict;
use warnings;

my %parent;

while (<>) {
  chomp;
  my ($child, $parent) = split ',';
  $parent{$child} = $parent;
}

my (%children, %roots);

while (my ($child, $parent) = each %parent) {
  push @{$children{$parent} ||= []}, $child;
  $roots{$parent} = 1 unless $parent{$parent};
}

foreach my $root (sort keys %roots) {
  show($root);
}

sub show {
  my ($node, $indent) = (@_,'');
  print "$indent$node\n";
  foreach my $child (sort(@{$children{$node}||[]})) {
    show($child, "    $indent");
  }
}

I saved the above as print_tree.pl and ran it like this on your data:

$ perl print_tree.pl *csv

You could also make it executable with chmod +x print_tree.pl and run it without explicitly calling perl:

$ ./print_tree.pl *csv

On your sample data, it produces this output:

F1 L1
    F2 L2
        F4 L4
        F5 L5
    F3 L3
        F6 L6
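
For comparison, here is a minimal Python sketch of the same idea: read the child,parent pairs once, cache them as a parent-to-children map, then walk that map recursively. The names build_tree, render, tree_lines, parse and SAMPLE are illustrative and not from the original posts, and the sample data is inlined rather than read from a file. It assumes each line holds exactly one child,parent pair and that the data contains no cycles.

```python
from collections import defaultdict

# Sample data from the question, inlined so the sketch is self-contained.
SAMPLE = """\
F2 L2,F1 L1
F3 L3,F1 L1
F4 L4,F2 L2
F5 L5,F2 L2
F6 L6,F3 L3
"""


def parse(text):
    """Turn 'child,parent' lines into (child, parent) tuples."""
    return [tuple(line.split(",", 1)) for line in text.splitlines() if line.strip()]


def build_tree(rows):
    """Return (children, roots): a parent -> [children] map and the parentless nodes."""
    children = defaultdict(list)
    has_parent = set()
    for child, parent in rows:
        children[parent].append(child)
        has_parent.add(child)
    # A root is any node that appears as a parent but never as a child.
    roots = sorted(p for p in children if p not in has_parent)
    return children, roots


def render(children, node, depth=0, indent="    "):
    """Return the subtree under node as a list of indented lines."""
    lines = [indent * depth + node]
    for child in sorted(children.get(node, [])):
        lines.extend(render(children, child, depth + 1, indent))
    return lines


def tree_lines(rows):
    """Render every root and its descendants."""
    children, roots = build_tree(rows)
    lines = []
    for root in roots:
        lines.extend(render(children, root))
    return lines


print("\n".join(tree_lines(parse(SAMPLE))))
```

Like the Perl version, this sorts siblings by name and finds the roots by itself, so it does not need the root passed on the command line.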

An alternative solution without multi-dimensional awk arrays, which works for this hierarchy:

join -t, -1 1 -2 2 inputfile{,} | awk -F, -f tree.awk

The awk script is as follows:

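$ cat tree.awk
{
    # join prints node,parent,child; swap the first two fields
    # so that each line reads parent, node, child
    s = $1; $1 = $2; $2 = s
    t = ""
    for (i = 1; i <= NF; i++) {
        # print each name only the first time it is seen,
        # indented by one tab per level
        if (!($i in n)) {
            print t $i
            n[$i]
        }
        t = t "\t"
    }
}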
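
To make the join step less opaque, here is a small Python sketch that mimics what the pipeline does; it is an illustration of the technique, not part of the original answer, and the names ROWS, self_join and render are made up for it. The self-join pairs every row with the rows whose parent is that row's child, giving (node, parent, child) triples; the second step prints each name the first time it appears, indented by its position in the triple.

```python
# (child, parent) pairs from the question's sample file.
ROWS = [
    ("F2 L2", "F1 L1"),
    ("F3 L3", "F1 L1"),
    ("F4 L4", "F2 L2"),
    ("F5 L5", "F2 L2"),
    ("F6 L6", "F3 L3"),
]


def self_join(rows):
    """Mimic `join -t, -1 1 -2 2 file file`: match field 1 of the left
    copy against field 2 of the right copy, yielding (node, parent, child)."""
    return [
        (node, parent, child)
        for node, parent in rows
        for child, childs_parent in rows
        if childs_parent == node
    ]


def render(joined):
    """Mimic tree.awk: reorder to parent, node, child and print each
    name once, with one tab per level."""
    seen = set()
    lines = []
    for node, parent, child in joined:
        for depth, name in enumerate((parent, node, child)):
            if name not in seen:
                seen.add(name)
                lines.append("\t" * depth + name)
    return lines


print("\n".join(render(self_join(ROWS))))
```

The sketch also shows why the approach is tied to this particular hierarchy: the depth is fixed at three levels, and a node that has no children of its own never matches in the join, so it only shows up if it is the third element of some triple. The real join command additionally expects both inputs to be sorted on the join field, which the sample file happens to satisfy.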
