Copying data to a file with awk in a script

Question

I copied some column data into a file and then tried to write those columns out to another file, but the result is wrong.

This is my input file:

,E2Bn9,2015-04-29 00:00:00-0500

['2C173'],E2BA8,2015-04-29 00:00:00-0500

['5A475','2C174'],E2BA8,2015-06-29 00:00:00-0400

I use the following sed and awk commands:

sed -i 's/",/|/g' tempFile
awk -F '[|,]' '{ print "update table set cola = " $1 " where colb = " $2 " and colc = " $3 }' tempFile > updatestmt.cql

The output I get is:

update table set cola = where colb = E2Bn9 and colc = 2015-04-29 00:00:00-0500

update table set cola = ['2C173'] where colb = E2BA8 and colc = 2015-04-29 00:00:00-0500

update table set cola = "['5A475' where colb =  '2C174'] and colc = E2BA8

The first two lines look fine, but the last line prints the wrong values.

I want the last line to be:

update table set cola = "['5A475','2C174'] where colb =E2BA8 and colc = 2015-06-29 00:00:00-0400

Answer 1

Using FPAT with GNU awk 4.*:

$ awk -v FPAT='([^,]*)|([[][^]]+[]])' '{print "update table set cola =", $1, "where colb =", $2, "and colc =", $3}' file
update table set cola =  where colb = E2Bn9 and colc = 2015-04-29 00:00:00-0500
update table set cola = ['2C173'] where colb = E2BA8 and colc = 2015-04-29 00:00:00-0500
update table set cola = ['5A475','2C174'] where colb = E2BA8 and colc = 2015-06-29 00:00:00-0400

See http://www.gnu.org/software/gawk/manual/gawk.html#Splitting-By-Content .

For non-gawk awks, or gawk versions before 4.0 (get a modern gawk!), you can use:

$ cat tst.awk
{
    delete f
    nf = 0
    tail = $0
    while ( (tail!="") && match(tail,/([^,]*)|([[][^]]+[]])/) ) {
        f[++nf] = substr(tail,RSTART,RLENGTH)
        tail = substr(tail,RSTART+RLENGTH+1)
    }
    print "update table set cola =", f[1], "where colb =", f[2], "and colc =", f[3]
}

$ awk -f tst.awk file
update table set cola =  where colb = E2Bn9 and colc = 2015-04-29 00:00:00-0500
update table set cola = ['2C173'] where colb = E2BA8 and colc = 2015-04-29 00:00:00-0500
update table set cola = ['5A475','2C174'] where colb = E2BA8 and colc = 2015-06-29 00:00:00-0400

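You could assign to the fields of $0 instead of f[], but that incurs a performance overhead because the record is rebuilt every time you assign to $(++nf), and there are cases where you would want to use the original $0 later.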
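To illustrate that trade-off, here is a minimal sketch (not part of the original answer) showing that assigning to a field makes awk rebuild $0 using OFS, so the original record text is gone unless you saved it first. The function name show_rebuild and the input a,b,c are made up for this illustration.

```shell
# Hypothetical helper: assign past the last field and compare the record
# before and after the assignment.
show_rebuild() {
    printf '%s\n' 'a,b,c' |
    awk -F, '{
        orig = $0          # save the record before touching any field
        $(NF + 1) = "d"    # assigning to a field rebuilds $0 using OFS
        print orig " -> " $0
    }'
}

show_rebuild
```

This prints a,b,c -> a b c d: the commas are replaced by the default OFS (a single space), which is why collecting the pieces in a separate array such as f[] keeps $0 intact.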

Answer 2

I took a different approach that avoids an overly complicated regexp and works with any old awk.

# cat tst.awk
        {s="";}
$1!=""  {for(i=1;i<NF-1;i++)s=s (i==1?"":",") $i;}
        {printf("update table set cola = %s where colb = %s and colc = %s\n",s,$(NF-1),$NF);}

# awk -F, -f tst.awk yourinpfile
update table set cola =  where colb = E2Bn9 and colc = 2015-04-29 00:00:00-0500
update table set cola = ['2C173'] where colb = E2BA8 and colc = 2015-04-29 00:00:00-0500
update table set cola = ['5A475','2C174'] where colb = E2BA8 and colc = 2015-06-29 00:00:00-0400

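I agree with Ed that there is a better solution without the loop, but I can reuse my initial assumption, namely that $(NF-1) and $NF are fixed, to keep the regexp simpler.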

        {s="";}
$1!=""  {s=$0;sub("," $(NF-1) "," $NF, "", s);}
        {printf("update table set cola = %s where colb = %s and colc = %s\n",s,$(NF-1),$NF);}
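One thing to keep in mind with the sub() variant is that the field values are used as a dynamic regexp, so characters such as . or + in the data would be treated as regexp operators. A minimal sketch that sidesteps this, under the same assumption that the last two fields are fixed, cuts the tail off by length instead of by pattern. This is an alternative technique rather than the answer's code; the function name make_updates is hypothetical and the input is the sample from the question.

```shell
# Hypothetical helper: reads the comma-separated sample records on stdin.
make_updates() {
    awk -F, '{
        # Length of ",colb,colc" at the end of the record (two commas included).
        taillen = length($(NF-1)) + length($NF) + 2
        # Everything before that tail is cola, even if it contains commas.
        s = substr($0, 1, length($0) - taillen)
        printf("update table set cola = %s where colb = %s and colc = %s\n", s, $(NF-1), $NF)
    }'
}

make_updates <<'EOF'
,E2Bn9,2015-04-29 00:00:00-0500
['2C173'],E2BA8,2015-04-29 00:00:00-0500
['5A475','2C174'],E2BA8,2015-06-29 00:00:00-0400
EOF
```

Because the first record has an empty leading field, the computed prefix length is zero and s is simply empty, so no separate $1!="" guard is needed in this version.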

Answer 3

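The field separator in the data is what causes the problem, specifically the comma inside the brackets on the third line. One workaround could be a different sed that only converts , to | outside the first bracket, and then use FS='|':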

sed -r 's/(.*\])?,/\1|/g'  yourfile | awk -F '|' ....

where .... stands for the rest of your awk script.
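The splitting problem described at the start of this answer can be seen directly by counting the fields awk finds when it splits the sample input on commas. This is a small sketch for illustration only; the function name count_fields is hypothetical.

```shell
# Hypothetical helper: print the number of comma-separated fields awk sees,
# followed by the record itself.
count_fields() {
    awk -F, '{ print NF ": " $0 }'
}

count_fields <<'EOF'
,E2Bn9,2015-04-29 00:00:00-0500
['2C173'],E2BA8,2015-04-29 00:00:00-0500
['5A475','2C174'],E2BA8,2015-06-29 00:00:00-0400
EOF
```

The first two records report 3 fields, while the third reports 4, because the comma inside the brackets is treated as a separator like any other.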

Answer 4

If only the list values are quoted, as in the sample, you can try this sed:

sed "s/' *, *'/' '/g;s/\([^,]*\),\([^,]*\),\(.*\)/update table set cola = \1 where colb = \2 and colc = \3/;s/' '/','/g" file
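The one-liner works in three passes: it temporarily replaces the commas between quoted list items with a space so that only the real separators remain, builds the statement from the three comma-separated parts, and then restores the commas inside the list. As a sketch, the same substitutions can be written as separate -e expressions, here run against the sample input; the function name to_updates is hypothetical.

```shell
# Hypothetical helper wrapping the same three substitutions:
#   1. hide commas between quoted list items:  ','  becomes  ' '
#   2. split on the two remaining commas and build the statement
#   3. restore the commas inside the list:     ' '  becomes  ','
to_updates() {
    sed \
        -e "s/' *, *'/' '/g" \
        -e "s/\([^,]*\),\([^,]*\),\(.*\)/update table set cola = \1 where colb = \2 and colc = \3/" \
        -e "s/' '/','/g"
}

to_updates <<'EOF'
,E2Bn9,2015-04-29 00:00:00-0500
['2C173'],E2BA8,2015-04-29 00:00:00-0500
['5A475','2C174'],E2BA8,2015-06-29 00:00:00-0400
EOF
```

Note that this relies on the sequence ' ' (quote, space, quote) not occurring anywhere else in the data, which holds for the sample but is an assumption for real input.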


4 answers

Answer 1
4 2016-05-05 13:56:05

Answer 2
1 2016-05-06 08:44:40

Answer 3
0 2016-05-05 13:49:43

Answer 4
0 2016-05-05 17:11:34
