Bash: parse a file using sed
I have a file that I want to parse using sed, but after many attempts I have not succeeded. This is the source file:
. . exported "SCHEMA1"."IJK_ECX_LEDGER_HST_2009" 806.6 KB 25391 rows
. . exported "SCHEMA1"."IJK_ECX_JGEN_ACCT_ENTRY_HST_2009" 1.000 MB 25591 rows
. . exported "SCHEMA2"."IJK_ECX_JRNL_LN_HST_2009" 1.156 MB 25596 rows
. . exported "SCHEMA2"."IJK_ECX_OPEN_ITEM_GL_HST_2009" 663.4 KB 15062 rows
. . exported "SCHEMA1"."IJK_ECX_XLATITEM_HST_2009" 932.9 KB 42277 rows
. . exported "SCHEMA1"."IJK_ECX_JRNL_HEADER_HST_2009" 9.585 KB 4 rows
. . exported "SCHEMA5"."IJK_ECX_CA_JGEN_CHQ_HST_2009" 0 KB 0 rows
. . exported "SCHEMA1"."IJK_ECX_CA_JRNL_LN_HST_2009" 0 KB 0 rows
. . exported "SCHEMA5"."IJK_ECX_DISTRIB_LINE_HST_2009" 0 KB 0 rows
. . exported "SCHEMA1"."IJK_ECX_GP_ACC_LINE_HST_2009" 0 KB 0 rows
. . exported "SCHEMA5"."IJK_ECX_IN018_JRNL_H_HST_2009" 0 KB 0 rows
. . exported "SCHEMA1"."IJK_ECX_IN094_A_SUIV_HST_2009" 0 KB 0 rows
. . exported "SCHEMA5"."IJK_ECX_IN094_B_SUIV_HST_2009" 0 KB 0 rows
. . exported "SCHEMA5"."IJK_ECX_IN094_LN_AUD_HST_2009" 0 KB 0 rows
. . exported "SCHEMA0"."IJK_ECX_JGEN_ACT_HST_2009" 0 KB 0 rows
. . exported "SCHEMA1"."IJK_ECX_JGEN_CASH_HST_2009" 0 KB 0 rows
And this is what I want:
IJK_ECX_LEDGER_HST_2009,25391
IJK_ECX_JGEN_ACCT_ENTRY_HST_2009,25591
IJK_ECX_JRNL_LN_HST_2009,25596
IJK_ECX_OPEN_ITEM_GL_HST_2009,15062
IJK_ECX_XLATITEM_HST_2009,42277
IJK_ECX_CA_JGEN_CHQ_HST_2009,0
IJK_ECX_CA_JRNL_LN_HST_2009,0
IJK_ECX_DISTRIB_LINE_HST_2009,0
IJK_ECX_GP_ACC_LINE_HST_2009,0
IJK_ECX_IN018_JRNL_H_HST_2009,0
IJK_ECX_IN094_A_SUIV_HST_2009,0
IJK_ECX_IN094_B_SUIV_HST_2009,0
IJK_ECX_IN094_LN_AUD_HST_2009,0
IJK_ECX_JGEN_ACT_HST_2009,0
IJK_ECX_JGEN_CASH_HST_2009,0
The number after the comma is the number of rows. Do you have any idea how I could do this? Thanks for your help,
Steve
With awk:
awk '{printf "%s%s\n", $4, $7}' file | awk -F\" '{printf "%s,%s\n", $4,$5}'
IJK_ECX_LEDGER_HST_2009,25391
IJK_ECX_JGEN_ACCT_ENTRY_HST_2009,25591
IJK_ECX_JRNL_LN_HST_2009,25596
IJK_ECX_OPEN_ITEM_GL_HST_2009,15062
IJK_ECX_XLATITEM_HST_2009,42277
IJK_ECX_JRNL_HEADER_HST_2009,4
IJK_ECX_CA_JGEN_CHQ_HST_2009,0
IJK_ECX_CA_JRNL_LN_HST_2009,0
IJK_ECX_DISTRIB_LINE_HST_2009,0
IJK_ECX_GP_ACC_LINE_HST_2009,0
IJK_ECX_IN018_JRNL_H_HST_2009,0
IJK_ECX_IN094_A_SUIV_HST_2009,0
IJK_ECX_IN094_B_SUIV_HST_2009,0
IJK_ECX_IN094_LN_AUD_HST_2009,0
IJK_ECX_JGEN_ACT_HST_2009,0
IJK_ECX_JGEN_CASH_HST_2009,0
EDIT: If you run it without the second part, the output looks like this:
"SCHEMA1"."IJK_ECX_LEDGER_HST_2009"25391
To reach your desired output, the line has to be split again by the second awk. -F\" sets the field separator to ", and only fields 4 and 5 (the table name and the row count) are printed, separated by a comma.
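The two-stage pipeline above can also be collapsed into a single awk process. The following is only a sketch: the sample variable holds three lines copied from the question for illustration, and it assumes every input line has the same layout (quoted schema, quoted table, size, unit, row count).

```shell
# Three sample lines from the question, kept in a variable for illustration.
sample='. . exported "SCHEMA1"."IJK_ECX_LEDGER_HST_2009" 806.6 KB 25391 rows
. . exported "SCHEMA1"."IJK_ECX_JRNL_HEADER_HST_2009" 9.585 KB 4 rows
. . exported "SCHEMA5"."IJK_ECX_CA_JGEN_CHQ_HST_2009" 0 KB 0 rows'

# With " as the field separator, $4 is the table name and $5 is the text
# after the closing quote, e.g. ' 806.6 KB 25391 rows'.
# Splitting $5 on whitespace puts the row count in rest[3].
result=$(printf '%s\n' "$sample" |
  awk -F'"' '{ split($5, rest, " "); print $4 "," rest[3] }')

printf '%s\n' "$result"
```

To run it against a real file, drop the printf and pass the file name as the last argument to awk.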
With sed:
sed 's/^.*"."\([^"]*\)"[[:blank:]]\{1,\}\([^[:blank:]]\{1,\}[[:blank:]]\{1,\}\)\{2\}\([0-9]\{1,\}\)[[:blank:]].*/\1,\3/' YourFile
With GNU sed, add the --posix option.
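The sed expression is easier to read with extended regular expressions. This is a sketch of an equivalent substitution, assuming a sed that accepts -E (GNU and BSD sed do); the sample variable holds three lines copied from the question, and the size and unit are matched as two separate words instead of one repeated group.

```shell
# Three sample lines from the question, kept in a variable for illustration.
sample='. . exported "SCHEMA1"."IJK_ECX_LEDGER_HST_2009" 806.6 KB 25391 rows
. . exported "SCHEMA1"."IJK_ECX_JRNL_HEADER_HST_2009" 9.585 KB 4 rows
. . exported "SCHEMA5"."IJK_ECX_CA_JGEN_CHQ_HST_2009" 0 KB 0 rows'

# Pattern, piece by piece:
#   ^.*"\."                      everything up to the "." between schema and table
#   ([^"]*)"                     group 1: the table name, up to its closing quote
#   [[:blank:]]+[^[:blank:]]+    the size value (e.g. 806.6)
#   [[:blank:]]+[^[:blank:]]+    the unit (e.g. KB)
#   [[:blank:]]+([0-9]+)         group 2: the row count
#   [[:blank:]].*$               the trailing " rows"
result=$(printf '%s\n' "$sample" |
  sed -E 's/^.*"\."([^"]*)"[[:blank:]]+[^[:blank:]]+[[:blank:]]+[^[:blank:]]+[[:blank:]]+([0-9]+)[[:blank:]].*$/\1,\2/')

printf '%s\n' "$result"
```

Lines that do not match the pattern are printed unchanged; adding -n and a trailing p flag to the substitution would suppress them.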