Problem with awk and (maybe) null characters

Question

I have this file, which may be a binary file:

    DATA FIELDINFO Cloud_Mask_QA {{{
  rank: 2
  type: 20
  dims: Cell_Along_Swath_1km 2030, Cell_Across_Swath_1km 1354, 
  data: ... (2748620)
    (0,0) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,16) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,32) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,48) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,64) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,80) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,96) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,112) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,128) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,144) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,160) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,176) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
    (0,192) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@

If I run sed -n "l" file to see the non-printable characters, I get:

    DATA FIELDINFO Cloud_Mask_QA {{{$
  rank: 2$
  type: 20$
  dims: Cell_Along_Swath_1km 2030, Cell_Across_Swath_1km 1354, $
  data: ... (2748620)$
    (0,0) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
   \000, \000, \000, \000, \000, \000, \000$
    (0,16) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
   \000, \000, \000, \000, \000, \000, \000$
    (0,32) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
   \000, \000, \000, \000, \000, \000, \000$
    (0,48) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
    \000, \000, \000, \000, \000, \000, \000$
    (0,64) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
    \000, \000, \000, \000, \000, \000, \000$
    (0,80) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
    \000, \000, \000, \000, \000, \000, \000$
    (0,96) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
    \000, \000, \000, \000, \000, \000, \000$
    (0,112) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
    \000, \000, \000, \000, \000, \000, \000$
    (0,128) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
    \000, \000, \000, \000, \000, \000, \000$
    (0,144) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
    \000, \000, \000, \000, \000, \000, \000$
    (0,160) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
    \000, \000, \000, \000, \000, \000, \000$
    (0,176) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
    \000, \000, \000, \000, \000, \000, \000$
    (0,192) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
    \000, \000, \000, \000, \000, \000, \000$

I am trying to use awk on it, but if I run awk '{print $0}' file, I get:

    DATA FIELDINFO Cloud_Mask_QA {{{
  rank: 2
  type: 20
  dims: Cell_Along_Swath_1km 2030, Cell_Across_Swath_1km 1354, 
  data: ... (2748620)
    (0,0)

So it seems that awk stops processing the file at the first "^@" (that is, "\000") character it finds.

How can I avoid this?

Note: it seems my awk is mawk.

Answer 1

Using gawk instead of mawk seems to solve the problem. awk is generally a link to one of those two, so the only thing to do is to install gawk and use it instead of awk.
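To make that concrete, here is a minimal sketch. The sample file, its contents, and the variable names are made up for illustration. It prints which binary `awk` resolves to, runs gawk only if it is installed, and also shows a fallback that is not part of the answer above: mapping the NUL bytes to a placeholder character with tr before awk reads them. The fallback assumes you do not need the NUL bytes themselves in the awk output.

```shell
#!/bin/sh
# Hypothetical sample file with embedded NUL bytes, mimicking the dump in the question.
sample=$(mktemp)
printf '  data: ... (3)\n    (0,0) \000, \000, \000\n' > "$sample"

# Show which binary "awk" resolves to (often a link to mawk or gawk).
echo "awk is: $(command -v awk)"

# Option 1 (from the answer): call gawk explicitly, if it is installed.
# The trailing tr only makes the NUL bytes visible as '@' on a terminal.
if command -v gawk >/dev/null 2>&1; then
    gawk '{print $0}' "$sample" | tr '\000' '@'
fi

# Option 2 (an assumption, not from the answer): replace NUL bytes with '@'
# before awk sees them, so any awk implementation reads whole lines.
cleaned=$(tr '\000' '@' < "$sample" | awk '{print $0}')
printf '%s\n' "$cleaned"

rm -f "$sample"
```

If the NUL bytes carry meaning in your data, Option 2 loses that information (a literal '@' in the input becomes indistinguishable from a replaced NUL), so gawk is the safer choice there.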

Solution 1
0 2019-01-23 09:46:57
