
Problem with awk and (maybe) null characters

I have this file, which "may be" a binary file:

    DATA FIELDINFO Cloud_Mask_QA {{{
      rank: 2
      type: 20
      dims: Cell_Along_Swath_1km 2030, Cell_Across_Swath_1km 1354, 
      data: ... (2748620)
        (0,0) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
        (0,16) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
        (0,32) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
        (0,48) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
        (0,64) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
        (0,80) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
        (0,96) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
        (0,112) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
        (0,128) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
        (0,144) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
        (0,160) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
        (0,176) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@
        (0,192) ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@, ^@

If I run sed -n "l" file to show the non-printable characters, I get:

    DATA FIELDINFO Cloud_Mask_QA {{{$
      rank: 2$
      type: 20$
      dims: Cell_Along_Swath_1km 2030, Cell_Across_Swath_1km 1354, $
      data: ... (2748620)$
        (0,0) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
       \000, \000, \000, \000, \000, \000, \000$
        (0,16) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
       \000, \000, \000, \000, \000, \000, \000$
        (0,32) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
       \000, \000, \000, \000, \000, \000, \000$
        (0,48) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
        \000, \000, \000, \000, \000, \000, \000$
        (0,64) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
        \000, \000, \000, \000, \000, \000, \000$
        (0,80) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
        \000, \000, \000, \000, \000, \000, \000$
        (0,96) \000, \000, \000, \000, \000, \000, \000, \000, \000, \
        \000, \000, \000, \000, \000, \000, \000$
        (0,112) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
        \000, \000, \000, \000, \000, \000, \000$
        (0,128) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
        \000, \000, \000, \000, \000, \000, \000$
        (0,144) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
        \000, \000, \000, \000, \000, \000, \000$
        (0,160) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
        \000, \000, \000, \000, \000, \000, \000$
        (0,176) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
        \000, \000, \000, \000, \000, \000, \000$
        (0,192) \000, \000, \000, \000, \000, \000, \000, \000, \000,\
        \000, \000, \000, \000, \000, \000, \000$

I am trying to process it with awk, but when I run awk '{print $0}' file, I get:

    DATA FIELDINFO Cloud_Mask_QA {{{
      rank: 2
      type: 20
      dims: Cell_Along_Swath_1km 2030, Cell_Across_Swath_1km 1354, 
      data: ... (2748620)
        (0,0) 

So it seems that awk stops processing the file at the first NUL character (displayed as "^@" above and as "\000" by sed) that it finds.

How can I avoid this?

Note: it seems my awk is mawk.
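One way to confirm which implementation the awk command points to is sketched below. It assumes a readlink that supports -f (GNU coreutils and BusyBox do) and relies on the -W version option, which mawk and gawk accept; other awk implementations may reject it. The variable name awk_path is only illustrative.

```shell
# Locate the awk found on PATH (awk_path is an illustrative name).
awk_path=$(command -v awk)

# Follow any symlinks to the real binary; assumes readlink supports -f.
readlink -f "$awk_path"

# Ask the implementation to identify itself; mawk and gawk accept -W version.
"$awk_path" -W version 2>&1 | head -n 1
```

If the resolved path or the version line mentions mawk, the behaviour described above applies.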

Using gawk instead of mawk seems to solve the problem. On many systems awk is a link to one of those two implementations, so a straightforward fix is to install gawk and invoke it explicitly instead of awk.
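If installing gawk is not an option, one possible workaround (not part of the answer above) is to translate the NUL bytes before awk sees them. This is a minimal sketch assuming a POSIX tr and printf; sample.txt and the helper name nul_to_zero are hypothetical and only serve the illustration.

```shell
# Hypothetical sample: one header line and one data line containing NUL bytes.
printf 'header\n(0,0) \000, \000, \000\n' > sample.txt

# Helper (illustrative name): turn each NUL byte into the character 0.
nul_to_zero() {
    tr '\000' '0' < "$1"
}

# With the NUL bytes translated, awk receives ordinary text and prints both lines.
nul_to_zero sample.txt | awk '{print $0}'
```

This rewrites the data, so it only suits cases where a printable stand-in for the NUL bytes is acceptable; gawk, as noted above, seems to read the file as it is.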
