
Not empty file, but “wc -l” outputs 0

I have a non-empty file (a fairly big one, about 400 KB) that I can read with less.

But if I try to output the number of lines with wc -l /path/to/file, it outputs 0.

How is this possible?

You can verify for yourself that the file contains no newline/linefeed (ASCII 10) characters, which would result in wc -l reporting 0 lines.

  1. First, count the characters in your file:

     wc -c /path/to/file

    You should get a non-zero value.

  2. Now, filter out everything that isn't a newline:

     tr -dc '\n' < /path/to/file | wc -c

    You should get back 0.

  3. Or, delete the newlines and count the result:

     tr -d '\n' < /path/to/file | wc -c

    You should get back the same value as in step 1.
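The three checks above can be tried end to end on a throwaway file. This is only a minimal sketch: the temporary file and its contents ('hello', written without a trailing newline) are assumptions made for illustration, not the file from the question.

```shell
# Hypothetical sample file: 5 bytes, no newline anywhere.
f=$(mktemp)
printf 'hello' > "$f"

total=$(wc -c < "$f")                    # step 1: total byte count
newlines=$(tr -dc '\n' < "$f" | wc -c)   # step 2: keep only newlines, count them
rest=$(tr -d '\n' < "$f" | wc -c)        # step 3: drop newlines, count the rest

# $((...)) strips the leading padding some wc implementations print.
echo "total=$((total)) newlines=$((newlines)) rest=$((rest))"
rm -f "$f"
```

If newlines is 0 and rest equals total, the file has no line terminators, which is exactly the case where wc -l reports 0.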

wc -l counts the number of '\n' characters in the file. Could it be that your file does not contain any?

Here is the GNU source: https://www.gnu.org/software/cflow/manual/html_node/Source-of-wc-command.html

Look for the COUNT(c) macro.
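A quick way to see that wc -l counts newline characters rather than "lines" as a reader would see them is to feed it the same text with and without a final newline. The sample strings here are made up for the demonstration.

```shell
# Two visible lines, but only one newline character between them.
no_final=$(printf 'one\ntwo' | wc -l)

# Same text with a terminating newline: two newline characters.
with_final=$(printf 'one\ntwo\n' | wc -l)

echo "without final newline: $((no_final))"   # prints 1
echo "with final newline:    $((with_final))" # prints 2
```

By the same logic, a file of any size with no newline at all is reported as 0 lines.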

Here's one way it's possible. Make a 400k file with just nulls in it:

dd if=/dev/zero bs=1024 count=400 of=/tmp/nulls ; ls -log /tmp/nulls 

Output shows the file exists:

400+0 records in
400+0 records out
409600 bytes (410 kB, 400 KiB) copied, 0.00343425 s, 119 MB/s
-rw-rw-r-- 1 409600 Feb 28 11:12 /tmp/nulls

Now count the lines:

wc -l /tmp/nulls
0 /tmp/nulls

This is possible if the file is minified HTML. The newline characters would have been removed when the content was minified.

Try the file command:

file filename.html

filename.html: HTML document text, UTF-8 Unicode text, with very long lines, with no line terminators
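The effect of minification on the line count can be reproduced roughly as follows. This is a sketch only: tr -d '\n' is used as a crude stand-in for a real minifier, and the small HTML snippet and temporary files are invented for the example.

```shell
# Hypothetical four-line HTML file.
src=$(mktemp)
printf '<html>\n<body>\n</body>\n</html>\n' > "$src"
before=$(wc -l < "$src")

# Crude stand-in for minification: strip every newline.
min=$(mktemp)
tr -d '\n' < "$src" > "$min"
after=$(wc -l < "$min")
bytes=$(wc -c < "$min")

echo "lines before: $((before)), lines after: $((after)), bytes after: $((bytes))"
rm -f "$src" "$min"
```

The "minified" copy still contains all of the markup, so it is not empty, yet wc -l reports 0 for it.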

