
Counting the Number of Lines in a .txt file, getting double the expected result

I'm trying to write a very basic script that takes an input file name, counts the number of lines in the file, and prints the count to CMD. When I run it, though, I get double the number of lines that are actually in the file.

import sys


filename = sys.argv[-1]
with open(filename,) as f:
    LineCount = len(f.readlines())
print(LineCount)
input("Press Enter to close...")

The text file is 208 lines long, but I am getting 417 back. Here is what the file looks like. It just repeats from here on out.

Asset Name              In Point            Description 
Zach And Jenv4          00:00:13:11                         
Zach And Jenv4          00:00:14:54                         
Zach And Jenv4          00:00:16:37                         
Zach And Jenv4          00:00:18:20                         
Zach And Jenv4          00:00:20:03                         
Zach And Jenv4          00:00:21:45                         
Zach And Jenv4          00:00:23:28                         
Zach And Jenv4          00:00:25:11                         
Zach And Jenv4          00:00:26:54                         
Zach And Jenv4          00:00:28:36                         
Zach And Jenv4          00:00:30:20                         
Zach And Jenv4          00:00:32:03                         
Zach And Jenv4          00:00:33:45                         
Zach And Jenv4          00:00:35:28                         
Zach And Jenv4          00:00:37:11                         
Zach And Jenv4          00:00:38:54                         
Zach And Jenv4          00:00:40:37                         
Zach And Jenv4          00:00:42:20                         
Zach And Jenv4          00:00:44:03                         
Zach And Jenv4          00:00:45:44                         
Zach And Jenv4          00:00:47:28                         
Zach And Jenv4          00:00:49:11                         
Zach And Jenv4          00:00:50:54                         

Here's a likely explanation, but OP should look at the content returned by f.readlines() to be sure.
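One way to make that check is to read the file in binary mode, where Python does no newline translation, and tally which terminators actually occur. This is a minimal sketch: the helper names count_terminators and count_terminators_in_file are hypothetical and not part of the original answer.

```python
import re
from collections import Counter


def count_terminators(raw: bytes) -> Counter:
    """Tally line terminators in raw bytes, trying the longest form first."""
    # Alternation order matters: \r\r\n is tried before \r\n, then lone \r or \n.
    return Counter(re.findall(rb"\r\r\n|\r\n|\r|\n", raw))


def count_terminators_in_file(path: str) -> Counter:
    """Same tally for a file on disk, read in binary mode (no translation)."""
    with open(path, "rb") as f:
        return count_terminators(f.read())


# Small in-memory sample with the suspected \r\r\n terminators.
sample = b"a,b,c\r\r\n1,2,3\r\r\n"
print(count_terminators(sample))
```

If the tally for OP's file is dominated by \r\r\n, the explanation that follows applies.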

The file has \r\r\n line terminators. By default, open translates each of \r, \n, and \r\n to a single newline when reading, so \r\r\n is translated to \n\n. One way to generate a file with these line terminators is to use Python's csv.writer on a Windows OS without the documented newline='' parameter when opening the file for writing. csv.writer ends each row with \r\n by default, and text mode on Windows then translates the \n to \r\n, giving \r\r\n:

import csv

# Create "bad" file
with open('test.csv','w') as f:  # should have newline='' as a parameter as well
    r = csv.writer(f)
    r.writerow(['a','b','c'])
    r.writerow([1,2,3])
    r.writerow([4,5,6])

# Read file as OP did
with open('test.csv') as f:
    data = f.readlines()

print(len(data))
print(data)

Output:

6
['a,b,c\n', '\n', '1,2,3\n', '\n', '4,5,6\n', '\n']
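The doubled entries in that output can be reproduced on any OS without touching the disk. The sketch below wraps hand-written sample bytes in io.TextIOWrapper with newline=None, which is the same translation mode open uses by default; the sample bytes are an assumption standing in for the file contents.

```python
import io

# Sample bytes with \r\r\n terminators (assumed to match the "bad" file).
raw = b"a,b,c\r\r\n1,2,3\r\r\n4,5,6\r\r\n"

# newline=None is the default for open(): \r, \n and \r\n each become \n,
# so the lone \r and the following \r\n turn into two separate newlines.
reader = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=None)
lines = reader.readlines()

print(len(lines))
print(lines)
```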

With the newline='' parameter added to the open call that writes the file:

3
['a,b,c\n', '1,2,3\n', '4,5,6\n']
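For reference, here is a sketch of the corrected write next to a reproduction of the bad one that does not depend on running under Windows. It assumes that passing newline='\r\n' to open is a fair stand-in for Windows text-mode translation (every '\n' written becomes '\r\n'); the helper names write_rows and read_raw and the temporary file names are hypothetical.

```python
import csv
import os
import tempfile

rows = [["a", "b", "c"], [1, 2, 3], [4, 5, 6]]


def write_rows(path, rows, **open_kwargs):
    """Write rows with csv.writer, forwarding extra arguments to open()."""
    with open(path, "w", **open_kwargs) as f:
        csv.writer(f).writerows(rows)


def read_raw(path):
    """Return the file's bytes exactly as stored on disk."""
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.csv")
    bad = os.path.join(tmp, "bad.csv")
    write_rows(good, rows, newline="")      # as the csv documentation recommends
    write_rows(bad, rows, newline="\r\n")   # mimics Windows text-mode translation
    good_bytes = read_raw(good)
    bad_bytes = read_raw(bad)

print(good_bytes)
print(bad_bytes)
```

With newline='' the \r\n that csv.writer emits reaches the file unchanged, so each row ends in a single \r\n.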

Open the CSV file in Excel, Notepad, or Notepad++ and you'll see the same double-newline issue, but dumping the file from the command line doesn't show it:

C:\>type test.csv
a,b,c
1,2,3
4,5,6

A hex editor will show the \r\r\n line terminators (0D 0D 0A in hexadecimal).
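If no hex editor is at hand, a few lines of Python give a similar view. This is a rough sketch rather than a full hex editor: the hex_dump helper is hypothetical, and it relies on bytes.hex accepting a separator, which requires Python 3.8 or later.

```python
def hex_dump(raw: bytes, width: int = 16) -> str:
    """Format bytes as rows of offset plus space-separated hex pairs."""
    rows = []
    for offset in range(0, len(raw), width):
        chunk = raw[offset:offset + width]
        rows.append(f"{offset:08x}  {chunk.hex(' ')}")
    return "\n".join(rows)


# Sample bytes with \r\r\n terminators; for a real file, pass the result of
# open(path, "rb").read() instead.
print(hex_dump(b"a,b,c\r\r\n1,2,3\r\r\n"))
```

Each 0d 0d 0a run in the dump marks one \r\r\n terminator.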
