Counting the number of lines in a .txt file, getting double the expected result
I'm trying to write a very basic script that takes an input file name, counts the number of lines in the file, and prints the count to CMD. When I run it, though, I get double the number of lines that are actually in the file.
import sys

filename = sys.argv[-1]
with open(filename) as f:
    LineCount = len(f.readlines())
print(LineCount)
input("Press Enter to close...")
The text file is 208 lines long, but I am getting 417 back. Here is what the file looks like. It just repeats from here on out.
Asset Name In Point Description
Zach And Jenv4 00:00:13:11
Zach And Jenv4 00:00:14:54
Zach And Jenv4 00:00:16:37
Zach And Jenv4 00:00:18:20
Zach And Jenv4 00:00:20:03
Zach And Jenv4 00:00:21:45
Zach And Jenv4 00:00:23:28
Zach And Jenv4 00:00:25:11
Zach And Jenv4 00:00:26:54
Zach And Jenv4 00:00:28:36
Zach And Jenv4 00:00:30:20
Zach And Jenv4 00:00:32:03
Zach And Jenv4 00:00:33:45
Zach And Jenv4 00:00:35:28
Zach And Jenv4 00:00:37:11
Zach And Jenv4 00:00:38:54
Zach And Jenv4 00:00:40:37
Zach And Jenv4 00:00:42:20
Zach And Jenv4 00:00:44:03
Zach And Jenv4 00:00:45:44
Zach And Jenv4 00:00:47:28
Zach And Jenv4 00:00:49:11
Zach And Jenv4 00:00:50:54
Here's a likely explanation, but OP should look at the content returned by f.readlines() to be sure.
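One way to check this directly is to read the file in binary mode, where no newline translation happens, and tally the terminators that are actually on disk. The following is a minimal sketch under that assumption; count_line_endings is a hypothetical helper name, not something from the question.

```python
import re
from collections import Counter


def count_line_endings(path):
    """Count each kind of line terminator found in the raw bytes of a file."""
    with open(path, 'rb') as f:  # binary mode: bytes are returned untranslated
        raw = f.read()
    # Longest alternatives come first so that \r\r\n is counted once,
    # rather than as a lone \r followed by \r\n.
    endings = re.findall(rb'\r\r\n|\r\n|\r|\n', raw)
    return Counter(endings)
```

Calling count_line_endings(filename) on the problem file and printing the result should show whether most terminators are b'\r\r\n' rather than b'\r\n'.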
The file has \r\r\n line termination, and the default for open is to translate \r, \n, and \r\n each to a newline when reading, so \r\r\n gets translated to \n\n. One way to generate a file with these line terminations is to use Python's csv.writer without the documented newline='' parameter when opening the file for writing on a Windows OS:
import csv

# Create "bad" file
with open('test.csv', 'w') as f:  # should have newline='' as a parameter as well
    r = csv.writer(f)
    r.writerow(['a', 'b', 'c'])
    r.writerow([1, 2, 3])
    r.writerow([4, 5, 6])

# Read file as OP did
with open('test.csv') as f:
    data = f.readlines()
print(len(data))
print(data)
Output:
6
['a,b,c\n', '\n', '1,2,3\n', '\n', '4,5,6\n', '\n']
With the newline='' parameter added to the open call that writes the file:
3
['a,b,c\n', '1,2,3\n', '4,5,6\n']
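As a rough sketch of both sides of the fix: the first function writes a CSV with newline='' so the doubled carriage return never appears, and the second is one possible way to count lines in a file that already has \r\r\n terminators when regenerating it is not an option. The names write_rows and count_lines are hypothetical and used only for illustration.

```python
import csv


def write_rows(path, rows):
    """Write rows as CSV without doubling the carriage return on Windows."""
    # newline='' stops open() from translating the '\r\n' that csv.writer emits.
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows(rows)


def count_lines(path):
    """Count lines, treating '\\r\\r\\n' as a single line terminator."""
    with open(path, 'rb') as f:  # binary mode: no newline translation
        raw = f.read()
    # Collapse the doubled terminator first, then split on the usual ones.
    return len(raw.replace(b'\r\r\n', b'\n').splitlines())
```

count_lines also works on files with ordinary \r\n or \n terminators, since bytes.splitlines handles those itself.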
Open the CSV file in Excel, Notepad or Notepad++ and you'll see the same double-newline issue, but dumping it from the command line doesn't show it:
C:\>type test.csv
a,b,c
1,2,3
4,5,6
A hex editor will show the \r\r\n (0D 0D 0A in hexadecimal).
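If no hex editor is at hand, the same bytes can be inspected from Python. This is a small sketch, assuming Python 3.8 or later (for the separator argument of bytes.hex); hex_dump is a hypothetical helper name.

```python
def hex_dump(path, limit=32):
    """Return the first `limit` bytes of a file as space-separated hex pairs."""
    with open(path, 'rb') as f:  # binary mode: bytes are returned untranslated
        return f.read(limit).hex(' ').upper()
```

For a file that begins with a,b,c followed by \r\r\n, hex_dump returns a string starting with 61 2C 62 2C 63 0D 0D 0A.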