Python readlines function not reading first line in file
I am trying to search through a list of files and extract the lines that start with "id". This occurs many times in each file, and often on the first line of text in the file.
The code I have written so far works, but it seems to miss the first line in each file (the first occurrence of "id").
for file2 in data_files2:
    with open(file2, 'r') as f: # use context manager to open files
        for line in f:
            lines = f.readlines()
            a=0
            while a < len(lines):
                temp_array = lines[a].rstrip().split(",")
                if temp_array[0] == "id":
                    game_id = temp_array[1]
Any suggestions on how I can include this first line of text in the readlines result? I tried changing a to -1 so that it would include the first line of text (where a=0), but this didn't work.
EDIT:
I need to keep 'a' in my code as an index because I use it later on. The code I showed above was truncated. Here is more of the code, as an example. Any suggestions on how else I can remove "for line in f:"?
for file2 in data_files2:
    with open(file2, 'r') as f: # use context manager to open files
        for line in f:
            lines = f.readlines()
            a=0
            while a < len(lines):
                temp_array = lines[a].rstrip().split(",")
                if temp_array[0] == "id":
                    game_id = temp_array[1]
                    for o in range(a+1,a+7,1):
                        if lines[o].rstrip().split(",")[1]== "visteam":
                            awayteam = lines[o].rstrip().split(",")[2]
                        if lines[o].rstrip().split(",")[1]== "hometeam":
                            hometeam = lines[o].rstrip().split(",")[2]
                        if lines[o].rstrip().split(",")[1]== "date":
                            date = lines[o].rstrip().split(",")[2]
                        if lines[o].rstrip().split(",")[1]== "site":
                            site = lines[o].rstrip().split(",")[2]
for file2 in data_files2:
    with open(file2, 'r') as f: # use context manager to open files
        for line in f:
            temp_array = line.rstrip().split(",")
            if temp_array[0] == "id":
                game_id = temp_array[1]
The above should work. It can also be made a bit faster, since there is no need to split every line into a list, only the lines that start with "id,":
for file2 in data_files2:
    with open(file2, 'r') as f: # use context manager to open files
        for line in f:
            if line.startswith("id,"):
                temp_array = line.rstrip().split(",")
                game_id = temp_array[1]
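For reference, the first line went missing in the original code because `for line in f:` reads the first line into `line` before `f.readlines()` is called, so `readlines()` only returns what is left. A minimal sketch of that behaviour, using an in-memory `io.StringIO` in place of a real file (the sample records are made up for illustration):

```python
import io

# Hypothetical sample data standing in for one of the files.
sample = "id,GAME001\ninfo,visteam,AAA\ninfo,hometeam,BBB\n"

f = io.StringIO(sample)
for line in f:
    # The loop has already consumed the first line into `line`,
    # so readlines() returns only the remaining lines.
    # The loop body runs once, because readlines() exhausts the file.
    rest = f.readlines()

print(repr(line))  # the first line, which the original code never inspected
print(rest)        # everything after the first line

# Calling readlines() without the outer loop keeps every line.
all_lines = io.StringIO(sample).readlines()
print(len(all_lines))
```

Dropping the outer `for line in f:` and indexing into `lines` directly should therefore let the index `a` start at the real first line.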
You can use enumerate to keep track of the current line number. Here is another way, having seen your edit to the question:
for file2 in data_files2:
    with open(file2, 'r') as f: # use context manager to open files
        lines = f.readlines()
        for n, line in enumerate(lines):
            if line.startswith("id,"):
                game_id = line.rstrip().split(",")[1]
                for o in range(n + 1, n + 7):
                    linedata = lines[o].rstrip().split(",")
                    spec = linedata[1]
                    if spec == "visteam":
                        awayteam = linedata[2]
                    elif spec == "hometeam":
                        hometeam = linedata[2]
                    elif spec == "date":
                        date = linedata[2]
                    elif spec == "site":
                        site = linedata[2]
You should also consider using the csv library for working with csv files.
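As a rough sketch of that suggestion, assuming each file is comma-separated with `id,<game id>` rows followed by rows whose second field names the value (`visteam`, `hometeam`, `date`, `site`). The function name `read_games` and the sample records below are hypothetical and not taken from the question:

```python
import csv
import io


def read_games(f):
    """Collect the id and selected fields for each game in an open file."""
    wanted = {"visteam", "hometeam", "date", "site"}
    games = []
    for row in csv.reader(f):
        if not row:
            continue  # skip blank lines
        if row[0] == "id" and len(row) > 1:
            # An "id" row starts a new game record.
            games.append({"id": row[1]})
        elif games and len(row) > 2 and row[1] in wanted:
            # Attach the field to the most recent game.
            games[-1][row[1]] = row[2]
    return games


# Made-up sample data; a real file would be opened with open(path, newline="").
sample = io.StringIO(
    "id,GAME001\n"
    "version,2\n"
    "info,visteam,AAA\n"
    "info,hometeam,BBB\n"
    "info,date,2000/01/01\n"
    "info,site,PARK01\n"
)
games = read_games(sample)
print(games)
```

Unlike the fixed six-line lookahead above, this sketch attaches every matching row to the most recent "id" row, so it does not fail with an IndexError when a game has fewer than six following lines. Whether that matches the real file layout is an assumption worth checking.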