Python readlines function not reading the first line in a file

Question

I am trying to search through a list of files and extract the lines that start with "id". This occurs many times in each file, and often on the first line of the file.

The code I have written so far works, but it seems to miss the first line in each file (the first occurrence of "id").

for file2 in data_files2:
    with open(file2, 'r') as f:  # use context manager to open files
        for line in f:
            lines = f.readlines()
            a=0

            while a < len(lines):
                temp_array = lines[a].rstrip().split(",")
                if temp_array[0] == "id":
                    game_id = temp_array[1]

Any suggestions on how I can include this first line of text in the readlines? I tried changing a to -1 so it would include the first line of text (where a=0), but this didn't work.

EDIT:

I need to keep 'a' in my code as an index because I use it later on. The code I showed above was truncated; here is more of it. Any suggestions on how else I can remove "for line in f:"?

for file2 in data_files2:
    with open(file2, 'r') as f:  # use context manager to open files
        for line in f:
            lines = f.readlines()
            a=0

            while a < len(lines):
                temp_array = lines[a].rstrip().split(",")
                if temp_array[0] == "id":
                    game_id = temp_array[1]


                    for o in range(a+1,a+7,1):
                         if lines[o].rstrip().split(",")[1]== "visteam":
                            awayteam = lines[o].rstrip().split(",")[2]
                         if lines[o].rstrip().split(",")[1]== "hometeam":
                            hometeam = lines[o].rstrip().split(",")[2]
                         if lines[o].rstrip().split(",")[1]== "date":
                            date = lines[o].rstrip().split(",")[2]
                         if lines[o].rstrip().split(",")[1]== "site":
                            site = lines[o].rstrip().split(",")[2]

Answer 1

for file2 in data_files2:
    with open(file2, 'r') as f:  # use context manager to open files
        for line in f:
            temp_array = line.rstrip().split(",")
            if temp_array[0] == "id":
                game_id = temp_array[1]

The above should work. It can also be made a bit faster, since there is no need to create a list for each line:

for file2 in data_files2:
    with open(file2, 'r') as f:  # use context manager to open files
        for line in f:
            if line.startswith("id,"):
                temp_array = line.rstrip().split(",")
                game_id = temp_array[1]
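
As for why the original code skips the first line: the outer "for line in f" has already consumed line one by the time f.readlines() runs, so readlines only returns what is left. A minimal sketch of that behaviour, using io.StringIO as a stand-in for an open file (the sample contents are made up for illustration):

```python
import io

# io.StringIO stands in for an open file; the sample contents are invented.
f = io.StringIO("id,GAME001\ninfo,visteam,AAA\ninfo,hometeam,BBB\n")

for line in f:
    # The loop has already consumed the first line at this point ...
    first = line
    # ... so readlines() only returns the lines that come after it.
    rest = f.readlines()
    break

print(repr(first))  # the "id" line lives here, not in `rest`
print(rest)
```

Iterating over the file or calling readlines() both work on their own; the problem only appears when the two are nested.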

You can use enumerate to keep track of the current line number. Having seen your edit to the question, here is another way:

for file2 in data_files2:

    with open(file2, 'r') as f:  # use context manager to open files
        lines = f.readlines()
        for n, line in enumerate(lines):

            if line.startswith("id,"):
                game_id = line.rstrip().split(",")[1]

                for o in range(n + 1, n + 7):

                    linedata = lines[o].rstrip().split(",")
                    spec = linedata[1]

                    if spec == "visteam":
                        awayteam = linedata[2]
                    elif spec == "hometeam":
                        hometeam = linedata[2]
                    elif spec == "date":
                        date = linedata[2]
                    elif spec == "site":
                        site = linedata[2]
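
If the index a has to stay a plain variable as in the question, one option is to drop the outer for loop and call readlines() once. This is only a sketch: the function name find_game_ids and the sample data are hypothetical, and the increment of a is assumed to live in the part of the question's code that was truncated.

```python
import io

def find_game_ids(f):
    """Return (index, game_id) pairs for every line whose first field is 'id'."""
    lines = f.readlines()  # read once, starting from the very first line
    found = []
    a = 0
    while a < len(lines):
        temp_array = lines[a].rstrip().split(",")
        if temp_array[0] == "id":
            found.append((a, temp_array[1]))
        a += 1  # advance the index, otherwise the loop never terminates
    return found

# io.StringIO stands in for an open file; the contents are invented.
sample = io.StringIO("id,GAME001\ninfo,visteam,AAA\nid,GAME002\n")
result = find_game_ids(sample)
print(result)
```

With a real file, the same function can be called as find_game_ids(f) inside the "with open(file2, 'r') as f:" block.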

You should also consider using the csv library for working with CSV files.
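
A rough sketch of what that could look like with csv.reader. The sample rows, the "info" first field and the games list are assumptions made for illustration, and this version collects each game's fields into a dict rather than looking ahead a fixed number of lines:

```python
import csv
import io

# io.StringIO stands in for an open file; the rows below are invented.
sample = io.StringIO(
    "id,GAME001\n"
    "info,visteam,AAA\n"
    "info,hometeam,BBB\n"
    "info,date,2019/10/29\n"
    "info,site,PARK01\n"
)

wanted = ("visteam", "hometeam", "date", "site")
games = []

for row in csv.reader(sample):
    if not row:
        continue  # skip blank lines
    if row[0] == "id":
        # A new game starts at every "id" row.
        games.append({"id": row[1]})
    elif games and len(row) >= 3 and row[1] in wanted:
        # Attach the field to the most recently seen game.
        games[-1][row[1]] = row[2]

print(games)
```

When reading from disk, the csv documentation recommends opening the file with newline="", for example open(file2, 'r', newline='').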

Answer 1
0 2019-10-29 04:28:32
