
Reading two files by comparing all lines of file 2 with each line of file 1

I'm trying to read two files and compare a date column in each. If the dates match, I want to compare the two values that correspond to those dates. I want to read one line of file 1 against all the lines of file 2, then the next line of file 1 against all the lines of file 2, and so on. However, when I try to compare the dates, the nested for loop that reads the two files only runs once. How can I compare file 1 and file 2 as described?

with open('file1.txt') as f1:
    with open('file2.txt') as f2:
        for i in f1:
            column1f1 = i.split()[0]
            column2f1 = i.split()[1]
            for j in f2:
                column1f2 = j.split()[0]
                column2f2 = j.split()[1]
                print(column1f1)
                print(column1f2)

I expected this to print every line of file 2 alongside the first line of file 1, and then to repeat for each remaining line of file 1. Instead, it only runs for the first line of file 1 and then stops.

When Python iterates over the second file, it advances the file's current position (the "cursor"), and by the end of the inner loop the cursor is at the end of the file. When the outer loop moves on to the next line of file 1 and the inner loop tries to iterate over file 2 again, it terminates immediately (the iterator raises StopIteration) because the cursor is already at the end of the file.

At the end of the inner loop, you need to move the file's current position (the cursor) back to the beginning of the file.
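To see the cursor behaviour in isolation, here is a minimal sketch. It uses io.StringIO as a stand-in for an open text file (an assumption made only so the example runs without files on disk), and the sample lines are made up for illustration.

```python
import io

# io.StringIO stands in for an open text file; it iterates and seeks the same way.
f2 = io.StringIO("2020-01-01 5\n2020-01-02 7\n")

# First pass consumes every line and leaves the cursor at the end.
first_pass = [line.split()[0] for line in f2]

# Second pass starts where the first one stopped, so it finds nothing.
second_pass = [line.split()[0] for line in f2]

# Move the cursor back to the start, then iterate again.
f2.seek(0)
third_pass = [line.split()[0] for line in f2]

print(first_pass)   # ['2020-01-01', '2020-01-02']
print(second_pass)  # []
print(third_pass)   # ['2020-01-01', '2020-01-02']
```

The empty second pass is exactly what the inner loop sees for every line of file 1 after the first one.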

Applied to your code, that gives:

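date_location = 0
numeric_value_location = 1
with open('file1.txt') as f1:
    with open('file2.txt') as f2:
        for i in f1:
            f1_fields = i.split()
            f1_date = f1_fields[date_location]
            # Convert to float so the values compare numerically, not as strings
            # (assumes the second column holds a number)
            f1_numeric = float(f1_fields[numeric_value_location])
            for j in f2:
                f2_fields = j.split()
                f2_date = f2_fields[date_location]
                f2_numeric = float(f2_fields[numeric_value_location])
                if f1_date == f2_date:
                    if f2_numeric < f1_numeric:
                        pass  # Do something

            # Rewind file 2 so the next line of file 1 is compared against all of it
            f2.seek(0)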

I changed the code, hopefully as you requested. Please note:

  1. The split operation can be reduced to one line (this assumes each line has exactly two columns):

     f1_date, f1_number = i.split()

  2. The date comparison, which I added following the request in the comments, WILL BREAK at some point because it compares the dates as strings. The right way to do it is to parse each date string into a datetime object and then compare those.

  3. Notice that I replaced the hard-coded 0 and 1 indexes with named variables to give the code more meaning. Try to use this practice in the future.
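The three notes above can be combined into one sketch. The question does not show the real date format, so DATE_FORMAT below is an assumption ("%Y-%m-%d"), and parse_line and find_smaller_values are hypothetical helper names introduced only for this illustration. io.StringIO again stands in for the two open files.

```python
import io
from datetime import datetime

# Assumption: dates look like 2020-01-05. Adjust to the real format in your files.
DATE_FORMAT = "%Y-%m-%d"


def parse_line(line):
    """Split a two-column line into a (date, float) pair."""
    date_text, value_text = line.split()
    return datetime.strptime(date_text, DATE_FORMAT).date(), float(value_text)


def find_smaller_values(f1, f2):
    """Collect (date, f1_value, f2_value) where dates match and f2's value is smaller."""
    matches = []
    for line1 in f1:
        f1_date, f1_value = parse_line(line1)
        for line2 in f2:
            f2_date, f2_value = parse_line(line2)
            if f1_date == f2_date and f2_value < f1_value:
                matches.append((f1_date, f1_value, f2_value))
        # Rewind file 2 before moving on to the next line of file 1.
        f2.seek(0)
    return matches


# Made-up sample data; with real files, pass the objects returned by open().
f1 = io.StringIO("2020-01-05 10\n2020-01-06 3\n")
f2 = io.StringIO("2020-1-5 4\n2020-01-06 8\n")
result = find_smaller_values(f1, f2)
print(result)
```

In this sample, "2020-1-5" and "2020-01-05" are different strings but parse to the same date, which is the kind of case where a plain string comparison would miss a match.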

Hopefully, that's what you asked for. I highly recommend going through a quick Python tutorial to give yourself a jump-start. Good luck.

See this post for more details: seek() function?
