Search for content segments of one .txt file in another .txt file digit by digit and print matching lines
I have two txt files: file1.txt and file2.txt.
file1.txt contains the following:
12345678
file2.txt contains this:
34567999
23499899
13571234
I now want to look at the first 3 digits of line 1 of file1.txt (which are "123"). I then want to go to file2.txt and search for these three digits ("123"). When I find these digits in that order in a line (i.e., this would be the case in line 3: 1357 123 4), I want to write this line to a new file: file_new.txt.
Then, once all lines in file2.txt have been searched for this sequence from file1.txt ("123"), I want to move one digit further in file1.txt, so that the new search query is "234". Now I want to go through file2.txt again to search for all lines with "234" in them (i.e., line 2 (234 99899) and line 3 (13571 234)). As line 3 is already contained in file_new.txt, I only want to write line 2 to file_new.txt.
I want to continue this process, searching for the next three digits, until the whole line in file1.txt has been searched for in file2.txt.
Could someone please help me tackle this problem?
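To make the procedure concrete, here is a minimal sketch of the search queries it produces for the sample line. The helper name `windows` and its `width` parameter are made up for illustration and are not part of the question.

```python
def windows(line, width=3):
    """Return every run of `width` consecutive characters in `line`."""
    return [line[i:i + width] for i in range(len(line) - width + 1)]

print(windows("12345678"))
# ['123', '234', '345', '456', '567', '678']
```

An 8-digit line therefore yields 6 queries, and a line shorter than 3 characters yields none.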
You can use readlines to read a text file into a list, and then build a new list L with a while loop, as below. You can then write this list L to a text file.
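file1_path = 'file1.txt'
file2_path = 'file2.txt'

with open(file1_path) as file1:
    search_string = file1.readline().strip()  # first line, without the trailing newline
with open(file2_path) as file2:
    strings_to_search = file2.readlines()

L = []
n = 0
# Slide a 3-character window over the search string; stop at the last full window
while n <= len(search_string) - 3:
    for i in strings_to_search:
        if search_string[n:n+3] in i and i not in L:
            L.append(i)
    n += 1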
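The answer mentions writing the list L to a text file but does not show that step. A minimal sketch follows, assuming the output file is the file_new.txt from the question; the helper name `write_lines` is hypothetical, and the sample list stands in for the L built above.

```python
def write_lines(path, lines):
    """Write each entry of `lines` to `path`, one entry per line."""
    with open(path, "w") as out:
        for line in lines:
            # Normalise line endings: the last line read from file2.txt
            # may not end with a newline.
            out.write(line.rstrip("\n") + "\n")

# Stand-in for the list L produced from the question's sample data
L = ["13571234\n", "23499899\n", "34567999"]
write_lines("file_new.txt", L)
```

Stripping and re-adding the newline keeps two matches from running together on one line when the final line of file2.txt has no trailing newline.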
I have a small solution here:
with open('file1.txt', 'r') as f1:  # open in read mode
    first_line = f1.readline().strip()  # read the line once and drop the newline
with open('file2.txt', 'r') as f2:  # open in read mode
    lines = f2.readlines()  # read all lines
file_new = []
for digit in range(len(first_line) - 2):
    threedigits = first_line[digit:digit+3]  # the current three-digit window
    for i in lines:
        if threedigits in i and i not in file_new:
            file_new.append(i)  # add each new line containing the window
with open('file_new.txt', 'w') as f3:  # open in write mode
    for line in file_new:
        f3.write(line)  # write every matching line once
This should do it.
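Both answers check `not in` against a list, which rescans the list on every test. As a possible variation for larger inputs, the sketch below tracks already-matched lines in a set while keeping the order in which they were first found. The function name `find_matching_lines` and its parameters are assumptions for illustration, and the input is passed in as plain strings rather than read from the files.

```python
def find_matching_lines(query, candidates, width=3):
    """Return the candidates that contain any `width`-character window of
    `query`, in the order in which they are first matched."""
    matches = []
    seen = set()  # constant-time membership test instead of scanning a list
    for start in range(len(query) - width + 1):
        window = query[start:start + width]
        for line in candidates:
            if window in line and line not in seen:
                seen.add(line)
                matches.append(line)
    return matches

# Sample data from the question
result = find_matching_lines("12345678", ["34567999", "23499899", "13571234"])
print(result)
# ['13571234', '23499899', '34567999']
```

As in the answers above, lines are deduplicated by their text, so two identical lines in file2.txt would be written only once.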