Using loops to search for names from file1 in file2 and writing to file3
I'm new to Python and kind of pulling my hair out here. I've tried several things for a few hours with no luck.
I think it's fairly simple, hopefully. I'm trying to search for names from file1 in file2, stripping the newline character from each name after it is read and then matching. If a match is found, I want to write the whole line from file2 to file3. If nothing is found, I want to write just the name to file3.
File1:
Abigail
Alexa
Jamie
File2:
Abigail,infoA,infoB,InfoC
John,infoA,infoB,InfoC
Jamie,infoA,infoB,InfoC
File3:
Abigail,infoA,infoB,InfoC
Alexa
Jamie,infoA,infoB,InfoC
Test Data file1:
abigail
anderson
jan
jane
jancith
larry
bob
bobbie
shirley
sharon
Test Data file2:
abigail,infoA,infoB,infoC
anderson,infoA,infoB,infoC
jan,infoA,infoB,infoC
jancith,infoA,infoB,infoC
larry,infoA,infoB,infoC
bob,infoA,infoB,infoC
bobbie,infoA,infoB,infoC
sharon,infoA,infoB,infoC
This version worked but only read and wrote the first instance.
    import re

    f1 = open("file1.txt", "r")
    f2 = open("file2.txt", "r")
    f3 = open("file3.txt", "w")

    for nameinfo in f1:
        nameinfo = nameinfo.rstrip()
        for listinfo in f2:
            if re.search(nameinfo, listinfo):
                f3.write(listinfo)
            else:
                f3.write(nameinfo)
This version worked, but it wrote the name (that had no match) over and over while looping between matches.
    import re

    f1 = open("file1.txt", "r")
    f2 = open("file2.txt", "r")
    f3 = open("file3.txt", "w")

    list2 = f2.readlines()
    for nameinfo in f1:
        nameinfo = nameinfo.rstrip()
        for listinfo in list2:
            if re.search(nameinfo, listinfo):
                f3.write(listinfo)
            else:
                f3.write(nameinfo)
Is it possible to use simple, basic loop commands to achieve the desired results? Help with learning would be greatly appreciated. I see many examples that look incredibly complex or hard to follow. I'm just starting out, so simple, basic methods would be best for learning the basics.
The reason your second solution keeps writing the unfound name is that it checks every line of file2.txt for a match and writes to file3.txt on each iteration, whether or not that line matched.
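As a side note, the first version in the question most likely stops after the first name for a different reason: a file object can only be iterated once, so the inner loop over f2 has nothing left to read on later passes. Below is a minimal sketch of that behavior. It uses io.StringIO as a stand-in for an open file2.txt, and the sample contents are made up for illustration.

```python
import io

# io.StringIO stands in for an open text file; the contents are illustrative only
f2 = io.StringIO("Abigail,infoA\nJohn,infoA\nJamie,infoA\n")

first_pass = [line for line in f2]   # consumes the file to the end
second_pass = [line for line in f2]  # nothing left to read

print(len(first_pass))   # 3
print(len(second_pass))  # 0

# Rewinding with seek(0), or reading into a list once, makes the lines reusable
f2.seek(0)
lines = f2.readlines()
print(len(lines))        # 3
```

Reading file2 into a list once, as the second version in the question does with readlines(), avoids this problem.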
To stop the repeated writes, introduce a new variable that stores the value you want to add to file3.txt, and only write that value to the file after the inner loop has finished.
Here is a working example:
    import re

    # .read().split('\n') creates a list with each line as an item
    f1 = open("file1.txt", "r").read().split('\n')
    f2 = open("file2.txt", "r").read().split('\n')
    f3 = open("file3.txt", "w")

    for name in f1:
        # Edit: skip empty lines so no additional newline is written
        if name == '':
            continue
        f3_text = name
        for line in f2:
            # If we find a match, overwrite the name value in f3_text.
            # Edit 2: don't match on partial names.
            # The rf"..." literal is a raw f-string, if you haven't seen one before.
            # Edit 3: a regex lets us use the ^ character, which means start of line,
            # so that "ron" doesn't match "Sharon".
            # re.escape keeps special characters in a name from being read as regex syntax.
            if re.search(rf"^{re.escape(name)},", line):
                f3_text = line
        # At this point f3_text is just the name if we never found a match,
        # or the entire line if a match was found
        f3.write(f3_text + '\n')

    f3.close()
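To see why the anchored pattern matters, here is a small check against lines taken from the test data in the question. The helper name `matches` is hypothetical, and `re.escape` is used as a precaution so that a name containing regex special characters is treated literally.

```python
import re

def matches(name, line):
    # ^ anchors at the start of the line; the trailing comma ends the name field
    return re.search(rf"^{re.escape(name)},", line) is not None

print(matches("jan", "jan,infoA,infoB,infoC"))      # True
print(matches("jan", "jancith,infoA,infoB,infoC"))  # False
print(matches("ron", "sharon,infoA,infoB,infoC"))   # False

# An unanchored search would treat "ron" as a hit inside "sharon"
print(re.search("ron", "sharon,infoA,infoB,infoC") is not None)  # True
```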
Edit:
The reason for the additional new line is that, if you look at f1, you will see it actually contains 4 items:
    f1 = ['Abigail', 'Alexa', 'Jamie', '']
This means the outer for loop runs 4 times, and on the last iteration f3_text = '', which causes an additional new line to be appended. I added a check to the for loop to account for this.
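A quick way to see where the trailing empty string comes from, and one possible way to avoid it: str.splitlines() does not produce an empty final item when the text ends with a newline. The string below mimics the contents of file1.txt from the question.

```python
text = "Abigail\nAlexa\nJamie\n"  # file1.txt ends with a newline

print(text.split('\n'))   # ['Abigail', 'Alexa', 'Jamie', '']
print(text.splitlines())  # ['Abigail', 'Alexa', 'Jamie']
```

A blank line in the middle of the file would still come through as an empty string with either method, so the empty-name check in the loop remains useful.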
You can also write it in pure Python without using the regex module (if you don't want to learn its mini-language):
    with open("file1.txt", "r") as f:
        names = f.readlines()
    with open("file2.txt", "r") as f:
        lines = f.readlines()

    names = [name.strip() for name in names]  # strip newlines and surrounding whitespace

    with open("file3.txt", "w") as f:
        for name in names:
            if not name:
                continue  # skip blank lines in file1
            to_write = name + '\n'
            for line in lines:
                # Compare against the first field so "ron" does not match "sharon";
                # on a match, replace to_write and break out of the for loop
                if line.startswith(name + ','):
                    to_write = line.rstrip('\n') + '\n'
                    break
            f.write(to_write)
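If file2 grows large, rescanning every line for every name can become slow. One alternative is to build a dictionary once and look each name up in it. This is only a sketch, and it assumes that the name is always the first comma-separated field and that only the first occurrence of a name in file2 matters. The function name `merge_names` and the in-memory lists are hypothetical stand-ins for the real files.

```python
def merge_names(names, lines):
    # Map each name (the text before the first comma) to its full line
    by_name = {}
    for line in lines:
        line = line.rstrip('\n')
        if line:
            # setdefault keeps the first line seen for a given name
            by_name.setdefault(line.split(',', 1)[0], line)
    # Use the full line when the name is known, otherwise just the name
    return [by_name.get(name, name) for name in names if name]

names = ["Abigail", "Alexa", "Jamie"]
lines = [
    "Abigail,infoA,infoB,InfoC\n",
    "John,infoA,infoB,InfoC\n",
    "Jamie,infoA,infoB,InfoC\n",
]
for row in merge_names(names, lines):
    print(row)
```

With the sample data from the question this prints the full lines for Abigail and Jamie and only the name for Alexa.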