
Using loops to search for names from file1 in file2 and writing to file3

I'm new to Python and kind of pulling my hair out here. I've tried several things for a few hours with no luck.

I think it's fairly simple, hopefully. I'm trying to search file2 for each name from file1, stripping the newline character from each name after it is read and then matching. If a match is found, I want to write the whole line from file2 to file3. If nothing is found, I want to write just the name to file3.

File1:

Abigail
Alexa
Jamie

File2:

Abigail,infoA,infoB,InfoC
John,infoA,infoB,InfoC
Jamie,infoA,infoB,InfoC

File3:

Abigail,infoA,infoB,InfoC
Alexa
Jamie,infoA,infoB,InfoC

Test Data file1:

abigail
anderson
jan
jane
jancith
larry
bob
bobbie
shirley
sharon

Test Data file2:

abigail,infoA,infoB,infoC
anderson,infoA,infoB,infoC
jan,infoA,infoB,infoC
jancith,infoA,infoB,infoC
larry,infoA,infoB,infoC
bob,infoA,infoB,infoC
bobbie,infoA,infoB,infoC
sharon,infoA,infoB,infoC

This version worked but only read and wrote the first instance.

import re

f1 = open("file1.txt", "r")
f2 = open("file2.txt", "r")
f3 = open("file3.txt", "w")

for nameinfo in f1:
    nameinfo = nameinfo.rstrip()

    for listinfo in f2:
        if re.search(nameinfo, listinfo):
            f3.write(listinfo)
        else:
            f3.write(nameinfo)

This version worked but it wrote the name (that had no match) over and over while looping between matches.

import re

f1 = open("file1.txt", "r")
f2 = open("file2.txt", "r")
f3 = open("file3.txt", "w")

list2 = f2.readlines()

for nameinfo in f1:
    nameinfo = nameinfo.rstrip()

    for listinfo in list2:
        if re.search(nameinfo, listinfo):
            f3.write(listinfo)
        else:
            f3.write(nameinfo)

Is it possible to use simple basic loop commands to achieve the desired results? Help with learning would be greatly appreciated. I see many examples that look incredibly complex or kind of hard to follow. I'm just starting out so simple basic methods would be best in learning the basics.

Your second solution keeps writing the unmatched name because the inner loop checks every line of file2.txt, and the else branch writes the name to file3.txt each time a line does not match.

What you can do instead is introduce a new variable that holds the value you want to add to file3.txt, and write that value to the file only once, after the inner loop has finished.
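To see the repetition in isolation, here is a small sketch that uses in-memory lists in place of the files. The variable names `names`, `lines` and `output` are made up for illustration and are not from your code:

```python
import re

# Stand-ins for the contents of file1.txt and file2.txt
names = ["Abigail", "Alexa"]
lines = [
    "Abigail,infoA,infoB,InfoC",
    "John,infoA,infoB,InfoC",
    "Jamie,infoA,infoB,InfoC",
]

output = []  # stands in for file3.txt
for name in names:
    for line in lines:
        if re.search(name, line):
            output.append(line)
        else:
            # Runs once for every line that does NOT match this name
            output.append(name)

for item in output:
    print(item)
```

With three lines in `lines`, "Alexa" (no match at all) ends up in `output` three times, and "Abigail" is written twice more after its one matching line.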

Here is a working example:

import re

# Note the .read().split('\n'): this creates a list with each line as an item
f1 = open("file1.txt", "r").read().split('\n')
f2 = open("file2.txt", "r").read().split('\n')
f3 = open("file3.txt", "w")

for name in f1:
    # Edit: skip empty entries so no additional newline is written
    if name == '':
        continue

    # Default to just the name; replaced below if a match is found
    f3_text = name

    for line in f2:
        # Edit 2: don't match on partial names.
        # The rf"..." literal is a raw f-string, in case you haven't seen them before.
        # Edit 3: in a regex, ^ means start of line, and the trailing comma ends
        # the name, so "ron" doesn't match "sharon" and "jan" doesn't match "jancith".
        # re.escape keeps any special characters in the name literal.
        if re.search(rf"^{re.escape(name)},", line):
            f3_text = line

    # At this point f3_text is just the name if we never found a match,
    # or the entire line if a match was found
    f3.write(f3_text + '\n')

f3.close()
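If it helps, here is a minimal sketch of why the `^` and the trailing comma matter, using one line from your test data. The names `loose`, `strict` and `exact` are only for illustration, and `re.escape` is an optional extra that treats the name as literal text:

```python
import re

line = "sharon,infoA,infoB,infoC"

# Unanchored search: "ron" is found inside "sharon"
loose = bool(re.search("ron", line))

# Anchored search: the name must start the line and be followed by a comma
strict = bool(re.search(rf"^{re.escape('ron')},", line))
exact = bool(re.search(rf"^{re.escape('sharon')},", line))

print(loose, strict, exact)  # True False True
```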

Edit:

The reason for the additional newline is that, if you look at f1, you will see it actually has 4 items:

f1 = ['Abigail', 'Alexa', 'Jamie', '']

This means the outer for loop runs 4 times, and on the last iteration f3_text = '', which causes an additional newline to be appended. I added a check to the for loop to account for this.
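As a side note, the trailing empty item comes from `split('\n')` when the file ends with a newline. A quick sketch of the difference from `str.splitlines()`, which could be used instead (the string `text` here just stands in for the file contents):

```python
# Stand-in for the contents of file1.txt, ending with a newline
text = "Abigail\nAlexa\nJamie\n"

with_split = text.split("\n")
with_splitlines = text.splitlines()

print(with_split)       # ['Abigail', 'Alexa', 'Jamie', '']
print(with_splitlines)  # ['Abigail', 'Alexa', 'Jamie']
```

The empty-string check is still worth keeping if the file might contain blank lines in the middle.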

You can also write it in pure Python without using the regex module (if you don't want to learn its mini-language):

with open("file1.txt", "r") as f:
    names = f.readlines()

with open("file2.txt", "r") as f:
    lines = f.readlines()

# Strip surrounding whitespace (including the newline) and drop blank entries
names = [name.strip() for name in names if name.strip()]

with open("file3.txt", "w") as f:
    for name in names:
        to_write = name

        for line in lines:
            # Match the whole first field, so "jan" does not match "jancith"
            if line.startswith(name + ','):
                # Found a match: replace 'to_write' and break out of the for loop
                to_write = line.rstrip('\n')
                break

        f.write(to_write + '\n')
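If file2 gets large, rescanning it for every name can become slow. One possible alternative, sketched here under the assumption that the name is always the first comma-separated field, is to build a dictionary once and look each name up. The function name `merge_names` is hypothetical, and the lists stand in for the lines read from the two files:

```python
def merge_names(names, lines):
    """Return one output line per name: the full record if found, else the name."""
    records = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Assumption: the name is everything before the first comma
        key = line.split(",", 1)[0]
        # Keep the first record if a name appears more than once
        records.setdefault(key, line)

    result = []
    for name in names:
        name = name.strip()
        if not name:
            continue
        # Fall back to the bare name when there is no record for it
        result.append(records.get(name, name))
    return result


names = ["abigail", "jan", "jane", "shirley", "sharon"]
lines = [
    "abigail,infoA,infoB,infoC",
    "jan,infoA,infoB,infoC",
    "jancith,infoA,infoB,infoC",
    "sharon,infoA,infoB,infoC",
]

merged = merge_names(names, lines)
for row in merged:
    print(row)
```

The nested loops are perfectly fine while you are learning; this is just an option for later.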

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 