简体   繁体   中英

Replace Single Character in a line of a Text file with Python

I have a text file with all of them currently having the same end character (N), which is being used to identify progress the system makes. I want to change the end character to "Y" in case the program ends via an error or other interruptions so that upon restarting the program will search until a line has the end character "N" and begin working from there. Below is my code as well as a sample from the text file.

UPDATED CODE:

def GeoCode():
    f = open("geocodeLongLat.txt", "a")
    with open("CstoGC.txt",'r') as file:
        print("Geocoding...")
        new_lines = []
        for line in file.readlines():
            check = line.split('~')
            print(check)
            if 'N' in check[-1]:
                geolocator = Nominatim()
                dot_number, entry_name, PHY_STREET,PHY_CITY,PHY_STATE,PHY_ZIP = check[0],check[1],check[2],check[3],check[4],check[5] 
                address = PHY_STREET + " " + PHY_CITY + " " + PHY_STATE + " " + PHY_ZIP
                f.write(dot_number + '\n')
                try:
                    location = geolocator.geocode(address)
                    f.write(dot_number + "," + entry_name + "," + str(location.longitude) + "," + str(location.latitude) + "\n")
                except AttributeError:
                    try:
                        address = PHY_CITY + " " + PHY_STATE + " " + PHY_ZIP
                        location = geolocator.geocode(address)
                        f.write(dot_number + "," + entry_name + "," + str(location.longitude) + "," + str(location.latitude) + "\n")
                    except AttributeError:
                        print("Cannot Geocode")
            check[-1] = check[-1].replace('N','Y')
        new_lines.append('~'.join(check))

    with open('CstoGC.txt','r+') as file: # IMPORTANT to open as 'r+' mode as 'w/w+' will truncate your file!
        for line in new_lines:
            file.writelines(line)        

    f.close()

Output:

2967377~DARIN COLE~22112 TWP RD 209~ALVADA~OH~44802~Y
WAY 64 SUITE 100~EADS~TN~38028~N
384767~MILLER FARMS TRANS LLC~1103 COURT ST~BEDFORD~IA~50833~N
986150~R G S TRUCKING LTD~1765 LOMBARDIE DRIVE~QUESNEL~BC~V2J 4A8~N
1012987~DONALD LARRY KIVETT~4509 LANSBURY RD~GREENSBORO~NC~27406-4509~N
735308~ALZEY EXPRESS INC~2244  SOUTH GREEN STREET~HENDERSON~KY~42420~N
870337~RIES FARMS~1613 255TH AVENUE~EARLVILLE~IA~52057~N
148428~P R MASON & SON LLC~HWY 70 EAST~WILLISTON~NC~28589~N
220940~TEXAS MOVING CO INC~908 N BOWSER RD~RICHARDSON~TX~75081-2869~N
854042~ARMANDO ORTEGA~6590 CHERIMOYA AVENUE~FONTANA~CA~92337~N
940587~DIAMOND A TRUCKING INC~192285 E COUNTY ROAD 55~HARMON~OK~73832~N
1032455~INTEGRITY EXPRESS LLC~380 OLMSTEAD AVENUE~DEPEW~NY~14043~N
889931~DUNSON INC~33 CR 3581~FLORA VISTA~NM~87415~N
143608~LARRY A PETERSON & DONNA M PETERSON~W6359 450TH AVE~ELLSWORTH~WI~54011~N
635528~JAMES E WEBB~3926 GREEN ROAD~SPRINGFIELD~TN~37172~N
805496~WAYNE MLADY~22272 135TH ST~CRESCO~IA~52136~N
704996~SAVINA C MUNIZ~814 W LA QUINTA DR~PHARR~TX~78577~N
893169~BINDEWALD MAINTENANCE INC~213 CAMDEN DR~SLIDELL~LA~70459~N
948130~LOGISTICIZE LTD~861 E PERRY ST~PAULDING~OH~45879~N
438760~SMOOTH OPERATORS INC~W8861 CREEK ROAD~DARIEN~WI~53114~N
518872~A B C RELOCATION SERVICES INC~12 BOCKES ROAD~HUDSON~NH~03051~N
576143~E B D ENTERPRISES INC~29 ROY ROCHE DRIVE~WINNIPEG~MB~R3C 2E6~N
968264~BRIAN REDDEMANN~706 WESTGOR STREET~STORDEN~MN~56174-0220~N
721468~QUALITY LOGISTICS INC~645 LEONARD RD~DUNCAN~SC~29334~N

As you can see I am already keeping track of which line I am at just by using x. Should I use something like file.readlines()?

Sample of text document:

570772~CORPORATE BANK TRANSIT OF KENTUCKY INC~3157 HIGHWAY 64 SUITE 100~EADS~TN~38028~N
384767~MILLER FARMS TRANS LLC~1103 COURT ST~BEDFORD~IA~50833~N
986150~R G S TRUCKING LTD~1765 LOMBARDIE DRIVE~QUESNEL~BC~V2J 4A8~N
1012987~DONALD LARRY KIVETT~4509 LANSBURY RD~GREENSBORO~NC~27406-4509~N
735308~ALZEY EXPRESS INC~2244  SOUTH GREEN STREET~HENDERSON~KY~42420~N
870337~RIES FARMS~1613 255TH AVENUE~EARLVILLE~IA~52057~N
148428~P R MASON & SON LLC~HWY 70 EAST~WILLISTON~NC~28589~N
220940~TEXAS MOVING CO INC~908 N BOWSER RD~RICHARDSON~TX~75081-2869~N
854042~ARMANDO ORTEGA~6590 CHERIMOYA AVENUE~FONTANA~CA~92337~N
940587~DIAMOND A TRUCKING INC~192285 E COUNTY ROAD 55~HARMON~OK~73832~N
1032455~INTEGRITY EXPRESS LLC~380 OLMSTEAD AVENUE~DEPEW~NY~14043~N
889931~DUNSON INC~33 CR 3581~FLORA VISTA~NM~87415~N

Thank you!

Edit: updated code thanks to @idlehands

There are a few ways to do this.

Option #1

My original thought was to use the tell() and seek() method to go back a few steps but it quickly shows that you cannot do this conveniently when you're not opening the file in bytes and definitely not in a for loop of readlines() . You can see the reference threads here:

Is it possible to modify lines in a file in-place?
How to solve "OSError: telling position disabled by next() call"

The investigation led to this piece of code:

with open('file.txt','rb+') as file:
    line = file.readline() # initiate the loop
    while line: # continue while line is not None
        print(line)
        check = line.split(b'~')[-1]
        if check.startswith(b'N'): # carriage return is expected for each line, strip it

            # ... do stuff ... #

            file.seek(-len(check), 1) # place the buffer at the check point
            file.write(check.replace(b'N', b'Y')) # replace "N" with "Y"
        line = file.readline() # read next line

In the first referenced thread one of the answers mentioned this could lead you to potential problems, and directly modifying the bytes on the buffer while reading it is probably considered a bad idea™. A lot of pros probably will scold me for even suggesting it.

Option #2a

(if file size is not horrendously huge)

with open('file.txt','r') as file:
    new_lines = []
    for line in file.readlines():
        check = line.split('~')
        if 'N' in check[-1]:

            # ... do stuff ... #

            check[-1] = check[-1].replace('N','Y')
        new_lines.append('~'.join(check))

with open('file.txt','r+') as file: # IMPORTANT to open as 'r+' mode as 'w/w+' will truncate your file!
    for line in new_lines:
        file.writelines(line)

This approach loads all the lines into memory first, so you do the modification in memory but leave the buffer alone. Then you reload the file and write the lines that were changed. The caveat is that technically you are rewriting the entire file line by line - not just the string N even though it was the only thing changed.

Option #2b

Technically you could open the file as r+ mode from the onset and then after the iterations have completed do this (still within the with block but outside of the loop):

# ... new_lines.append('~'.join(check)) #
    file.seek(0)
    for line in new_lines: 
        file.writelines(line)

I'm not sure what distinguishes this from Option #1 since you're still reading and modifying the file in the same go. If someone more proficient in IO/buffer/memory management wants to chime in please do.

The disadvantage for Option 2a/b is that you always end up storing and rewriting the lines in the file even if you are only left with a few lines that needs to be updated from 'N' to 'Y'.

Results (for all solutions):

 570772~CORPORATE BANK TRANSIT OF KENTUCKY INC~3157 HIGHWAY 64 SUITE 100~EADS~TN~38028~Y 384767~MILLER FARMS TRANS LLC~1103 COURT ST~BEDFORD~IA~50833~Y 986150~RGS TRUCKING LTD~1765 LOMBARDIE DRIVE~QUESNEL~BC~V2J 4A8~Y 1012987~DONALD LARRY KIVETT~4509 LANSBURY RD~GREENSBORO~NC~27406-4509~Y 735308~ALZEY EXPRESS INC~2244 SOUTH GREEN STREET~HENDERSON~KY~42420~Y 870337~RIES FARMS~1613 255TH AVENUE~EARLVILLE~IA~52057~Y 148428~PR MASON & SON LLC~HWY 70 EAST~WILLISTON~NC~28589~Y 220940~TEXAS MOVING CO INC~908 N BOWSER RD~RICHARDSON~TX~75081-2869~Y 854042~ARMANDO ORTEGA~6590 CHERIMOYA AVENUE~FONTANA~CA~92337~Y 940587~DIAMOND A TRUCKING INC~192285 E COUNTY ROAD 55~HARMON~OK~73832~Y 1032455~INTEGRITY EXPRESS LLC~380 OLMSTEAD AVENUE~DEPEW~NY~14043~Y 889931~DUNSON INC~33 CR 3581~FLORA VISTA~NM~87415~Y 

And if you were to say, encountered a break at the line starting with 220940 , the file would become:

 570772~CORPORATE BANK TRANSIT OF KENTUCKY INC~3157 HIGHWAY 64 SUITE 100~EADS~TN~38028~Y 384767~MILLER FARMS TRANS LLC~1103 COURT ST~BEDFORD~IA~50833~Y 986150~RGS TRUCKING LTD~1765 LOMBARDIE DRIVE~QUESNEL~BC~V2J 4A8~Y 1012987~DONALD LARRY KIVETT~4509 LANSBURY RD~GREENSBORO~NC~27406-4509~Y 735308~ALZEY EXPRESS INC~2244 SOUTH GREEN STREET~HENDERSON~KY~42420~Y 870337~RIES FARMS~1613 255TH AVENUE~EARLVILLE~IA~52057~Y 148428~PR MASON & SON LLC~HWY 70 EAST~WILLISTON~NC~28589~Y 220940~TEXAS MOVING CO INC~908 N BOWSER RD~RICHARDSON~TX~75081-2869~N 854042~ARMANDO ORTEGA~6590 CHERIMOYA AVENUE~FONTANA~CA~92337~N 940587~DIAMOND A TRUCKING INC~192285 E COUNTY ROAD 55~HARMON~OK~73832~N 1032455~INTEGRITY EXPRESS LLC~380 OLMSTEAD AVENUE~DEPEW~NY~14043~N 889931~DUNSON INC~33 CR 3581~FLORA VISTA~NM~87415~N 

There are pros and cons to these approaches. Try and see which one fits your use case the best.

I would read the entire input file into a list and .pop() the lines off one at a time. In case of an error, append the popped item to the list and write overwrite the input file. This way it will always be up to date and you won't need any other logic.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM