How to identify and replace words in a CSV file in Python

I have two CSV files: one contains sentences with abbreviations, and the other is a list of abbreviations and their expansions. I want to identify each abbreviation in the first CSV file and replace it with its expansion. This is what the CSV files look like:

sample of first file:

vp academic

vp finance and administration

vp academic and student affairs

vp corporate services and external relat. ....

sample of second file:

elect'l. : electrical

vp. : vice president

...

this is my code:

import csv

with open('firstFile.csv', 'rb') as sentence, \
     open('secondFile.csv', 'rb') as word, \
     open('new.csv', 'wb') as out:
    reader = csv.reader(sentence)
    reader2 = csv.reader(word)
    abbr_list = list(reader2)
    filewriter = csv.writer(out, delimiter=' ')

    result = ''
    for row in reader:
        for i in range(0, 1453):
            temp = abbr_list[i][0]
            temp1 = abbr_list[i][1]
            if temp in row[0]:
                result = row[0].replace(temp, temp1)
                row[0] = result

        filewriter.writerow(row)

However, the result I get is not what I was expecting:

result file:

vice president academic

vice president financiale and administrategytegyyion

vice president academic and student affairs

vice president corporate services and executivecutiveternal relatin

Can someone help me to correct my code?

Your string replacement (row[0].replace) does not check whether it matches an entire word. It therefore matches 'strat' inside 'administration' and turns it into 'administrategyion', then changes it again into 'administrategyegyion' with the next replacement, and so on.
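The effect is easy to reproduce in isolation. The snippet below is only a sketch: the entry 'strat' -> 'strategy' is a guess based on the output in the question, since the full abbreviation list is not shown.

```python
# Hypothetical abbreviation entry; the real list is not shown in full.
abbr, expansion = "strat", "strategy"

word = "administration"
# str.replace matches the substring anywhere, including inside longer words.
mangled = word.replace(abbr, expansion)
print(mangled)  # administrategyion
```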

You can either switch to the re module and use regular expressions for the replacement, or include spaces as part of the match (e.g. row[0].replace(' '+temp+' ', ' '+temp1+' ')). Be aware that the spaces approach will fail if the match is at the start or end of the string.
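One possible way to do the regular-expression version is sketched below. It is not the only option: the helper names (build_pattern, expand) and the sample mapping are made up for illustration, and the expansion of 'relat.' is assumed. Lookarounds are used instead of \b because several abbreviations end in a period, and \b does not match between a period and a following space or the end of the string.

```python
import re

def build_pattern(abbreviations):
    """Compile one regex that matches any abbreviation as a whole token."""
    # Longest first, so a longer abbreviation wins over a shorter prefix of it.
    ordered = sorted(abbreviations, key=len, reverse=True)
    alternation = "|".join(re.escape(a) for a in ordered)
    # (?<!\w) / (?!\w): the match must not touch a letter, digit or
    # underscore on either side, so 'strat' inside 'administration' is skipped.
    return re.compile(r"(?<!\w)(?:%s)(?!\w)" % alternation)

def expand(text, mapping, pattern=None):
    """Replace every whole-token abbreviation in text with its expansion."""
    if pattern is None:
        pattern = build_pattern(mapping)
    # A single pass over the text: expansions are never re-scanned,
    # so one replacement cannot trigger another.
    return pattern.sub(lambda m: mapping[m.group(0)], text)

# Hypothetical mapping; in the question it would come from the second CSV file.
mapping = {"vp": "vice president", "relat.": "relations", "strat": "strategy"}

print(expand("vp finance and administration", mapping))
print(expand("vp corporate services and external relat.", mapping))
```

Because the match is anchored with lookarounds rather than literal spaces, abbreviations at the very start or end of a line are handled too. When processing many rows, build the pattern once and pass it in through the pattern argument.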
