
Delete specific characters from only two columns in a CSV file using Python 3.5

I am trying to edit a CSV file containing 4 million rows of data with 19 columns. Two of the columns (the third and fourth) list names of individuals, and the names are written as "LastName, FirstName".

C00431445,"P80003338","Obama, Barack","DUCLOS, DUNCAN","CHICAGO","IL","606601303","OBAMA FOR AMERICA","ACCOUNTING MANAGER",77.65,08-AUG-08,"","","","SA17A","753821","5433431","P2008",

This is problematic because when I try to upload this file into MySQL using a comma as the delimiter, it splits the names in these two columns in half. I want to use Python 3.5 to select these two columns and remove the commas from inside them only, without deleting the commas that separate the other columns.

I am somewhat of a novice when it comes to coding, and any help is appreciated. I know it's possible to split these columns using .split() and then merge them back without the commas, but I wanted a cleaner method that removes the commas directly from the file.

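Use the csv module to read and write the file. It parses the quoted fields correctly, so the commas inside the names are not treated as delimiters: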

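import csv

# In Python 3, open CSV files in text mode with newline='' (not 'rb'/'wb').
# Write to a new file ('file_clean.csv' is just an example name) instead of
# overwriting the input, and process one row at a time so the 4 million rows
# never have to fit in memory.
with open('file.csv', newline='') as src, \
     open('file_clean.csv', 'w', newline='') as dst:
    reader = csv.reader(src)
    writer = csv.writer(dst, delimiter=',', quotechar='"', quoting=csv.QUOTE_ALL)

    for line in reader:
        # The third and fourth columns hold the "LastName, FirstName" values
        line[2] = line[2].replace(",", "")
        line[3] = line[3].replace(",", "")
        writer.writerow(line)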
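
Before running this over the full file, it may help to check the idea on a single row. The sketch below is only an illustration: the helper name strip_commas and its columns parameter are made up for this example, and the sample is a shortened version of the row from the question. It uses in-memory io.StringIO objects in place of real files.

```python
import csv
import io


def strip_commas(src, dst, columns=(2, 3)):
    """Copy CSV rows from src to dst, removing commas inside the given columns.

    strip_commas and its parameters are hypothetical names for this sketch.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, quoting=csv.QUOTE_ALL)
    for row in reader:
        for i in columns:
            if i < len(row):  # tolerate blank or short rows
                row[i] = row[i].replace(",", "")
        writer.writerow(row)


# Shortened version of the sample row from the question
sample = 'C00431445,"P80003338","Obama, Barack","DUCLOS, DUNCAN","CHICAGO","IL"\r\n'

out = io.StringIO()
strip_commas(io.StringIO(sample), out)
result = out.getvalue()
print(result)
```

With real files, the same function could be passed two file objects opened in text mode with newline='', as the csv documentation recommends.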

In MySQL, read the columns into @variables, then manipulate them as you store them into the actual columns:

LOAD DATA ...
    (id1, id2, @name1, @name2, ...)
    SET name1 = REPLACE(@name1, ',', ''),
        name2 = REPLACE(@name2, ',', '');
