简体   繁体   中英

Python- Modifying a csv file

Now I know it's usually not feasible to modify a csv file as you are reading from it so you need to create a new csv file and write to it. The problem I'm having is preserving the original order of the data.

The input csv file looks like follows:

C1       C2         C3
apple    BANANA     Mango
pear     PineApple  StRaWbeRRy

I want to turn all the data into lower case and output a new csv file that looks like:

C1       C2         C3
apple    banana     mango
pear     pineapple  strawberry

So far I can iterate through the input csv file and turn all the values into lower case but I don't know how to rewrite it back into a csv file in that format. The code I have is:

def clean (input)
  aList = []
  file = open(input, "r")
  reader = csv.reader(file, delimiter = ',')
  next(reader, None) # Skip the header but I want to preserve it in the output csv file
  for row in reader:
     for col in row:
        aList.append(col.lower())

So now I have a list with all the lowercase data, how do I rewrite it back into a csv file of the same format (same number of rows and columns) as the input including the header row that I skipped in the code.

Pandas way:

Read the file using pandas and get the dataframe. Then you can simply use lower()

import pandas as pd

def conversion(text):
    return text.lower()
    

df = pd.read_csv(file_path)
df[column_name] = df[column_name].map(conversion)

Or even a single liner:

df[column_name] = df[column_name].apply(lambda x: x.lower()) # If you have nan or other non-string values, you may need to convert x to string first like str(x).lower()

Then you can save it using to_csv function

If all you want to do is change the case of the data and preserve everything else you might be best to skip the csv module and just use a straight file eg:

# Open both files
with open("infile.csv") as f_in, open("outfile.csv", 'w') as f_out:
    # Write header unchanged
    header = f_in.readline()
    f_out.write(header)

    # Transform the rest of the lines
    for line in f_in:
        f_out.write(line.lower())

If you want to use csv module for all then use following code snippet.

import os
import csv


def clean(input):
    tmpFile = "tmp.csv"
    with open(input, "r") as file, open(tmpFile, "w") as outFile:
        reader = csv.reader(file, delimiter=',')
        writer = csv.writer(outFile, delimiter=',')
        header = next(reader)
        writer.writerow(header)
        for row in reader:
            colValues = []
            for col in row:
                colValues.append(col.lower())
            writer.writerow(colValues)
    os.rename(tmpFile, input)

the easiest way that i found is as follows let the initial CSV file name be test.csv

with open('test.csv','r') as f:
    with open('cleaned.csv','w') as ff:
        ff.write(f.readline())
        ff.write(f.read().lower())

the above code will create a new csv with all lower case

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM