简体   繁体   中英

How to copy CSV data to an existing xlsx file using Python

I am using Python 3.4 and I am writing a short script. I have an executable that take the output of a software and outputs a bunch of csv files with the requested data. I have created an Excel (xlsx) file as a template. It has a chart tailored to specific requirements.

I want to take a certain range of that data in each of those csv files and then input them into the existing excel template I have already created and save it with a unique file name. Essentially iterate this process.

Each csv file will have a unique name. My goal is to help automate creating graphs. Sometimes this can end up being 100s of graphs. I have searched a lot on how to do this with little help.

Again I would initiate the script and it would run through each csv file (whether there is 5 or 500) and then copy the data (a certain range which is always in the same cells) then paste it into the template xlsx file I have created and save it with a similar name to the csv except it will have .xlsx as the extension.

I do not know if this is the best approach or if I should create a csv template instead that it will copy to.

Any help is much appreciated, thank you.

1st approach

If your end goal is to generate graphs from data available in csv then you can use csvReader to read data and matplotlib to plot graphs.

Simple example:

Sample csv file:
1,10,45
2,20,30
3,30,90
4,40,80

import csv
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

keys = ['Col1','Col2','Col3']
col1 = []
col2 = []
col3 = []
fd = open('sample.csv','r')
reader = csv.DictReader(fd,fieldnames=keys)
for row in reader:
    col1.append(int(row['Col1']))
    col2.append(int(row['Col2']))
    col3.append(int(row['Col3']))

pp = PdfPages("Sample.pdf")

plt.title("Col1 Vs Col2")
plt.xlabel("X-Values")
plt.ylabel("Y-Values")
plt.plot(col1,col2,label="Label 1",marker = "*")
legend = plt.legend(loc='best', shadow=True, fontsize=6)
legend.get_frame().set_facecolor('#00FFCC')
plt.grid(True)
plt.savefig(pp,format='pdf')
plt.clf()

plt.title("Col1 Vs Col3")
plt.xlabel("X-Values")
plt.ylabel("Y-Values")
plt.plot(col1,col3,label="Lable 2",marker = "*")
legend = plt.legend(loc='best', shadow=True, fontsize=6)
legend.get_frame().set_facecolor('#00FFCC')
plt.grid(True)
plt.savefig(pp,format='pdf')
plt.clf()
pp.close()

References:

2nd approach

You can use xlrd,xlwt and xlutils to perform operation on excel files

Read data using csvReader, copy your existing template using xlutils, edit that and again save back

Reference:

You can get all files by using the glob module:

   import glob
   csv_file_list = glob.glob('*.csv')
   for fyle in csv_file_list:
       data = read_csv(fyle)
       write_to_excel(data)

Your read_csv() function should accept a CSV file and create an array of arrays (matrix) with the data. It's common to use the csv module (which in Python 3 doesn't require the unicodecsv 'add-on').

If the data is just numbers and you don't have to worry about quoted fields, then it's much faster to just read the rows. So your read_csv() function would look like this:

with open(fyle,'rb') as input:
    data = input.readlines().split(delim)
return data

Then your write_to_excel() function would accept 'data' and write to your template.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM