
Printing columns from a CSV file into an Excel file with Python

I am trying to write a script that reads every CSV file larger than 62 bytes, writes two of its columns to a separate Excel file, and creates a list.

The following is one of the csv files:

FileUUID        Table   RowInJSON       JSONVariable    Error   Notes   SQLExecuted
ff3ca629-2e9c-45f7-85f1-a3dfc637dd81    lng02_rpt_b_calvedets   1               Duplicate entry 'ETH0007805440544' for key 'nosameanimalid'             INSERT INTO lng02_rpt_b_calvedets(farmermobile,hh_id,rpt_b_calvedets_rowid,damidyesno,damid,calfdam_id,damtagid,calvdatealv,calvtype,calvtypeoth,easecalv,easecalvoth,birthtyp,sex,siretype,aiprov,othaiprov,strawidyesno,strawid)  VALUES ('0974502779','1','1','0','ETH0007805440544','ETH0007805470547',NULL,'2017-09-16','1',NULL,'1',NULL,'1','2','1',NULL,NULL,NULL,NULL,NULL,'0',NULL,NULL,NULL,NULL,NULL,NULL,'0',NULL,'Tv',NULL,NULL,'Et','23',NULL,'5',NULL,NULL,NULL,'0','0')

This is my attempt at solving this problem:

import csv
import glob
import os

path = 'csvs/'
for infile in glob.glob( os.path.join(path, '*csv') ):
    output = infile + '.out'
    with open(infile, 'r') as source:
        readr = csv.reader(source)
        with open(output,"w") as result:
            writr = csv.writer(result)
            for r in readr:
                writr.writerow((r[4], r[2]))

Please point me in the right direction, or suggest an alternative solution.

pandas does a lot of what you are trying to achieve:

import pandas as pd

# Read a csv file to a dataframe
df = pd.read_csv("<path-to-csv>")

# Filter two columns
columns = ["FileUUID", "Table"]
df = df[columns]

# Combine multiple dataframes
df_combined = pd.concat([df1, df2, df3, ...])

# Output dataframe to excel file
df_combined.to_excel("<output-path>", index=False)

To loop through all CSV files larger than 62 bytes, you can use glob.glob() and os.stat() (st_size is reported in bytes):

import os
import glob

import pandas as pd

dataframes = []

for csvfile in glob.glob("<csv-folder-path>/*.csv"):
    # st_size is the file size in bytes
    if os.stat(csvfile).st_size > 62:
        dataframes.append(pd.read_csv(csvfile))
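
Putting the two snippets above together, a minimal sketch might look like the following. The function name `collect_columns` and its parameters are hypothetical, chosen only for illustration. The sample file in the question looks tab-separated, so `sep="\t"` is assumed here; change it if your files really use commas.

```python
import glob
import os

import pandas as pd


def collect_columns(folder, columns, min_size=62, sep="\t"):
    """Read `columns` from every CSV in `folder` that is larger than `min_size` bytes."""
    frames = []
    for csvfile in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        # st_size is in bytes; files at or below the threshold are skipped
        if os.stat(csvfile).st_size > min_size:
            df = pd.read_csv(csvfile, sep=sep, usecols=columns)
            # usecols keeps file order, so reorder to the requested order
            frames.append(df[columns])
    if not frames:
        # pd.concat raises on an empty list, so return an empty frame instead
        return pd.DataFrame(columns=columns)
    return pd.concat(frames, ignore_index=True)
```

For example, `collect_columns("csvs", ["Error", "RowInJSON"])` would select the columns the question addresses as r[4] and r[2]. Calling `.to_excel("<output-path>", index=False)` on the result writes the Excel file (this needs an Excel engine such as openpyxl installed), and `df["Error"].tolist()` gives a plain Python list.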

Use the standard csv module. Don't re-invent the wheel.

https://docs.python.org/3/library/csv.html
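
As a rough sketch of the csv-module route, the columns can be picked by header name with csv.DictReader instead of by position. The function name `extract_columns` and its parameters are hypothetical, and the tab delimiter is an assumption based on the sample file in the question.

```python
import csv
import glob
import os


def extract_columns(folder, columns, min_size=62, delimiter="\t"):
    """Return a list of tuples holding `columns` from each CSV larger than `min_size` bytes."""
    rows = []
    for infile in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        if os.path.getsize(infile) <= min_size:
            continue  # skip files at or below the size threshold
        # the csv docs recommend opening files with newline=""
        with open(infile, newline="") as source:
            for record in csv.DictReader(source, delimiter=delimiter):
                rows.append(tuple(record[name] for name in columns))
    return rows
```

The returned list can be passed to `csv.writer(...).writerows(rows)` to produce a CSV that Excel can open. The csv module does not write .xlsx workbooks itself, so a true Excel file still needs a third-party library.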
