
Printing columns from a CSV file into an Excel file with Python

I am trying to write a script that reads all CSV files larger than 62 bytes, prints two columns into a separate Excel file, and creates a list.

The following is one of the CSV files:

FileUUID        Table   RowInJSON       JSONVariable    Error   Notes   SQLExecuted
ff3ca629-2e9c-45f7-85f1-a3dfc637dd81    lng02_rpt_b_calvedets   1               Duplicate entry 'ETH0007805440544' for key 'nosameanimalid'             INSERT INTO lng02_rpt_b_calvedets(farmermobile,hh_id,rpt_b_calvedets_rowid,damidyesno,damid,calfdam_id,damtagid,calvdatealv,calvtype,calvtypeoth,easecalv,easecalvoth,birthtyp,sex,siretype,aiprov,othaiprov,strawidyesno,strawid)  VALUES ('0974502779','1','1','0','ETH0007805440544','ETH0007805470547',NULL,'2017-09-16','1',NULL,'1',NULL,'1','2','1',NULL,NULL,NULL,NULL,NULL,'0',NULL,NULL,NULL,NULL,NULL,NULL,'0',NULL,'Tv',NULL,NULL,'Et','23',NULL,'5',NULL,NULL,NULL,'0','0')

This is my attempt at solving this problem:

import csv
import glob
import os

path = 'csvs/'
for infile in glob.glob(os.path.join(path, '*.csv')):
    output = infile + '.out'
    with open(infile, 'r', newline='') as source:
        readr = csv.reader(source)
        with open(output, 'w', newline='') as result:
            writr = csv.writer(result)
            for r in readr:
                # Write the fifth and third fields of each row
                writr.writerow((r[4], r[2]))

Please point me in the right direction, or suggest an alternative solution.

pandas does a lot of what you are trying to achieve:

import pandas as pd

# Read a csv file to a dataframe
df = pd.read_csv("<path-to-csv>")

# Filter two columns
columns = ["FileUUID", "Table"]
df = df[columns]

# Combine multiple dataframes
df_combined = pd.concat([df1, df2, df3, ...])

# Output dataframe to excel file
df_combined.to_excel("<output-path>", index=False)

To loop through all CSV files larger than 62 bytes, you can use glob.glob() and os.stat():

import glob
import os

import pandas as pd

dataframes = []

for csvfile in glob.glob("<csv-folder-path>/*.csv"):
    # st_size is the file size in bytes
    if os.stat(csvfile).st_size > 62:
        dataframes.append(pd.read_csv(csvfile))
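
Putting the pieces above together, here is a minimal sketch rather than a definitive solution. It assumes the files are tab-delimited (the sample in the question looks that way) and that the two columns you want are Error and Table. The helper name collect_columns is made up for illustration.

```python
import glob
import os

import pandas as pd

MIN_SIZE_BYTES = 62  # threshold from the question; smaller files are skipped


def collect_columns(folder, columns, sep="\t"):
    """Read every CSV in `folder` larger than MIN_SIZE_BYTES and keep `columns`.

    Hypothetical helper; `sep` defaults to a tab, which is an assumption.
    """
    frames = []
    for csvfile in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        if os.stat(csvfile).st_size > MIN_SIZE_BYTES:
            # usecols loads only the requested columns; the trailing
            # [columns] puts them in the requested order
            frames.append(pd.read_csv(csvfile, sep=sep, usecols=columns)[columns])
    if not frames:
        # pd.concat raises ValueError on an empty list
        return pd.DataFrame(columns=columns)
    return pd.concat(frames, ignore_index=True)
```

Something like collect_columns("csvs", ["Error", "Table"]).to_excel("<output-path>", index=False) would then write the sheet, and .values.tolist() on the same dataframe gives you the rows as a plain list. Writing .xlsx needs an Excel engine such as openpyxl to be installed.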

Use the standard csv module. Don't reinvent the wheel.

https://docs.python.org/3/library/csv.html
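
If you would rather stay with the standard library, a rough sketch could look like this. The function names read_error_rows and write_rows are hypothetical. The tab delimiter and the choice of the Error and Table columns are assumptions based on the sample in the question. The output is a plain CSV, which Excel can open, because the csv module does not write .xlsx files.

```python
import csv
import glob
import os


def read_error_rows(folder, min_size=62, delimiter="\t"):
    """Return [Error, Table] pairs from every CSV in `folder` larger than `min_size` bytes."""
    rows = []
    for path in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        if os.path.getsize(path) <= min_size:
            continue  # too small to hold any data rows
        with open(path, newline="") as source:
            # DictReader uses the header row as keys, so columns are picked by name
            for record in csv.DictReader(source, delimiter=delimiter):
                rows.append([record["Error"], record["Table"]])
    return rows


def write_rows(rows, out_path):
    """Write the collected pairs, preceded by a header row, to one CSV file."""
    with open(out_path, "w", newline="") as result:
        writer = csv.writer(result)
        writer.writerow(["Error", "Table"])
        writer.writerows(rows)
```

read_error_rows("csvs") gives you the list, and write_rows(that_list, "<output-path>") saves it.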
