Pandas adding header to the output file after merging multiple CSV files
import pandas as pd
import os

file1 = 'https://public.fyers.in/sym_details/NSE_CM.csv'
file2 = 'https://public.fyers.in/sym_details/NSE_FO.csv'
file3 = 'https://public.fyers.in/sym_details/BSE_CM.csv'
CHUNK_SIZE = 10 ** 6
csv_file_list = [file1, file2, file3]
output_file = "/content/output.csv"

for csv_file_name in csv_file_list:
    skipRows = [2022,92805]
    chunk_container = pd.read_csv(csv_file_name, chunksize=CHUNK_SIZE, skiprows=skipRows)
    for chunk in chunk_container:
        headerList =["fytoken", "symbol", "instrumentType","lotSize","tickSize","ISIN","tradingSession","lastUpdate","expiryDate","symbolTicker","exchange","segment","scripCode","scripName","scripToken","strikePrice","optionType"]
        chunk.to_csv(output_file,header=headerList, mode="a", index=False)
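I want to merge three CSV files and add a header to the output file. However, the output file ends up with the header repeated at the start of each CSV's data.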
You are reading the files in chunks and writing the header again for every chunk you append.
Instead, try the following:
import pandas as pd

file1 = 'https://public.fyers.in/sym_details/NSE_CM.csv'
file2 = 'https://public.fyers.in/sym_details/NSE_FO.csv'
file3 = 'https://public.fyers.in/sym_details/BSE_CM.csv'
CHUNK_SIZE = 10 ** 6
csv_file_list = [file1, file2, file3]
output_file = "./content/output.csv"
headerList = ["fytoken", "symbol", "instrumentType", "lotSize", "tickSize", "ISIN", "tradingSession",
              "lastUpdate", "expiryDate", "symbolTicker", "exchange", "segment", "scripCode", "scripName",
              "scripToken", "strikePrice", "optionType"]

# Write a file that contains only the header row.
df = pd.DataFrame(columns=headerList)
df.to_csv(output_file, index=False)

for csv_file_name in csv_file_list:
    skipRows = [2022, 92805]
    with pd.read_csv(csv_file_name, chunksize=CHUNK_SIZE, skiprows=skipRows) as chunk_container:
        for chunk in chunk_container:
            # Append the data rows only; header=False keeps the header from being repeated.
            chunk.to_csv(output_file, header=False, mode="a", index=False)
Here we first create a CSV file that contains only the headers, and then append the data read from the URLs above to that same file.
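As a variation on the same idea, the header can also be written together with the very first chunk instead of in a separate step. The sketch below is only an illustration: `merge_csv_files` is a hypothetical helper name, the two small temporary files stand in for the real URLs, and it assumes the inputs have no header row of their own (hence `header=None` with `names=`). Using the chunked reader as a context manager needs pandas 1.2 or later.

```python
import os
import tempfile

import pandas as pd


def merge_csv_files(csv_paths, output_file, header_list, chunk_size=10 ** 6):
    """Concatenate CSV files chunk by chunk, writing the header exactly once.

    Hypothetical helper; assumes the input files contain data rows only.
    """
    first_chunk = True
    for csv_path in csv_paths:
        # header=None treats every row as data; names= assigns the column labels.
        with pd.read_csv(csv_path, header=None, names=header_list,
                         chunksize=chunk_size) as reader:
            for chunk in reader:
                chunk.to_csv(
                    output_file,
                    mode="w" if first_chunk else "a",  # create the file once, then append
                    header=first_chunk,                # header only with the first chunk
                    index=False,
                )
                first_chunk = False


# Small local files used in place of the remote CSVs.
tmp_dir = tempfile.mkdtemp()
part_a = os.path.join(tmp_dir, "a.csv")
part_b = os.path.join(tmp_dir, "b.csv")
with open(part_a, "w") as f:
    f.write("1,AAA\n2,BBB\n3,CCC\n")
with open(part_b, "w") as f:
    f.write("4,DDD\n")

merged_path = os.path.join(tmp_dir, "merged.csv")
# chunk_size=2 forces several chunks per file, which is the case that caused repeated headers.
merge_csv_files([part_a, part_b], merged_path, ["fytoken", "symbol"], chunk_size=2)
merged = pd.read_csv(merged_path)
print(merged)
```

Because the first write uses mode "w", rerunning the script replaces the old output rather than appending a second copy of the data to it.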