Pandas adding header to the output file after merging multiple CSV files

import pandas as pd
import os

file1 = 'https://public.fyers.in/sym_details/NSE_CM.csv'
file2 = 'https://public.fyers.in/sym_details/NSE_FO.csv'
file3 = 'https://public.fyers.in/sym_details/BSE_CM.csv'
CHUNK_SIZE = 10 ** 6
csv_file_list = [file1, file2, file3]
output_file = "/content/output.csv"

for csv_file_name in csv_file_list:
  skipRows = [2022,92805]
  chunk_container = pd.read_csv(csv_file_name, chunksize=CHUNK_SIZE, skiprows=skipRows)
  for chunk in chunk_container:
    headerList =["fytoken", "symbol", "instrumentType","lotSize","tickSize","ISIN","tradingSession","lastUpdate","expiryDate","symbolTicker","exchange","segment","scripCode","scripName","scripToken","strikePrice","optionType"]
    chunk.to_csv(output_file,header=headerList, mode="a", index=False)

I want to merge the three CSV files and add a header to the output file. Instead, the output file ends up with the header repeated at the start of each CSV's data.

You are reading the content in chunks and writing the header again for every chunk you append.

Instead, try the following:

import pandas as pd

file1 = 'https://public.fyers.in/sym_details/NSE_CM.csv'
file2 = 'https://public.fyers.in/sym_details/NSE_FO.csv'
file3 = 'https://public.fyers.in/sym_details/BSE_CM.csv'
CHUNK_SIZE = 10 ** 6
csv_file_list = [file1, file2, file3]
# The ./content directory must already exist; to_csv does not create it.
output_file = "./content/output.csv"

headerList = ["fytoken", "symbol", "instrumentType", "lotSize", "tickSize", "ISIN", "tradingSession",
              "lastUpdate", "expiryDate", "symbolTicker", "exchange", "segment", "scripCode", "scripName",
              "scripToken", "strikePrice", "optionType"]

# Write the header row exactly once by saving an empty DataFrame (this overwrites any old output).
df = pd.DataFrame(columns=headerList)
df.to_csv(output_file, index=False)

for csv_file_name in csv_file_list:
    skipRows = [2022, 92805]
    with pd.read_csv(csv_file_name, chunksize=CHUNK_SIZE, skiprows=skipRows) as chunk_container:
        for chunk in chunk_container:
            # header=False: append the data rows only, never another header line.
            chunk.to_csv(output_file, header=False, mode="a", index=False)

Here we create a CSV file containing only the header row beforehand, then append the data read from the URLs above to that same file without writing a header again.
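The same idea can be tried offline as a small function. This is only a sketch: `merge_csvs`, the temporary file names, and the two-column sample data are made up for illustration. It also assumes the input files have no header row of their own, which is why it passes `header=None` and `names=` to `read_csv`; without that, pandas takes the first line of each file as column names and that line never reaches the output. If your inputs do carry a header row, drop those two arguments.

```python
import os
import tempfile

import pandas as pd


def merge_csvs(input_paths, output_path, header_list, chunk_size=10 ** 6):
    """Append several CSV files to output_path, writing header_list only once."""
    # Create (or overwrite) the output with just the header row.
    pd.DataFrame(columns=header_list).to_csv(output_path, index=False)

    for path in input_paths:
        # Assumption: inputs are headerless, so every line is treated as data.
        with pd.read_csv(path, header=None, names=header_list,
                         chunksize=chunk_size) as reader:
            for chunk in reader:
                # Append data rows only; the header is already in the file.
                chunk.to_csv(output_path, mode="a", header=False, index=False)


# Small local demo with two hypothetical headerless files.
tmp_dir = tempfile.mkdtemp()
paths = []
for i, rows in enumerate([["1,a", "2,b"], ["3,c"]]):
    part_path = os.path.join(tmp_dir, f"part{i}.csv")
    with open(part_path, "w") as f:
        f.write("\n".join(rows) + "\n")
    paths.append(part_path)

out = os.path.join(tmp_dir, "merged.csv")
# chunk_size=1 forces several chunks per file to show the header is not repeated.
merge_csvs(paths, out, ["id", "name"], chunk_size=1)
merged = pd.read_csv(out)
print(merged)
```

Using a chunk size of 1 in the demo is deliberate: even with one chunk per row, the merged file should contain a single header line followed by the three data rows.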


 