How to merge many CSV files?

Question

I have about 7500 csv that needs to be merged into a single one in order to create an easy-readable table. The files format are as follow:

each file is denominated as a ticker of a stock (example: AA.csv, AAL.csv, AAPL.csv,etc...)
every file itself contains a date and a number in this format
```
2018-10-11,1 2018-10-12,3 2018-10-15,2...
```

Now I want to merge them into a single csv file where the resulting table has in the header the name of the tickers, in the first column the dates and following the numbers (obviously keeping the csv format).

Example:

Note that some csv files are empty and some have differents starting dates or dates are just discontinued

Answer 1

You could do something like this:

import pandas as pd
import numpy as np
from glob import glob

dfs_list = []
for csv_file in glob('Tickers List/*.csv'):
    stock_ticker = csv_file.split(".")[0]
    df = pd.read_csv(csv_file,header=None, names=["date","num"])
    if df.shape[0] >0:
        df["date"] = pd.to_datetime(df["date"],format="%Y-%m-%d")
        df["stock_ticker"] = stock_ticker
        dfs_list.append(df)

final_df = pd.concat(dfs_list)

With glob('dir/*.csv') we get all the csv files on a folder.

After this, you will have a DataFrame that looks like this:

If you want to change to your format, you can do this: (note that the dates are automatically sorted because is used as index).

final_df = pd.pivot_table(final_df, values='num', index=['date'],
               columns=['stock_ticker'], fill_value=np.nan)

And you will have a DataFrame that looks like this:

Now you can write this DataFrame to a new csv with:

final_df.to_csv("merged.csv")

FULL CODE

import pandas as pd
import numpy as np
from glob import glob

dfs_list = []
for csv_file in glob('Tickers List/*.csv'):
    stock_ticker = csv_file.split(".")[0]
    df = pd.read_csv(csv_file,header=None, names=["date","num"])
    if df.shape[0] >0:
        df["date"] = pd.to_datetime(df["date"],format="%Y-%m-%d")
        df["stock_ticker"] = stock_ticker
        dfs_list.append(df)

final_df = pd.concat(dfs_list)

final_df = pd.pivot_table(final_df, values='num', index=['date'],
               columns=['stock_ticker'], fill_value=np.nan)

final_df.to_csv("merged.csv")

Answer 2

Since you have non-uniform csvs, and some specific requirements:

certain format to output the csv
some csv files are empty
some csv files have different start dates
some csv files have discontinues dates

I'd do something like this with the `csv` module:

import os,csv
csv_dir = "C:\\Users\Diego\\PycharmProjects\\thesis\\Tickers List\\"
csv_out = 'csvout.csv'

all_dates = []
fin_table = dict()

# iterating over all csv files
for csv_file in os.listdir(csv_dir):
    if True:

        csv_name = csv_file #[:-4] # excluding the ".csv" extension 
        table = dict()
        
        print(csv_dir+csv_file)
        with open(csv_dir+csv_file, 'r') as r:
            csvReader = csv.reader(r)

            for i, row in enumerate(csvReader):
                all_dates.append(row[0])
                table[row[0]] = row[1]
        fin_table[csv_name] = table
    

# populating the final csv 
with open(csv_out, 'w+') as w:
    csvWriter = csv.writer(w)

    # adding the header
    input_row = ['Dates']
    for stock, table in iter(sorted(fin_table.items())):
        input_row.append(stock)
    csvWriter.writerow(input_row)

    # Adding the rest of the table by available date 
    # since, some dates may not have a value, as you said   
    for date in sorted(all_dates): 
        
        # adding the date rows and stock values 1 by 1
        input_row = [date]
        for stock, table in iter(sorted(fin_table.items())):
            # NOVALUE will be put in cells  
            # where a value doesn't exist for a date
            input_row.append(table.get(date, 'NOVALUE')) 
        csvWriter.writerow(input_row)

# ************************** end of program code

How to merge many CSV files?

Question

2 answers

solution1
1 ACCPTED 2021-02-14 15:10:54

FULL CODE

solution2
0

I'd do something like this with the `csv` module:

How to merge many CSV files?

Question

2 answers

solution1 1 ACCPTED 2021-02-14 15:10:54

FULL CODE

solution2 0

I'd do something like this with the csv module:

solution1
1 ACCPTED 2021-02-14 15:10:54

solution2
0

I'd do something like this with the `csv` module: