I first create a list with some values:
list = ['SBSP3.SA', 'CSMG3.SA', 'CGAS5.SA']
I create an empty dictionary because that's the only way I have found to read the several .csv files I want as dataframes. Then I loop over the tickers to store each .csv file in the dictionary:
d = {}
d = {ticker: pd.read_csv('{}.csv'.format(ticker)) for ticker in list}
After that, I can only access each dataframe through its dictionary key:
d['SBSP3.SA'].head(5)
Date High Low Open Close Volume Adj Close
0 2017-01-02 14.70 14.60 14.64 14.66 7525700.0 13.880955
1 2017-01-03 15.65 14.95 14.95 15.50 39947800.0 14.676315
2 2017-01-04 15.68 15.31 15.45 15.50 37071700.0 14.676315
3 2017-01-05 15.91 15.62 15.70 15.75 47586300.0 14.913031
4 2017-01-06 15.92 15.50 15.78 15.66 25592000.0 14.827814
I can't, for example, do:
df = pd.DataFrame(d)
My question is:
Can I merge all the dataframes I put in the dictionary (d) along axis=1 so that I can view them as one?
After a lot of head-scratching I managed to put all the dataframes together, but I lost their keys and could not tell which columns belong to which ticker, since the column names are the same.
Can I include the keys in the column names?
Example:
Date High_SBSP3.SA Low_SBSP3.SA Open_SBSP3.SA Close_SBSP3.SA Volume_SBSP3.SA Adj Close_SBSP3.SA
0 2017-01-02 14.70 14.60 14.64 14.66 7525700.0 13.880955
1 2017-01-03 15.65 14.95 14.95 15.50 39947800.0 14.676315
2 2017-01-04 15.68 15.31 15.45 15.50 37071700.0 14.676315
3 2017-01-05 15.91 15.62 15.70 15.75 47586300.0 14.913031
4 2017-01-06 15.92 15.50 15.78 15.66 25592000.0 14.827814
Don't use list as a variable name; it shadows the built-in list.
You don't need a dictionary, a simple list is enough to store all your dataframes.
Call pd.concat on this list. It concatenates the dataframes one below the other, as long as they have the same column names.
ticker_list = ['SBSP3.SA', 'CSMG3.SA', 'CGAS5.SA']
pd_list = [pd.read_csv('{}.csv'.format(ticker)) for ticker in ticker_list]
df = pd.concat(pd_list)
Use df = pd.concat(pd_list, ignore_index=True) if you want to reset the index when concatenating.
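To make the difference between the two calls concrete, here is a minimal sketch using small in-memory frames instead of the CSV files. The `frames` dictionary, the CSMG3.SA prices, and the extra `Ticker` column are assumptions made for illustration; they are not part of the answer above.

```python
import pandas as pd

# Hypothetical in-memory frames standing in for the CSV files.
# The CSMG3.SA prices are made up for the example.
frames = {
    "SBSP3.SA": pd.DataFrame({"Date": ["2017-01-02", "2017-01-03"], "Close": [14.66, 15.50]}),
    "CSMG3.SA": pd.DataFrame({"Date": ["2017-01-02", "2017-01-03"], "Close": [10.0, 11.0]}),
}

# Plain concat stacks the rows and keeps each frame's own index (0, 1, 0, 1).
stacked = pd.concat(list(frames.values()))

# ignore_index=True renumbers the stacked rows from 0.
renumbered = pd.concat(list(frames.values()), ignore_index=True)

# Assumption: tag each row with its ticker so rows stay distinguishable after stacking.
tagged = pd.concat(
    [df.assign(Ticker=ticker) for ticker, df in frames.items()],
    ignore_index=True,
)
print(tagged)
```

Note that this stacks the frames vertically, so it keeps one set of columns rather than producing the side-by-side layout asked for in the question.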
pd.merge will do what you want (including renaming columns), but it only merges two frames at a time, so the column names will not be consistent when the merge is repeated. You therefore need to rename the columns yourself before merging.
import pandas as pd
from functools import reduce

ticker_list = ['SBSP3.SA', 'CSMG3.SA', 'CGAS5.SA']
pd_list = [pd.read_csv('{}.csv'.format(ticker)) for ticker in ticker_list]

# Append the ticker to every column name except the first one (the date column).
for idx, df in enumerate(pd_list):
    old_names = df.columns[1:]
    new_names = list(map(lambda x : x + '_' + ticker_list[idx] , old_names))
    zipped = dict(zip(old_names, new_names))
    df.rename(zipped, axis=1, inplace=True)

# Merge two frames on the shared date column ("date" in my data, "Date" in the question's files).
def dfmerge(x, y):
    return pd.merge(x, y, on="date")

df = reduce(dfmerge, pd_list)
print(df)
Output (with my data):
date High_SBSP3.SA Low_SBSP3.SA Open_SBSP3.SA High_CSMG3.SA Low_CSMG3.SA Open_CSMG3.SA High_CGAS5.SA Low_CGAS5.SA Open_CGAS5.SA
0 2017-01-02 1 2 3 1 2 3 1 2 3
1 2017-01-03 4 5 6 4 5 6 4 5 6
2 2017-01-04 7 8 9 7 8 9 7 8 9
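For comparison, the same side-by-side layout can probably be reached without repeated merges. The sketch below is an alternative, not the method used in this answer: it assumes the date column is called `Date` as in the question, uses `DataFrame.add_suffix` to rename the columns, and aligns the frames on the date with `pd.concat(..., axis=1)`. The `frames` dictionary and the CSMG3.SA values are made up for illustration.

```python
import pandas as pd

# Hypothetical in-memory frames standing in for the pd.read_csv(...) results.
# The CSMG3.SA values are made up for the example.
frames = {
    "SBSP3.SA": pd.DataFrame(
        {"Date": ["2017-01-02", "2017-01-03"], "High": [14.70, 15.65], "Low": [14.60, 14.95]}
    ),
    "CSMG3.SA": pd.DataFrame(
        {"Date": ["2017-01-02", "2017-01-03"], "High": [1.0, 2.0], "Low": [0.5, 1.5]}
    ),
}

# Move Date into the index, suffix the remaining columns with the ticker,
# then place the frames side by side, aligned on the date index.
wide = pd.concat(
    [df.set_index("Date").add_suffix("_" + ticker) for ticker, df in frames.items()],
    axis=1,
)
print(wide)
```

Here the date ends up as the index rather than as a regular column, and dates missing from one file would show up as NaN in that ticker's columns, whereas the default inner merge above would drop those rows.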
Note: you may need to edit or delete your comment, since I chose to overwrite my previous answer instead of adding a new one.