
Creating dataframe from dict

First, I start by creating a list with some values:

list = ['SBSP3.SA', 'CSMG3.SA', 'CGAS5.SA']

I create an empty dictionary because that's the only way I found to read the several .csv files I want as dataframes. I then fill it with one dataframe per ticker, keyed by the ticker name:

d = {}

d = {ticker: pd.read_csv('{}.csv'.format(ticker)) for ticker in list}

After that, I can only access each dataframe by indexing the dictionary with its key:

d['SBSP3.SA'].head(5)

          Date   High     Low    Open   Close      Volume   Adj Close
0   2017-01-02  14.70   14.60   14.64   14.66    7525700.0  13.880955
1   2017-01-03  15.65   14.95   14.95   15.50   39947800.0  14.676315
2   2017-01-04  15.68   15.31   15.45   15.50   37071700.0  14.676315
3   2017-01-05  15.91   15.62   15.70   15.75   47586300.0  14.913031
4   2017-01-06  15.92   15.50   15.78   15.66   25592000.0  14.827814

I can't, for example, do:

df = pd.DataFrame(d)

My question is:

Can I merge all the dataframes I stored in the dictionary (d) along axis=1 so that I can view them as one?

After a lot of head-scratching I managed to put all the dataframes together, but I lost their keys and could no longer tell which columns belong to which ticker, since the column names are the same in every file.

Can I include these keys in the column names?

Example:

          Date    High_SBSP3.SA   Low_SBSP3.SA   Open_SBSP3.SA  Close_SBSP3.SA      Volume_SBSP3.SA   Adj Close_SBSP3.SA
0   2017-01-02            14.70          14.60           14.64           14.66            7525700.0          13.880955
1   2017-01-03            15.65          14.95           14.95           15.50           39947800.0          14.676315
2   2017-01-04            15.68          15.31           15.45           15.50           37071700.0          14.676315
3   2017-01-05            15.91          15.62           15.70           15.75           47586300.0          14.913031
4   2017-01-06            15.92          15.50           15.78           15.66           25592000.0          14.827814

Don't use list as a variable name; it shadows the built-in list.
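A minimal sketch of what the shadowing looks like in practice. The function name `shadowed` is hypothetical and only serves to keep the rebinding local:

```python
def shadowed():
    # Inside this function, `list` now names a plain list object,
    # so the built-in constructor is hidden.
    list = ['SBSP3.SA', 'CSMG3.SA', 'CGAS5.SA']
    try:
        return list(range(3))  # tries to call the list object
    except TypeError as exc:
        return str(exc)

message = shadowed()
print(message)

# Outside the function the built-in is untouched.
print(list(range(3)))
```

Any later code that relies on `list(...)`, such as `list(map(...))`, fails the same way once the name has been rebound.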

You don't need a dictionary, a simple list is enough to store all your dataframes.

Call pd.concat on this list. It stacks the dataframes one below the other, matching columns by name (a column missing from one of the frames is filled with NaN there).

ticker_list = ['SBSP3.SA', 'CSMG3.SA', 'CGAS5.SA']
pd_list = [pd.read_csv('{}.csv'.format(ticker)) for ticker in ticker_list]
df = pd.concat(pd_list)

Use df = pd.concat(pd_list, ignore_index=True) if you want to reset the indices when concatenating.
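Stacking the frames this way drops the information about which ticker each row came from. One possible way to keep it, sketched here with small in-memory frames instead of the CSV files, is to pass the dictionary itself to pd.concat so its keys become an index level. The names `frames`, `stacked`, `Ticker` and `row` are assumptions made for illustration, and the CSMG3.SA values are made up:

```python
import pandas as pd

# Stand-ins for the frames read from CSV (CSMG3.SA values are invented).
frames = {
    'SBSP3.SA': pd.DataFrame({'Date': ['2017-01-02', '2017-01-03'],
                              'Close': [14.66, 15.50]}),
    'CSMG3.SA': pd.DataFrame({'Date': ['2017-01-02', '2017-01-03'],
                              'Close': [40.10, 41.25]}),
}

# Passing a dict uses its keys as the outer index level;
# `names` labels the two resulting index levels.
stacked = pd.concat(frames, names=['Ticker', 'row'])

# Turn the ticker level into an ordinary column and discard the old row numbers.
stacked = stacked.reset_index(level='Ticker').reset_index(drop=True)
print(stacked)
```

The result is a long table with one `Ticker` column, which can be filtered or grouped by ticker instead of relying on column names.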

pd.merge will do what you want, including renaming clashing columns by appending suffixes, but it merges only two frames at a time, so the automatically generated column names are not consistent when the merge is repeated. You therefore need to rename the columns yourself before merging.

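import pandas as pd
from functools import reduce

ticker_list = ['SBSP3.SA', 'CSMG3.SA', 'CGAS5.SA']
pd_list = [pd.read_csv('{}.csv'.format(ticker)) for ticker in ticker_list]

# Append the ticker to every column except the first one (the date),
# which stays unchanged because it is the merge key.
for ticker, df in zip(ticker_list, pd_list):
    old_names = df.columns[1:]
    new_names = [name + '_' + ticker for name in old_names]
    df.rename(columns=dict(zip(old_names, new_names)), inplace=True)

def dfmerge(x, y):
    # The key must match the CSV header exactly;
    # use on="Date" for the data shown in the question.
    return pd.merge(x, y, on="date")

# Merge the frames pairwise, left to right, into one wide frame.
df = reduce(dfmerge, pd_list)
print(df)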

Output (with my data):

         date  High_SBSP3.SA  Low_SBSP3.SA  Open_SBSP3.SA  High_CSMG3.SA  Low_CSMG3.SA  Open_CSMG3.SA  High_CGAS5.SA  Low_CGAS5.SA  Open_CGAS5.SA
0  2017-01-02              1             2              3              1             2              3              1             2              3
1  2017-01-03              4             5              6              4             5              6              4             5              6
2  2017-01-04              7             8              9              7             8              9              7             8              9
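As an alternative to repeated merging, the side-by-side layout the question asks for can also be sketched with pd.concat and axis=1. This is not the approach used in the answer above, only a variant under the assumption that every frame has a `Date` column. The names `frames`, `renamed` and `wide` and the sample values are made up for illustration:

```python
import pandas as pd

dates = ['2017-01-02', '2017-01-03']

# Stand-ins for the frames read from CSV (values are invented).
frames = {
    'SBSP3.SA': pd.DataFrame({'Date': dates, 'High': [1, 4], 'Low': [2, 5]}),
    'CSMG3.SA': pd.DataFrame({'Date': dates, 'High': [7, 10], 'Low': [8, 11]}),
}

# Index each frame by date and tag its remaining columns with the ticker.
renamed = [
    frame.set_index('Date').add_suffix('_' + ticker)
    for ticker, frame in frames.items()
]

# axis=1 aligns rows on the shared Date index and places the frames side by side.
wide = pd.concat(renamed, axis=1).reset_index()
print(wide)
```

Note that pd.concat keeps every date found in any frame by default (an outer join), whereas pd.merge defaults to an inner join; passing `join='inner'` to pd.concat keeps only the dates common to all frames.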

Hint: you may need to edit or delete your comment, since I chose to overwrite my previous answer instead of adding a new one.
