Manipulating Pandas Dataframe with MultiIndex

I have a pandas DataFrame formatted as such:

  mesh 1          energy low [eV] energy high [eV] nuclide score     mean  
           x    y   z                                                           
0          1    1   1        1.00e-03         2.00e+07   total  flux 0.00e+00   
1          1    1   2        1.00e-03         2.00e+07   total  flux 1.82e-03   
2          1    1   3        1.00e-03         2.00e+07   total  flux 6.96e-03   
3          1    1   4        1.00e-03         2.00e+07   total  flux 1.47e-03   
4          1    1   5        1.00e-03         2.00e+07   total  flux 6.93e-03   
5          1    1   6        1.00e-03         2.00e+07   total  flux 8.73e-03   
6          1    1   7        1.00e-03         2.00e+07   total  flux 1.34e-02   
7          1    1   8        1.00e-03         2.00e+07   total  flux 1.16e-02   
8          1    1   9        1.00e-03         2.00e+07   total  flux 4.14e-03   
9          1    1  10        1.00e-03         2.00e+07   total  flux 5.26e-03   
10         1    2   1        1.00e-03         2.00e+07   total  flux 6.16e-03   
11         1    2   2        1.00e-03         2.00e+07   total  flux 1.76e-02   
12         1    2   3        1.00e-03         2.00e+07   total  flux 1.80e-02   
13         1    2   4        1.00e-03         2.00e+07   total  flux 1.97e-02   
14         1    2   5        1.00e-03         2.00e+07   total  flux 1.76e-02   
15         1    2   6        1.00e-03         2.00e+07   total  flux 1.90e-02   
16         1    2   7        1.00e-03         2.00e+07   total  flux 3.53e-02   
17         1    2   8        1.00e-03         2.00e+07   total  flux 0.00e+00   
18         1    2   9        1.00e-03         2.00e+07   total  flux 0.00e+00   
19         1    2  10        1.00e-03         2.00e+07   total  flux 0.00e+00   
20         1    3   1        1.00e-03         2.00e+07   total  flux 0.00e+00   
21         1    3   2        1.00e-03         2.00e+07   total  flux 0.00e+00   
22         1    3   3        1.00e-03         2.00e+07   total  flux 0.00e+00   
23         1    3   4        1.00e-03         2.00e+07   total  flux 0.00e+00   
24         1    3   5        1.00e-03         2.00e+07   total  flux 0.00e+00   
25         1    3   6        1.00e-03         2.00e+07   total  flux 0.00e+00   
26         1    3   7        1.00e-03         2.00e+07   total  flux 0.00e+00   
27         1    3   8        1.00e-03         2.00e+07   total  flux 0.00e+00   
28         1    3   9        1.00e-03         2.00e+07   total  flux 0.00e+00   
29         1    3  10        1.00e-03         2.00e+07   total  flux 0.00e+00   
...      ...  ...  ..             ...              ...     ...   ...      ...   
99970    100   98   1        1.00e-03         2.00e+07   total  flux 0.00e+00   
99971    100   98   2        1.00e-03         2.00e+07   total  flux 0.00e+00   
99972    100   98   3        1.00e-03         2.00e+07   total  flux 0.00e+00   
99973    100   98   4        1.00e-03         2.00e+07   total  flux 0.00e+00   
99974    100   98   5        1.00e-03         2.00e+07   total  flux 0.00e+00   
99975    100   98   6        1.00e-03         2.00e+07   total  flux 0.00e+00   
99976    100   98   7        1.00e-03         2.00e+07   total  flux 0.00e+00   
99977    100   98   8        1.00e-03         2.00e+07   total  flux 0.00e+00   
99978    100   98   9        1.00e-03         2.00e+07   total  flux 0.00e+00   
99979    100   98  10        1.00e-03         2.00e+07   total  flux 0.00e+00   
99980    100   99   1        1.00e-03         2.00e+07   total  flux 0.00e+00   
99981    100   99   2        1.00e-03         2.00e+07   total  flux 0.00e+00   
99982    100   99   3        1.00e-03         2.00e+07   total  flux 0.00e+00   
99983    100   99   4        1.00e-03         2.00e+07   total  flux 0.00e+00   
99984    100   99   5        1.00e-03         2.00e+07   total  flux 0.00e+00   
99985    100   99   6        1.00e-03         2.00e+07   total  flux 0.00e+00   
99986    100   99   7        1.00e-03         2.00e+07   total  flux 0.00e+00   
99987    100   99   8        1.00e-03         2.00e+07   total  flux 0.00e+00   
99988    100   99   9        1.00e-03         2.00e+07   total  flux 0.00e+00   
99989    100   99  10        1.00e-03         2.00e+07   total  flux 0.00e+00   
99990    100  100   1        1.00e-03         2.00e+07   total  flux 0.00e+00   
99991    100  100   2        1.00e-03         2.00e+07   total  flux 0.00e+00   
99992    100  100   3        1.00e-03         2.00e+07   total  flux 0.00e+00   
99993    100  100   4        1.00e-03         2.00e+07   total  flux 0.00e+00   
99994    100  100   5        1.00e-03         2.00e+07   total  flux 0.00e+00   
99995    100  100   6        1.00e-03         2.00e+07   total  flux 0.00e+00   
99996    100  100   7        1.00e-03         2.00e+07   total  flux 0.00e+00   
99997    100  100   8        1.00e-03         2.00e+07   total  flux 0.00e+00   
99998    100  100   9        1.00e-03         2.00e+07   total  flux 0.00e+00   
99999    100  100  10        1.00e-03         2.00e+07   total  flux 0.00e+00   

RangeIndex(start=0, stop=100000, step=1)
MultiIndex(levels=[['energy high [eV]', 'energy low [eV]', 'mean', 'mesh 1', 'nuclide', 'score', 'std. dev.'], ['', 'x', 'y', 'z']],
           labels=[[3, 3, 3, 1, 0, 4, 5, 2, 6], [1, 2, 3, 0, 0, 0, 0, 0, 0]])

I would like a list of 10 pandas DataFrames, one per value of ('mesh 1', 'z'), which runs from 1 to 10. In each DataFrame the rows should be ('mesh 1', 'y'), the columns ('mesh 1', 'x'), and the values 'mean'. I have figured out how to get the 10 DataFrames into a list:

axial_dfs = []
for i in range(1, 11):  # 'z' runs from 1 to 10, so range(10) would miss z == 10
    temp_df = flux_df[flux_df['mesh 1']['z'] == i]
    axial_dfs.append(temp_df)

But I can't figure out how to change the rows and columns. I would try pivot, but I don't know how to use it with the MultiIndex for 'mesh 1'.

Appreciate all the help! Thanks!

I'm a little confused about what you need, but I think merging the column levels together in your temp_df will help you:

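axial_dfs = []
for i in range(1, 11):  # 'z' runs from 1 to 10
    temp_df = flux_df[flux_df['mesh 1']['z'] == i]
    # Flatten the two column levels, e.g. ('mesh 1', 'x') -> 'mesh 1_x';
    # rstrip drops the trailing '_' left where the second level is empty
    temp_df.columns = temp_df.columns.map('_'.join).str.rstrip('_')  # add this line
    axial_dfs.append(temp_df)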

Now every frame in axial_dfs has a single level of columns (e.g. mesh 1_x or mesh 1_y), which it sounds like you're comfortable manipulating on your own (using pandas.DataFrame.pivot_table or pandas.DataFrame.groupby).
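For completeness, here is a minimal sketch of how the pivot step might look once the columns are flattened. The small flux_df built here is a hypothetical stand-in (a 3 x 4 x 2 mesh with made-up 'mean' values), not the real data, and the name flat is my own; only the column labels are taken from the question.

```python
import itertools

import pandas as pd

# Hypothetical stand-in for flux_df: a 3 x 4 x 2 mesh instead of 100 x 100 x 10.
# 'mean' is just a running counter so the result is easy to check by eye.
nx, ny, nz = 3, 4, 2
coords = itertools.product(range(1, nx + 1), range(1, ny + 1), range(1, nz + 1))
rows = [(x, y, z, "flux", float(i)) for i, (x, y, z) in enumerate(coords)]
columns = pd.MultiIndex.from_tuples(
    [("mesh 1", "x"), ("mesh 1", "y"), ("mesh 1", "z"), ("score", ""), ("mean", "")]
)
flux_df = pd.DataFrame(rows, columns=columns)

# Flatten the two column levels, skipping empty second-level labels,
# so ('mesh 1', 'x') becomes 'mesh 1_x' and ('mean', '') becomes 'mean'.
flat = flux_df.copy()
flat.columns = ["_".join(part for part in col if part) for col in flat.columns]

# One pivoted frame per z value: rows are y, columns are x, values are mean.
# groupby sorts its keys, so the list is ordered by increasing z.
axial_dfs = [
    group.pivot(index="mesh 1_y", columns="mesh 1_x", values="mean")
    for _, group in flat.groupby("mesh 1_z")
]

print(axial_dfs[0])
```

pivot expects each (y, x) pair to appear only once within a z slice, which holds for a regular mesh. If the real frame has several rows per cell (for example, more than one score or energy bin), pivot_table with an explicit aggfunc is probably the safer choice.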

In the following example, I use unstack to turn the second index level into a column index. Then, I use a list comprehension to split the result into a list determined by the levels of the first index.

import pandas as pd
import numpy as np

# Create simple example
data = np.random.randint(8, size=(8, 2))
levels = [['df1', 'df2'], ['a', 'b'], [1, 2]]
idx = pd.MultiIndex.from_product(levels, names=['first', 'second', 'third'])
df = pd.DataFrame(data, index=idx, columns=['col1', 'col2'])

# Step 1: unstack to get second level as column index
df = df.unstack(level='second')['col2']

# Step 2: get a list of chunks of df by first index level
first_unique = df.index.get_level_values('first').unique()
df_ls = [df.loc[x] for x in first_unique]
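The same two steps could be applied to the frame from the question roughly as follows. This is only a sketch: the tiny flux_df is a hypothetical stand-in with made-up 'mean' values, and the short level names 'z', 'y', 'x' and the variable names mean and by_z are my own choices.

```python
import itertools

import pandas as pd

# Hypothetical stand-in for flux_df (2 x 3 x 2 mesh, z varying fastest).
coords = list(itertools.product(range(1, 3), range(1, 4), range(1, 3)))  # (x, y, z)
columns = pd.MultiIndex.from_tuples(
    [("mesh 1", "x"), ("mesh 1", "y"), ("mesh 1", "z"), ("mean", "")]
)
flux_df = pd.DataFrame(
    [(x, y, z, float(i)) for i, (x, y, z) in enumerate(coords)], columns=columns
)

# Move the mesh coordinates into a (z, y, x) row MultiIndex on the 'mean' values.
mean = pd.Series(
    flux_df[("mean", "")].to_numpy(),
    index=pd.MultiIndex.from_arrays(
        [flux_df[("mesh 1", k)] for k in ("z", "y", "x")], names=["z", "y", "x"]
    ),
)

# Step 1: unstack x so it becomes the column index (rows are now (z, y)).
by_z = mean.unstack("x")

# Step 2: split into one frame per z value (rows y, columns x).
axial_dfs = [by_z.loc[z] for z in by_z.index.get_level_values("z").unique()]

print(axial_dfs[0])
```

Putting z first in the row index means by_z.loc[z] returns the whole y-by-x slice for that z in a single lookup.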
