Pandas row multi-index with multiple columns

I have a data frame, data, containing data recorded over time from a source:

   t  min  max ... some_value
 0.0  0.0  0.0 ...        0.0
 5.0  0.0  2.4 ...        1.9
10.0  0.0  6.7 ...        4.6
 ...  ...  ... ...        ...

I also have a data frame, source, containing information about the source:

type location some_info
   A      loc      info

I now want to add source to data. Since I have data from multiple sources, the combined frame should let me easily select all the data from a given source, and also look up the source information that belongs to the current data. My idea was to do this with a multi-index, so that I have something like

                           data
                              t  min  max ... some_value
source
  type location some_info
     A      loc      info   0.0  0.0  0.0 ...        0.0
                            5.0  0.0  2.4 ...        1.9
                           10.0  0.0  6.7 ...        4.6
   ...      ...       ...   ...  ...  ... ...        ...

Can this be done with a simple concatenation? It feels like it will be trickier than that.

If possible, I want to be able to iterate over the sources in a data frame containing data from multiple sources kind of like the following:

# Visit each distinct source once; .loc selects all rows for that source.
for source in full_frame.index.unique():
    source_data = full_frame.loc[[source]]
    do_something(source_data)

If this approach seems unnecessarily complicated, please let me know.

EDIT: Updated the look of the wanted result

Use:

result = pd.concat([source, data], axis=1, join='outer', keys=['source', 'data'])

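With axis=1 the two DataFrames are placed side by side and their rows are matched "by index", i.e. on the index labels of both DataFrames. The keys become the top level of the column index.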
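To make the "by index" behaviour concrete, here is a minimal sketch using toy frames shaped like the ones in the question. The values and the default integer index are assumptions made only for illustration.

```python
import pandas as pd

# Toy frames shaped like the question's examples; values are illustrative only.
data = pd.DataFrame({
    "t": [0.0, 5.0, 10.0],
    "min": [0.0, 0.0, 0.0],
    "max": [0.0, 2.4, 6.7],
    "some_value": [0.0, 1.9, 4.6],
})
source = pd.DataFrame({"type": ["A"], "location": ["loc"], "some_info": ["info"]})

result = pd.concat([source, data], axis=1, join="outer", keys=["source", "data"])

# The keys form the outer level of a two-level column index.
print(result.columns.tolist())

# Rows are matched on index labels: source only has label 0, so its
# columns are NaN for labels 1 and 2.
print(result["source"])

# Selecting the top-level key returns that block of columns.
print(result["data"])
```

Note that with a one-row source frame, the source columns are filled only on the row whose index label matches; the remaining rows hold NaN in those columns.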

I added join='outer' so that a row is kept even if its index value is present in only one of the two DataFrames; the columns coming from the other DataFrame are then filled with NaN for that row.
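The effect of the join argument can be seen in a small sketch with two hypothetical frames (left and right are made-up names, not from the question) whose index labels only partly overlap. As far as I know, 'outer' is also what pd.concat uses when join is not given.

```python
import pandas as pd

# Hypothetical frames whose index labels only partly overlap.
left = pd.DataFrame({"type": ["A", "B"]}, index=[0, 1])
right = pd.DataFrame({"t": [5.0, 10.0]}, index=[1, 2])

# 'outer' keeps every index label found in either frame.
outer = pd.concat([left, right], axis=1, join="outer", keys=["source", "data"])

# 'inner' keeps only the labels present in both frames.
inner = pd.concat([left, right], axis=1, join="inner", keys=["source", "data"])

print(outer)
print(inner)
```

With 'outer', labels 0 and 2 survive with NaN in the columns of the frame that lacks them; with 'inner', only label 1 remains.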
