Pandas row multi-index with multiple columns

Question

I have a data frame, data, containing values recorded over time from a source:

   t  min  max ... some_value
 0.0  0.0  0.0 ...        0.0
 5.0  0.0  2.4 ...        1.9
10.0  0.0  6.7 ...        4.6
 ...  ...  ... ...        ...

I also have a data frame, source, containing information about that source:

type location some_info
   A      loc      info

I now want to attach source to data so that I can easily select all the data belonging to a given source (I have data from multiple sources) and also look up the information about the source that the current data came from. My idea was to use a multi-index, giving something like:

                           data
                              t  min  max ... some_value
source
  type location some_info
     A      loc      info   0.0  0.0  0.0 ...        0.0
                            5.0  0.0  2.4 ...        1.9
                           10.0  0.0  6.7 ...        4.6
   ...      ...       ...   ...  ...  ... ...        ...

Can this be done with a simple concatenation? It feels like it will be trickier than that.

If possible, I want to be able to iterate over the sources in a data frame containing data from multiple sources kind of like the following:

for source in full_frame.index.unique():
    source_data = full_frame.loc[[source]]  # all rows recorded for this source
    do_something(source_data)

If this approach seems unnecessarily complicated, please let me know.

EDIT: Updated the look of the wanted result

Answer 1

Use:

result = pd.concat([source, data], axis=1, join='outer', keys=['source', 'data'])

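The concatenation aligns the rows of both DataFrames by index. With axis=1 the frames are placed side by side, and keys adds 'source' and 'data' as the outer level of the columns.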
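As a minimal sketch of what this call produces, the frames below are modelled on the tables in the question. The values come from there, but the default integer index and the omission of the elided columns are assumptions.

```python
import pandas as pd

# Small frames modelled on the question (default integer index assumed).
data = pd.DataFrame({
    "t": [0.0, 5.0, 10.0],
    "min": [0.0, 0.0, 0.0],
    "max": [0.0, 2.4, 6.7],
    "some_value": [0.0, 1.9, 4.6],
})
source = pd.DataFrame({"type": ["A"], "location": ["loc"], "some_info": ["info"]})

result = pd.concat([source, data], axis=1, join="outer", keys=["source", "data"])

# The keys form the outer level of a two-level column index.
print(result.columns.nlevels)

# Selecting an outer key returns the columns of the corresponding frame.
print(result["data"])

# A single cell is addressed with a (key, column) tuple.
print(result.loc[0, ("source", "type")])
```

Note that the keys end up on the columns, not on the rows: the row index of result is still the plain integer index shared by the two frames.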

I added join='outer' (which is also the default) so that a row is kept even if its index value is present in only one of the DataFrames; the cells missing from the other DataFrame are filled with NaN.
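To illustrate the difference the join makes, here is a small sketch with a one-row source and a three-row data frame. The shapes follow the question, while the default integer index and the reduced column set are assumptions.

```python
import pandas as pd

data = pd.DataFrame({"t": [0.0, 5.0, 10.0], "some_value": [0.0, 1.9, 4.6]})
source = pd.DataFrame({"type": ["A"], "location": ["loc"], "some_info": ["info"]})

outer = pd.concat([source, data], axis=1, join="outer", keys=["source", "data"])
inner = pd.concat([source, data], axis=1, join="inner", keys=["source", "data"])

# Outer join: every index value from either frame is kept.
print(len(outer))

# Inner join: only index values present in both frames are kept.
print(len(inner))

# For each row, True if all of its source cells are NaN (no matching source row).
print(outer["source"].isna().all(axis=1).tolist())
```

Because source has a single row here, only the row with index 0 carries the source information; the remaining rows hold NaN in the source columns.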
