Append a pandas.DataFrame to one column of another pandas.DataFrame

Question

A similar question has already been asked here, but the exact answer to what the OP wanted was not provided, so I will try again. I have data with 4 columns, named ['date', 'log time', 'choice', 'dt between ROI']. I would now like to filter this data based on two criteria, here named ['LED', 'drum']. In other words, if particular rows of the original pandas.DataFrame correspond to 'LED', they get sorted under the 'LED' column of the master dataframe, and if they correspond to 'drum', they get sorted under the 'drum' column of the master dataframe. In this way, both the 'LED' and 'drum' columns would have the same 4 subcolumns as the original data, ['date', 'log time', 'choice', 'dt between ROI']. Additionally, the 'LED' and 'drum' columns would not necessarily have the same number of rows.

To start, I created the master dataframe with the structure described above:

import pandas

master_df = pandas.DataFrame({
    'distraction': ['LED', 'LED', 'LED', 'LED', 'drum', 'drum', 'drum', 'drum'],
    '': ['date', 'log time', 'choice', 'dt between ROI', 'date', 'log time', 'choice', 'dt between ROI']
})

master_df = master_df.set_index(['distraction', '']).transpose()

This resulted in the desired final structure:

In: master_df
Out:
distraction     LED                                             drum
                date    log time    choice    dt between ROI    date    log time    choice    dt between ROI

In: master_df['LED']
Out:
date    log time    choice    dt between ROI
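The same empty two-level layout can also be built directly from a column MultiIndex, without the set_index/transpose step. This is only a sketch of one possible alternative, assuming the group and field names from the question; the variable names groups, fields and columns are introduced here for illustration.

```python
import pandas as pd

# Names taken from the question.
groups = ['LED', 'drum']
fields = ['date', 'log time', 'choice', 'dt between ROI']

# Two-level column index: outer level is the group, inner level is the field.
columns = pd.MultiIndex.from_product([groups, fields], names=['distraction', ''])

# Empty frame (no rows yet) carrying the two-level columns.
master_df = pd.DataFrame(columns=columns)

# Selecting a group drops the outer level and leaves the four fields.
print(list(master_df['LED'].columns))
```

Selecting master_df['LED'] or master_df['drum'] then returns a frame with the four original column names, as in the output above.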

Next, my filtering function returns certain rows of the original dataframe:

output = filter_function(original_df)

Hence, output has the same structure as original_df:

In: output
Out:
date    log time    choice    dt between ROI
x1      x2          x3        x4
y1      y1          y3        y4

Then I tried appending this output to the master dataframe like so:

master_df['LED'] = master_df['LED'].append(output, ignore_index=True)

which resulted in the following error:

ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series

Next, I tried:

master_df = master_df['LED'].append(output, ignore_index=True)

which simply overwrote the structure created above. What I really want to achieve is this:

In: master_df['LED'].append(output, ignore_index=True)
Out:
LED                                             drum
date    log time    choice    dt between ROI    date    log time    choice    dt between ROI
x1      x2          x3        x4
y1      y1          y3        y4

and likewise:

In: master_df['drum'].append(output, ignore_index=True)
Out:
LED                                             drum
date    log time    choice    dt between ROI    date    log time    choice    dt between ROI
                                                x1      x2          x3        x4
                                                y1      y1          y3        y4

I am not sure whether pandas can handle empty rows, but I guess NaN would be OK. After the filtering is done, I then wish to recall the two filtered datasets by simply calling master_df['LED'] or master_df['drum']. Is there a way to do this?

Many thanks for your help!

EDIT: Fixed criterium -> distraction to avoid confusion.

Answer 1

@Dani's point is that your code does not work as posted. It does work if one replaces 'criterium' with 'distraction', which I assume was the intent.

So, to your question: you can prepend another level to the column multi-index of your output so that it matches the column structure of master_df. Then you can safely append or concat:

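import pandas as pd

# if this goes into the LED group; otherwise use 'drum'
output_LED = pd.concat([output], keys=['LED'], axis=1)
# note: DataFrame.append was removed in pandas 2.0; there, use pd.concat([master_df, output_LED])
master_df2 = master_df.append(output_LED)
master_df2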

produces

     LED                                             drum
     choice    date    dt between ROI    log time    choice    date    dt between ROI    log time
0    x3        x1      x4                x2          NaN       NaN     NaN               NaN
1    y3        y1      y4                y1          NaN       NaN     NaN               NaN
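For readers on pandas 2.0 or later, where DataFrame.append is no longer available, the same step can be sketched with pd.concat alone. The snippet below is a self-contained illustration, not the answerer's exact code: output is a stand-in built from the placeholder values in the question, and master_df is rebuilt here as an empty frame with the same two-level columns.

```python
import pandas as pd

fields = ['date', 'log time', 'choice', 'dt between ROI']

# Empty master frame with two-level columns (rebuilt here for a runnable example).
master_df = pd.DataFrame(
    columns=pd.MultiIndex.from_product([['LED', 'drum'], fields])
)

# Stand-in for filter_function(original_df); placeholder values from the question.
output = pd.DataFrame(
    [['x1', 'x2', 'x3', 'x4'], ['y1', 'y1', 'y3', 'y4']], columns=fields
)

# Add 'LED' as an outer column level so the columns line up with master_df.
output_LED = pd.concat([output], keys=['LED'], axis=1)

# Stack the rows under master_df; columns missing from output_LED become NaN.
master_df2 = pd.concat([master_df, output_LED], ignore_index=True)

print(master_df2['LED'])
print(master_df2['drum'])
```

The column order of the result may differ from the output shown above, since it follows the order in master_df rather than a sorted order.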

and

master_df2['LED']

produces

     choice    date    dt between ROI    log time
0    x3        x1      x4                x2
1    y3        y1      y4                y1

and

master_df2['drum']

produces

     choice    date    dt between ROI    log time
0    NaN       NaN     NaN               NaN
1    NaN       NaN     NaN               NaN
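If both filtered results are available at the same time, one possible alternative is to build the grouped frame in a single pd.concat call from a dict, which also covers the case where the two groups have different numbers of rows. This is a sketch under assumptions: led_rows and drum_rows are hypothetical names for the two filter results, and their values are placeholders.

```python
import pandas as pd

fields = ['date', 'log time', 'choice', 'dt between ROI']

# Hypothetical filter results; the two groups deliberately differ in row count.
led_rows = pd.DataFrame(
    [['x1', 'x2', 'x3', 'x4'], ['y1', 'y1', 'y3', 'y4']], columns=fields
)
drum_rows = pd.DataFrame([['z1', 'z2', 'z3', 'z4']], columns=fields)

# Dict keys become the outer column level; rows are aligned on the index,
# so the shorter group is padded with NaN.
master_df = pd.concat({'LED': led_rows, 'drum': drum_rows}, axis=1)

print(master_df['LED'])
# Drop the padding rows when recalling the shorter group.
print(master_df['drum'].dropna(how='all'))
```

Because the padding is plain NaN, dropna(how='all') recovers each group at its original length when it is recalled.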

Answer 1
0, accepted, 2020-11-23 21:43:09