Appending a pandas.DataFrame to one column of another pandas.DataFrame
A similar question has already been asked here, but it did not get an exact answer to what the OP wanted, so I will try again.

I have data with 4 columns, named ['date', 'log time', 'choice', 'dt between ROI']. I would now like to filter this data based on two criteria, here named ['LED', 'drum']. In other words, if particular rows of the original pandas.DataFrame correspond to 'LED', they get sorted under the 'LED' column of the master dataframe, and if they correspond to 'drum', they get sorted under the 'drum' column of the master dataframe. In this way, both the 'LED' and 'drum' columns would have the same 4 subcolumns as the original data, ['date', 'log time', 'choice', 'dt between ROI']. Additionally, the 'LED' and 'drum' columns would not necessarily have the same number of rows.
To start, I created the master dataframe with the structure described above:
import pandas

master_df = pandas.DataFrame({
    'distraction': ['LED', 'LED', 'LED', 'LED', 'drum', 'drum', 'drum', 'drum'],
    '': ['date', 'log time', 'choice', 'dt between ROI', 'date', 'log time', 'choice', 'dt between ROI']
})
master_df = master_df.set_index(['distraction', '']).transpose()
This resulted in the desired final structure:
In: master_df
Out:
distraction  LED                                       drum
             date  log time  choice  dt between ROI    date  log time  choice  dt between ROI
In: master_df['LED']
Out:
date log time choice dt between ROI
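For reference, the same empty two-level layout can also be built directly from a column MultiIndex. This is only a minimal sketch, assuming pd.MultiIndex.from_product is acceptable here; the name subcolumns is introduced for illustration and is not part of the original code.

```python
import pandas as pd

# Illustrative name for the four original column labels.
subcolumns = ['date', 'log time', 'choice', 'dt between ROI']

# Outer level: the distraction; inner level: the original column names.
columns = pd.MultiIndex.from_product(
    [['LED', 'drum'], subcolumns], names=['distraction', '']
)

# Empty frame (0 rows) with 8 two-level columns.
master_df = pd.DataFrame(columns=columns)

# Selecting an outer key returns a frame with the 4 inner columns.
print(master_df['LED'].columns.tolist())
```

Selecting master_df['LED'] or master_df['drum'] on this frame gives the four subcolumns, as in the transposed version above.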
Next, my filtering function returns certain rows of the original dataframe:
output = filter_function(original_df)
Hence, output has the same structure as original_df:
In: output
Out:
date  log time  choice  dt between ROI
x1    x2        x3      x4
y1    y1        y3      y4
Then I tried appending this output to the created master dataframe like so:
master_df['LED'] = master_df['LED'].append(output, ignore_index=True)
which resulted in the following error:
ValueError: Cannot set a frame with no defined index and a value that cannot be converted to a Series
Next, I tried:
master_df = master_df['LED'].append(output, ignore_index=True)
which simply overwrote the structure created above. What I really want to achieve is this:
In: master_df['LED'].append(output, ignore_index=True)
Out:
LED                                       drum
date  log time  choice  dt between ROI    date  log time  choice  dt between ROI
x1    x2        x3      x4
y1    y1        y3      y4
and likewise:
In: master_df['drum'].append(output, ignore_index=True)
Out:
LED                                       drum
date  log time  choice  dt between ROI    date  log time  choice  dt between ROI
                                          x1    x2        x3      x4
                                          y1    y1        y3      y4
I am not sure whether pandas can handle empty rows, but I guess NaN would be OK. After the filtering is done, I would like to retrieve the two filtered datasets by simply calling master_df['LED'] or master_df['drum']. Is there a way to do this?
Many thanks for your help!
EDIT: Fixed criterium -> distraction to avoid confusion.
The point of @Dani is that your code does not work. Anyway, it works if one replaces 'criterium' with 'distraction', which I assume was the intent.
So, to your question: you can prepend your output with another level of column multi-index so that it matches the column structure of master_df. Then you can safely append or concat.
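import pandas as pd

# if this goes into LED group; otherwise use 'drum'
output_LED = pd.concat([output], keys=['LED'], axis=1)
# DataFrame.append was removed in pandas 2.0; there, use pd.concat([master_df, output_LED])
master_df2 = master_df.append(output_LED)
master_df2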
produces
   LED                                     drum
   choice  date  dt between ROI  log time  choice  date  dt between ROI  log time
0  x3      x1    x4              x2        NaN     NaN   NaN             NaN
1  y3      y1    y4              y1        NaN     NaN   NaN             NaN
and
master_df2['LED']
produces
   choice  date  dt between ROI  log time
0  x3      x1    x4              x2
1  y3      y1    y4              y1
and
master_df2['drum']
produces
   choice  date  dt between ROI  log time
0  NaN     NaN   NaN             NaN
1  NaN     NaN   NaN             NaN
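On pandas 2.0 and later, where DataFrame.append is no longer available, the same idea of adding an outer column level can be sketched with pd.concat alone. The snippet below is a minimal sketch rather than the code from the question: led_rows and drum_rows are hypothetical stand-ins for two filter results, and their values are placeholders. It also shows what happens when the two groups have different numbers of rows.

```python
import pandas as pd

subcolumns = ['date', 'log time', 'choice', 'dt between ROI']

# Hypothetical filter results; the values are placeholders, not real data.
led_rows = pd.DataFrame([['x1', 'x2', 'x3', 'x4'],
                         ['y1', 'y2', 'y3', 'y4']], columns=subcolumns)
drum_rows = pd.DataFrame([['z1', 'z2', 'z3', 'z4']], columns=subcolumns)

# With axis=1, the dict keys become the outer level of the column index.
# Rows are aligned on the index, so the shorter group is padded with NaN.
master_df = pd.concat({'LED': led_rows, 'drum': drum_rows}, axis=1)

# Each group is recovered by its key; dropna(how='all') removes padding rows.
led = master_df['LED'].dropna(how='all')
drum = master_df['drum'].dropna(how='all')

print(master_df)
```

Because the padding is plain NaN, dropna(how='all') would also drop a genuine row whose four values are all missing, so this assumes the filtered rows always contain at least one non-missing value.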