Split pandas dataframe into multiple dataframes with list of lists as mask
I have a pandas dataframe that looks something like this:
A BB
1 foo.bar
2 foo.bar
3 foo.foo
4 foo.bar
5 foo.bar
6 foo.foo
I basically expect to get two dataframes out of it, based on this list of lists:
[[False, False, True], [False, False, True]]
OUTPUT should be:
df1:
A BB
1 foo.bar
2 foo.bar
3 foo.foo
df2:
A BB
4 foo.bar
5 foo.bar
6 foo.foo
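To make the example reproducible, the sample frame and the mask can be built as below. This is a minimal sketch: the names `df`, `mask` and `frames` are assumptions, and treating each sublist's length as the size of one consecutive block of rows is only one reading of how the mask maps onto the dataframe.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6],
    "BB": ["foo.bar", "foo.bar", "foo.foo", "foo.bar", "foo.bar", "foo.foo"],
})
mask = [[False, False, True], [False, False, True]]

# Assumption: each sublist covers a consecutive run of rows, so its length
# is the number of rows in the corresponding sub-dataframe.
frames = []
start = 0
for part in mask:
    stop = start + len(part)
    frames.append(df.iloc[start:stop])  # positional slice, end-exclusive
    start = stop

df1, df2 = frames
```

If the mask is not available up front and the blocks are instead defined by the 'foo.foo' rows, the split points have to be derived from the BB column.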
You can get the rows where df.BB equals 'foo.foo', shift that boolean mask down by one row, and take its cumulative sum to label each block. You end up with a groupby object that you can turn into a list of sub-dfs.
>>> groups = df.groupby(df.BB.eq('foo.foo').shift(fill_value=0).cumsum())
>>> frames = [frame for _, frame in groups]
>>> frames # list of sub-dfs
[ A BB
0 1 foo.bar
1 2 foo.bar
2 3 foo.foo,
A BB
3 4 foo.bar
4 5 foo.bar
5 6 foo.foo]
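It may help to look at the grouping key one step at a time. The sketch below rebuilds the sample frame and uses the hypothetical intermediate names `is_end`, `starts_after` and `key`. It passes `fill_value=False` rather than `0` so that the shifted mask stays boolean, which is an assumption of this sketch and not part of the original one-liner.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6],
    "BB": ["foo.bar", "foo.bar", "foo.foo", "foo.bar", "foo.bar", "foo.foo"],
})

# True on the last row of each block
is_end = df["BB"].eq("foo.foo")

# Move each True down one row, so it marks the first row of the next block
starts_after = is_end.shift(fill_value=False)

# Running count of block starts: rows in the same block share a label
key = starts_after.cumsum()

frames = [frame for _, frame in df.groupby(key)]
```

Without the shift, each 'foo.foo' row would open a new group instead of closing the current one.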
Numpy:
flatnonzero to find where the 'foo.foo' rows are
split to divide the dataframe up accordingly
import numpy as np
np.split(df, np.flatnonzero(df.BB.eq('foo.foo'))[:-1] + 1)
[ A BB
0 1 foo.bar
1 2 foo.bar
2 3 foo.foo,
A BB
3 4 foo.bar
4 5 foo.bar
5 6 foo.foo]
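To see what is passed to split, the cut points can be computed on their own. The sketch below uses the hypothetical names `hits` and `cuts`. It slices with `iloc` instead of calling `np.split` on the dataframe, because some recent pandas releases may emit a deprecation warning when a DataFrame is passed to `np.split`. This is a swapped-in slicing step, not the method shown above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6],
    "BB": ["foo.bar", "foo.bar", "foo.foo", "foo.bar", "foo.bar", "foo.foo"],
})

# Positions of the 'foo.foo' rows
hits = np.flatnonzero(df["BB"].eq("foo.foo"))

# Drop the last hit (it ends the final block) and cut just after the others
cuts = hits[:-1] + 1

# Turn the cut points into (start, stop) pairs and slice by position
bounds = [0, *cuts.tolist(), len(df)]
frames = [df.iloc[a:b] for a, b in zip(bounds[:-1], bounds[1:])]
```

Dropping the last hit assumes the final row of the dataframe is itself a 'foo.foo' row.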
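Addressing @mozway's comment: keep every split point and filter out the empty trailing frame, so a dataframe whose last row is not 'foo.foo' is still split correctly.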
list(filter(
lambda d: not d.empty,
np.split(df, np.flatnonzero(df.BB.eq('foo.foo')) + 1)
))
[ A BB
0 1 foo.bar
1 2 foo.bar
2 3 foo.foo,
A BB
3 4 foo.bar
4 5 foo.bar
5 6 foo.foo]
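The same idea can be wrapped in a small helper. This is a sketch under stated assumptions: `split_after_matches` is a hypothetical name, and the seventh row in the sample data is added here only to show what happens to rows after the last 'foo.foo'.

```python
import numpy as np
import pandas as pd

def split_after_matches(df, column, value):
    """Split df into consecutive pieces, each ending at a row where column == value."""
    ends = np.flatnonzero(df[column].eq(value)) + 1
    bounds = [0, *ends.tolist()]
    if bounds[-1] < len(df):
        bounds.append(len(df))  # keep trailing rows that follow the last match
    return [df.iloc[a:b] for a, b in zip(bounds[:-1], bounds[1:])]

# Sample data with one extra trailing row (an assumption for illustration)
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5, 6, 7],
    "BB": ["foo.bar", "foo.bar", "foo.foo", "foo.bar",
           "foo.bar", "foo.foo", "foo.bar"],
})
frames = split_after_matches(df, "BB", "foo.foo")
```

Because the final boundary is only appended when rows remain, no empty frame is produced and no filtering step is needed.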