Split data frame into multiple data frames based on unique column combinations
I have the following data frame:
import pandas as pd
units = [1, 1, 1, 5, 5, 5]
locations = [30, 30, 30, 32, 32, 32]
timestamps = [1, 2, 3, 1, 2, 3]
quantities = [1, 5, 3, 10, 35, 39]
data = {'units': units, 'locations': locations, 'timestamps': timestamps,
'quantities': quantities}
df = pd.DataFrame(data=data)
It looks like this:
>>> df
   units  locations  timestamps  quantities
0      1         30           1           1
1      1         30           2           5
2      1         30           3           3
3      5         32           1          10
4      5         32           2          35
5      5         32           3          39
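I need to get a list of data frames, one for each unique combination of units and locations, i.e. something built on df.groupby(['units', 'locations']). The final result should look like this: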
(1, 30)
   timestamps  quantities
0           1           1
1           2           5
2           3           3
(5, 32)
   timestamps  quantities
3           1          10
4           2          35
5           3          39
Is this possible?
Run a dictionary comprehension over the groupby. You can read more about this in the pandas documentation on groupby (split-apply-combine):
d = {name: group.filter(['timestamps', 'quantities'])
     for name, group in df.groupby(['units', 'locations'])}
#print(d.keys())
#dict_keys([(1, 30), (5, 32)])
print(d[(1, 30)])
   timestamps  quantities
0           1           1
1           2           5
2           3           3
print(d[(5, 32)])
   timestamps  quantities
3           1          10
4           2          35
5           3          39
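Since the question asks for a list rather than a dictionary, the same idea can be sketched with a list comprehension. This is only a minimal illustration: the names key_cols, pairs, keys and frames are made up for this example and do not come from the answer above, and drop(columns=...) is used here in place of filter to remove the grouping columns.

```python
import pandas as pd

df = pd.DataFrame({
    'units': [1, 1, 1, 5, 5, 5],
    'locations': [30, 30, 30, 32, 32, 32],
    'timestamps': [1, 2, 3, 1, 2, 3],
    'quantities': [1, 5, 3, 10, 35, 39],
})

# Hypothetical name for the columns that define a group.
key_cols = ['units', 'locations']

# One (key, frame) pair per unique (units, locations) combination,
# with the grouping columns dropped from each frame.
pairs = [(name, group.drop(columns=key_cols))
         for name, group in df.groupby(key_cols)]

keys = [name for name, _ in pairs]      # group keys, in sorted order
frames = [frame for _, frame in pairs]  # the list of data frames

for name, frame in pairs:
    print(name)
    print(frame)
```

Keeping the keys next to the frames avoids losing track of which combination each frame belongs to; if only the frames are needed, the frames list alone is enough.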
Another approach is to combine a dict comprehension with groupby and concat:
d = pd.concat({combo: data for combo, data in df.groupby(['units', 'locations'])})
print(d)
        units  locations  timestamps  quantities
1 30 0      1         30           1           1
     1      1         30           2           5
     2      1         30           3           3
5 32 3      5         32           1          10
     4      5         32           2          35
     5      5         32           3          39
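The concatenated result is a single frame with a three-level index (units, locations, original row label), not a collection of separate frames. A possible way to pull one combination back out is a cross-section with DataFrame.xs. This is a sketch that goes beyond the answer above; the names combined and subset are chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'units': [1, 1, 1, 5, 5, 5],
    'locations': [30, 30, 30, 32, 32, 32],
    'timestamps': [1, 2, 3, 1, 2, 3],
    'quantities': [1, 5, 3, 10, 35, 39],
})

# Same construction as in the answer: tuple keys become index levels.
combined = pd.concat({combo: group
                      for combo, group in df.groupby(['units', 'locations'])})

# Select the rows for units == 1 and locations == 30.
# xs drops the matched index levels, leaving the original row labels.
subset = combined.xs((1, 30))
print(subset)
```

This form is convenient when the groups should stay in one object but still be addressable by their key.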
You're right, it is just groupby:
cols = ['units','locations']
for k, d in df.drop(cols, axis=1).groupby([df[c] for c in cols]):
    print(k)
    print(d)
Output:
(1, 30)
   timestamps  quantities
0           1           1
1           2           5
2           3           3
(5, 32)
   timestamps  quantities
3           1          10
4           2          35
5           3          39
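If only one combination is needed, looping over every group may be unnecessary. A minimal sketch, assuming the key of interest is known in advance, uses get_group on the groupby object. This is not part of the answers above, and the names grouped and one_group are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'units': [1, 1, 1, 5, 5, 5],
    'locations': [30, 30, 30, 32, 32, 32],
    'timestamps': [1, 2, 3, 1, 2, 3],
    'quantities': [1, 5, 3, 10, 35, 39],
})

grouped = df.groupby(['units', 'locations'])

# Number of unique (units, locations) combinations.
print(grouped.ngroups)

# Fetch the rows for a single combination by its key tuple.
one_group = grouped.get_group((5, 32))
print(one_group[['timestamps', 'quantities']])
```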