Python Pandas merge multiple columns into a dictionary column
I have a DataFrame (df_full) like this:
|cust_id|address |store_id|email |sales_channel|category|
-------------------------------------------------------------------
|1234567|123 Main St|10SjtT |idk@gmail.com|ecom |direct |
|4567345|345 Main St|10SjtT |101@gmail.com|instore |direct |
|1569457|876 Main St|51FstT |404@gmail.com|ecom |direct |
I want to combine the last 4 fields into a single metadata field that holds a dictionary, like this:
|cust_id|address |metadata |
-------------------------------------------------------------------------------------------------------------------
|1234567|123 Main St|{'store_id':'10SjtT', 'email':'idk@gmail.com','sales_channel':'ecom', 'category':'direct'} |
|4567345|345 Main St|{'store_id':'10SjtT', 'email':'101@gmail.com','sales_channel':'instore', 'category':'direct'}|
|1569457|876 Main St|{'store_id':'51FstT', 'email':'404@gmail.com','sales_channel':'ecom', 'category':'direct'} |
Is that possible? I've seen a few solutions on Stack Overflow, but none of them address combining more than 2 fields into a dictionary field.
set_index
df.set_index(['cust_id', 'address']).apply(dict, axis=1).reset_index(name='metadata')
cust_id address metadata
0 1234567 123 Main St {'store_id': '10SjtT', 'email': 'idk@gmail.com...
1 4567345 345 Main St {'store_id': '10SjtT', 'email': '101@gmail.com...
2 1569457 876 Main St {'store_id': '51FstT', 'email': '404@gmail.com...
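For anyone who wants to run the set_index approach end to end, here is a minimal sketch. It rebuilds the sample data from the question under the name df (the question calls it df_full), and the keep list is just an illustrative name for the columns that stay outside the dictionary.

```python
import pandas as pd

# Sample data copied from the question (named df_full there).
df = pd.DataFrame({
    "cust_id": [1234567, 4567345, 1569457],
    "address": ["123 Main St", "345 Main St", "876 Main St"],
    "store_id": ["10SjtT", "10SjtT", "51FstT"],
    "email": ["idk@gmail.com", "101@gmail.com", "404@gmail.com"],
    "sales_channel": ["ecom", "instore", "ecom"],
    "category": ["direct", "direct", "direct"],
})

# Columns to keep as ordinary columns (illustrative name).
keep = ["cust_id", "address"]

result = (
    df.set_index(keep)              # move the kept columns into the index
      .apply(dict, axis=1)          # turn each remaining row into a dict
      .reset_index(name="metadata") # restore the kept columns, name the dict column
)

print(result.loc[0, "metadata"])
```

Because every column not listed in keep ends up in the dictionary, this works for any number of fields, not just the four shown here.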
dat = [(c, a, dict(zip([*df][2:], m))) for c, a, *m in zip(*map(df.get, df))]
pd.DataFrame(dat, df.index, [*df][:2] + ['metadata'])
cust_id address metadata
0 1234567 123 Main St {'store_id': '10SjtT', 'email': 'idk@gmail.com...
1 4567345 345 Main St {'store_id': '10SjtT', 'email': '101@gmail.com...
2 1569457 876 Main St {'store_id': '51FstT', 'email': '404@gmail.com...
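The one-liner above is dense, so here is a hedged, step-by-step sketch of what it appears to do, written as an explicit loop. The variable names (key_cols, meta_cols, records) are illustrative and not part of the original answer; the sample data is copied from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "cust_id": [1234567, 4567345, 1569457],
    "address": ["123 Main St", "345 Main St", "876 Main St"],
    "store_id": ["10SjtT", "10SjtT", "51FstT"],
    "email": ["idk@gmail.com", "101@gmail.com", "404@gmail.com"],
    "sales_channel": ["ecom", "instore", "ecom"],
    "category": ["direct", "direct", "direct"],
})

columns = list(df)          # same as [*df]: the column names in order
key_cols = columns[:2]      # cust_id, address
meta_cols = columns[2:]     # everything that goes into the dictionary

# zip(*columns) transposes the column Series into row tuples,
# which is what zip(*map(df.get, df)) does in the one-liner.
rows = zip(*(df[col] for col in columns))

records = []
for cust_id, address, *meta_values in rows:
    # Pair each metadata column name with its value for this row.
    records.append((cust_id, address, dict(zip(meta_cols, meta_values))))

result = pd.DataFrame(records, index=df.index, columns=key_cols + ["metadata"])
print(result)
```

This variant avoids apply and builds the rows in plain Python, which is often faster on larger frames, though that is worth measuring on your own data.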