Make a zip() pandas column to summarize other columns with a unique index

Question

I have a DataFrame with 3 columns:

store
product
price

For each store we have multiple products, but each product has a unique price. The DataFrame is hence composed of multiple rows for the same store, each row corresponding to a product.
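To make the layout concrete, here is a small sketch of such a DataFrame. The store names, products and prices are made-up sample values and do not come from the original data.

```python
import pandas as pd

# Hypothetical sample data: one row per (store, product) pair.
df = pd.DataFrame({
    "store": ["A", "A", "B", "B", "B"],
    "product": ["apple", "pear", "apple", "milk", "bread"],
    "price": [1.0, 1.5, 1.2, 0.9, 2.5],
})

# Each store appears on several rows, one per product.
rows_per_store = df.groupby("store").size()
print(rows_per_store)
```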

I would like to transform the dataset to get only one row per store, plus a compound column that sums up the products and prices as follows:

[(product_1,price_1),(product_2,price_2), ...]

So far I haven't been able to do it.

What I have done so far is group by store, aggregate on product, and apply the .unique() function. That gives me, for each store, a list of all the products, but not the prices. When I try to add price to the .agg() call followed by .unique(), it doesn't work, and I have no clue how to do this.
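The exact code of this attempt is not shown, so the following is only a guess at what it may have looked like, run on made-up sample data. It illustrates why the result contains products but no prices: only the product column is selected before .unique() is applied.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({
    "store": ["A", "A", "B", "B", "B"],
    "product": ["apple", "pear", "apple", "milk", "bread"],
    "price": [1.0, 1.5, 1.2, 0.9, 2.5],
})

# Group by store and collect the distinct products of each store.
# The result is a Series indexed by store whose values are arrays of products;
# the price column is not part of it.
products = df.groupby("store")["product"].unique()
print(products)
```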

I guess I might have to apply some zipping at some point, such as zip(product, price), but I haven't managed to get that far.

Any help is appreciated, thanks!

Answer 1

# Build one row per store; "result" holds a list of (product, price) tuples.
df.groupby("store", as_index=False).apply(
    lambda x: pd.Series({"store": x["store"].iloc[0],
                         "result": [(val["product"], val["price"]) for idx, val in x.iterrows()]}))
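As a possible alternative that follows the zip(product, price) idea from the question, the pairs can be built first and then collected per store. This is only a sketch on made-up sample data; the column name "result" mirrors the answer above, and the intermediate name `pairs` is an arbitrary choice.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({
    "store": ["A", "A", "B", "B", "B"],
    "product": ["apple", "pear", "apple", "milk", "bread"],
    "price": [1.0, 1.5, 1.2, 0.9, 2.5],
})

# Step 1: zip product and price row by row into a column of tuples.
pairs = df.assign(result=list(zip(df["product"], df["price"])))

# Step 2: collect the tuples of each store into a single list,
# which leaves one row per store.
result = pairs.groupby("store")["result"].agg(list).reset_index()
print(result)
```

This avoids iterrows(), which tends to be slow on large frames, at the cost of one temporary column.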

Solution 1
0 Accepted 2020-10-27 12:31:03