Python - Look up a value from another df multiple times

Question

I have the following two dataframes:

   prod_id       land_ids
0  1             [1,2]
1  2             [1]
2  3             [2,3,4]
3  4             []
4  5             [3,4]

   land_id       land_desc
0  1             germany
1  2             austria
2  3             switzerland
3  4             italy

Basically, I want each number in the land_ids column to be looked up individually in the other df.

The result should look something like this:

   prod_id       land_ids  list_land
0  1             [1,2]     germany austria
1  2             [1]       germany
2  3             [2,3,4]   austria switzerland italy
3  4             []     
4  5             [3,4]     switzerland italy

Preferably, the list_land column is a single string with the land names concatenated. But I would also be fine with getting a list as a result.

Any idea on how to do this?

Here is my code for creating the dfs:

import pandas as pd

data_prod = {'prod_id': [1,2,3,4,5], 'land_ids': [[1,2],[1],[2,3,4],[1,3],[3,4]]}
prod_df = pd.DataFrame(data_prod)

data_land = {'land_id': [1,2,3,4], 'land_desc': ['germany', 'austria', 'switzerland', 'italy']}
land_df = pd.DataFrame(data_land)

EDIT: What do I have to add if one value of land_ids is empty?

Answer 1

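import pandas as pd

df1 = pd.DataFrame({"prod_id":[1,2,3,4,5],"land_ids":[[1,2],[1],[2,3,4],[1,3],[3,4]]})
# The description column must be named land_desc, since the lookup below reads 'land_desc'.
df2 = pd.DataFrame({"land_id":[1,2,3,4],"land_desc":["germany","austria","switzerland","italy"]})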

df2 = df2.set_index('land_id', drop=True)
df1['list_land'] = df1['land_ids'].apply(lambda x: [df2.at[ids, 'land_desc'] for ids in x])

If you want list_land as a string, then you can do it like this:

df1['list_land'] = df1['land_ids'].apply(lambda x: " ".join([df2.at[ids, 'land_desc'] for ids in x]))
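Regarding the EDIT in the question: an empty list simply produces an empty comprehension, so the join returns an empty string and nothing extra is needed. The sketch below illustrates this with a plain dict lookup instead of df2.at. The name land_map is mine, and skipping ids that do not appear in land_df is an assumption about the desired behaviour, not something the question specifies.

```python
import pandas as pd

prod_df = pd.DataFrame({
    "prod_id": [1, 2, 3, 4, 5],
    "land_ids": [[1, 2], [1], [2, 3, 4], [], [3, 4]],  # prod_id 4 has no lands
})
land_df = pd.DataFrame({
    "land_id": [1, 2, 3, 4],
    "land_desc": ["germany", "austria", "switzerland", "italy"],
})

# Build the lookup once: land_id -> land_desc (land_map is a hypothetical name).
land_map = land_df.set_index("land_id")["land_desc"].to_dict()

# An empty list yields an empty generator, so " ".join(...) returns "".
# Assumption: ids that are missing from land_map are silently skipped.
prod_df["list_land"] = prod_df["land_ids"].apply(
    lambda ids: " ".join(land_map[i] for i in ids if i in land_map)
)

print(prod_df)
```

If unknown ids should raise an error instead, drop the `if i in land_map` filter.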

Answer 2

You can use the apply method:

prod_df['list_land'] = prod_df['land_ids'].apply(lambda x: [land_df.loc[land_df['land_id'] == y]['land_desc'].values[0] for y in x])

In this case, the list_land column is a list. You can use the following code if you want it to be a string.

prod_df['list_land'] = prod_df['land_ids'].apply(lambda x: ' '.join([land_df.loc[land_df['land_id'] == y]['land_desc'].values[0] for y in x]))

Answer 3

Maybe something like this:

import pandas as pd 


df1 = pd.DataFrame({"prod_id":[1,2,3,4,5],"land_ids":[[1,2],[1],[2,3,4],[1,3],[3,4]]})
df2 = pd.DataFrame({"land_id":[1,2,3,4],"land_desc":["germany","austria","switzerland","italy"]})

list_land = []

# For each product row, collect the description of every matching land_id.
for index, row in df1.iterrows():
    list_land.append([row2.land_desc for land_id in row["land_ids"] for _, row2 in df2.iterrows() if row2.land_id == land_id])
df1["list_land"] = list_land
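The nested iterrows loop scans df2 once for every id, which can get slow on larger frames. As a possible alternative, here is a minimal sketch that flattens the lists with explode, looks each id up once, and joins the names back per row. It is not taken from any of the answers; the names land_map, exploded, names and joined are mine, and returning an empty string for rows with an empty list is an assumption based on the expected output in the question.

```python
import pandas as pd

prod_df = pd.DataFrame({
    "prod_id": [1, 2, 3, 4, 5],
    "land_ids": [[1, 2], [1], [2, 3, 4], [], [3, 4]],
})
land_df = pd.DataFrame({
    "land_id": [1, 2, 3, 4],
    "land_desc": ["germany", "austria", "switzerland", "italy"],
})

# Lookup dict: land_id -> land_desc.
land_map = land_df.set_index("land_id")["land_desc"].to_dict()

# One row per (original row index, land_id); an empty list becomes a single NaN.
exploded = prod_df["land_ids"].explode()

# Look up each id; NaN and unknown ids map to None and are dropped.
names = exploded.map(land_map.get).dropna()

# Join the names belonging to the same original row.
joined = names.groupby(level=0).agg(" ".join)

# Rows that had no names (empty list) are missing from joined; fill them with "".
prod_df["list_land"] = joined.reindex(prod_df.index, fill_value="")

print(prod_df)
```

Replacing `" ".join` with `list` in the agg call should give lists instead of strings, although the fill value for empty rows would then need separate handling.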
