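Convert array of strings into arrays of integers in a dataframe column
I'm trying to convert arrays of strings in a dataframe column into arrays of integers, where each integer is the ID associated with the string.
I need this because I have to map each list of home rooms to the corresponding ids, as shown below.
This is the JSON I have to map: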
[
{
"id": 1,
"name": "dining room"
},
{
"id": 2,
"name": "living room"
},
{
"id": 3,
"name": "guest room"
},
{
"id": 4,
"name": "bathroom"
},
{
"id": 5,
"name": "game room"
},
{
"id": 6,
"name": "kitchen"
},
{
"id": 7,
"name": "storage room"
},
{
"id": 8,
"name": "bedroom"
},
{
"id": 9,
"name": "family room"
}
]
This is the dataframe I have:
index home_rooms
0 [dining room, living room, bathroom]
1 [guest room, kitchen, game room]
2 [storage room, family room, bedroom]
3 [dining room, living room, bathroom]
4 [guest room, kitchen, game room]
5 [storage room, family room, bedroom]
6 [dining room, living room, bathroom]
7 [guest room, kitchen, game room]
8 [storage room, family room, bedroom]
This is the dataframe I need:
index home_rooms
0 [1, 2, 4]
1 [3, 6, 5]
2 [7, 9, 8]
3 [1, 2, 4]
4 [3, 6, 5]
5 [7, 9, 8]
6 [1, 2, 4]
7 [3, 6, 5]
8 [7, 9, 8]
Is there a way to do this?
Thanks in advance.
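Let's call the JSON string l_str. Load it into a dataframe as df_map. From df_map, construct a dictionary d with the structure name: id. Then use itemgetter and a list comprehension to build the list of ids for each index.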
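from operator import itemgetter
import pandas as pd

df_map = pd.read_json(l_str)
# Lookup dictionary of the form {name: id}
d = dict(zip(df_map.name, df_map.id))
# itemgetter(*x)(d) fetches the ids for all names in x (a tuple when x has 2+ names)
df['home_rooms'] = [list(itemgetter(*x)(d)) for x in df.home_rooms]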
Out[415]:
index home_rooms
0 0 [1, 2, 4]
1 1 [3, 6, 5]
2 2 [7, 9, 8]
3 3 [1, 2, 4]
4 4 [3, 6, 5]
5 5 [7, 9, 8]
6 6 [1, 2, 4]
7 7 [3, 6, 5]
8 8 [7, 9, 8]
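One caveat with the itemgetter approach: itemgetter called with a single key returns a bare value rather than a tuple, so list(...) fails for a row that contains only one room, and itemgetter with no keys raises an error for an empty list. A minimal sketch of a variant that looks up each name individually is shown below. The sample data is an assumption: a shortened version of the question's JSON and dataframe, with one single-room row added for illustration.

```python
import json
import pandas as pd

# Assumed sample data, shortened from the question; the second row has a single room.
l_str = (
    '[{"id": 1, "name": "dining room"}, {"id": 2, "name": "living room"}, '
    '{"id": 3, "name": "guest room"}, {"id": 4, "name": "bathroom"}]'
)
df = pd.DataFrame(
    {"home_rooms": [["dining room", "living room", "bathroom"], ["guest room"]]}
)

# Build the {name: id} lookup directly from the parsed JSON.
d = {room["name"]: room["id"] for room in json.loads(l_str)}

# Look up every name on its own, so rows of any length come back as lists.
df["home_rooms"] = [[d[name] for name in rooms] for rooms in df["home_rooms"]]
print(df)
```

A name that is missing from the dictionary raises a KeyError here, which makes misspelled room names easy to spot.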
Try:
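import pandas as pd

# Series indexed by room name whose values are the ids
mapper = pd.read_json(jsonstr).set_index('name')['id']
# One row per room; replace() fixes the 'dinig room' typo if it is present in the data
df_out = df.explode('home_rooms').replace('dinig room', 'dining room')
df_out['home_rooms'] = df_out['home_rooms'].map(mapper)
df_out.groupby('index').agg(list).reset_index()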
Output:
index home_rooms
0 0 [1, 2, 4]
1 1 [3, 6, 5]
2 2 [7, 9, 8]
3 3 [1, 2, 4]
4 4 [3, 6, 5]
5 5 [7, 9, 8]
6 6 [1, 2, 4]
7 7 [3, 6, 5]
8 8 [7, 9, 8]
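For reference, here is a self-contained sketch of the explode / map / groupby approach. The sample data is an assumption (a shortened version of the question's JSON and a two-row dataframe with an explicit index column). The JSON string is wrapped in StringIO because newer pandas versions deprecate passing a literal JSON string to read_json.

```python
from io import StringIO
import pandas as pd

# Assumed sample data, shortened from the question.
jsonstr = (
    '[{"id": 1, "name": "dining room"}, {"id": 2, "name": "living room"}, '
    '{"id": 3, "name": "guest room"}]'
)
df = pd.DataFrame(
    {
        "index": [0, 1],
        "home_rooms": [["dining room", "living room"], ["guest room", "dining room"]],
    }
)

# Series indexed by room name, with the ids as values.
mapper = pd.read_json(StringIO(jsonstr)).set_index("name")["id"]

# One row per (index, room) pair, then translate each room name to its id.
df_out = df.explode("home_rooms")
df_out["home_rooms"] = df_out["home_rooms"].map(mapper)

# Collect the ids back into one list per original row.
result = df_out.groupby("index")["home_rooms"].agg(list).reset_index()
print(result)
```

Unlike a dictionary lookup, map does not raise on an unknown room name; it produces NaN for that entry, so it may be worth checking the mapped column for missing values before grouping.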