Convert arrays of strings into arrays of integers in a dataframe column
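I am trying to convert the arrays of strings in a dataframe column into arrays of integers, where each integer is the ID associated with the string.
I need this because I have to map each list of home rooms to the corresponding IDs.
This is the JSON I have to map: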
[
  {
    "id": 1,
    "name": "dining room"
  },
  {
    "id": 2,
    "name": "living room"
  },
  {
    "id": 3,
    "name": "guest room"
  },
  {
    "id": 4,
    "name": "bathroom"
  },
  {
    "id": 5,
    "name": "game room"
  },
  {
    "id": 6,
    "name": "kitchen"
  },
  {
    "id": 7,
    "name": "storage room"
  },
  {
    "id": 8,
    "name": "bedroom"
  },
  {
    "id": 9,
    "name": "family room"
  }
]
This is the dataframe I have:
index home_rooms
0 [dining room, living room, bathroom]
1 [guest room, kitchen, game room]
2 [storage room, family room, bedroom]
3 [dining room, living room, bathroom]
4 [guest room, kitchen, game room]
5 [storage room, family room, bedroom]
6 [dining room, living room, bathroom]
7 [guest room, kitchen, game room]
8 [storage room, family room, bedroom]
This is the dataframe I need:
index home_rooms
0 [1, 2, 4]
1 [3, 6, 5]
2 [7, 9, 8]
3 [1, 2, 4]
4 [3, 6, 5]
5 [7, 9, 8]
6 [1, 2, 4]
7 [3, 6, 5]
8 [7, 9, 8]
Is there any way to do this?
Thanks in advance.
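Let's call the JSON string l_str. Load it into a dataframe as df_map. From df_map, construct a dictionary d with the structure name: id. Then use itemgetter and a list comprehension to build the list of ids for each index.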
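from io import StringIO
from operator import itemgetter
import pandas as pd

df_map = pd.read_json(StringIO(l_str))  # l_str must be valid JSON (no trailing commas)
d = dict(zip(df_map.name, df_map.id))   # {'dining room': 1, 'living room': 2, ...}
# itemgetter with a single key returns a bare value, so every row needs at least two names
df['home_rooms'] = [list(itemgetter(*x)(d)) for x in df.home_rooms]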
Out[415]:
index home_rooms
0 0 [1, 2, 4]
1 1 [3, 6, 5]
2 2 [7, 9, 8]
3 3 [1, 2, 4]
4 4 [3, 6, 5]
5 5 [7, 9, 8]
6 6 [1, 2, 4]
7 7 [3, 6, 5]
8 8 [7, 9, 8]
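For anyone who wants to run the idea end to end, here is a minimal self-contained sketch of the same name-to-id dictionary lookup. The variable names rooms_json and name_to_id are made up for this illustration, and only the first three rows of the question's dataframe are included. It uses a nested list comprehension instead of itemgetter, so rows with a single room name also work. A room name that is missing from the JSON raises a KeyError.

```python
import json

import pandas as pd

# The JSON from the question, written as valid JSON (no trailing commas).
rooms_json = """
[
  {"id": 1, "name": "dining room"},
  {"id": 2, "name": "living room"},
  {"id": 3, "name": "guest room"},
  {"id": 4, "name": "bathroom"},
  {"id": 5, "name": "game room"},
  {"id": 6, "name": "kitchen"},
  {"id": 7, "name": "storage room"},
  {"id": 8, "name": "bedroom"},
  {"id": 9, "name": "family room"}
]
"""

# Build a plain dict that maps each room name to its id.
name_to_id = {room["name"]: room["id"] for room in json.loads(rooms_json)}

# First three rows of the dataframe from the question.
df = pd.DataFrame(
    {
        "home_rooms": [
            ["dining room", "living room", "bathroom"],
            ["guest room", "kitchen", "game room"],
            ["storage room", "family room", "bedroom"],
        ]
    }
)

# Replace every name in every row's list with its id.
df["home_rooms"] = [
    [name_to_id[name] for name in rooms] for rooms in df["home_rooms"]
]

print(df)
```

If unknown names should be tolerated rather than raise, name_to_id.get(name) could be used in place of name_to_id[name]; it returns None for names that are not in the dictionary.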
Try:
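from io import StringIO
import pandas as pd

# jsonstr is the JSON string from the question; mapper is a Series indexed by room name
mapper = pd.read_json(StringIO(jsonstr)).set_index('name')['id']
df_out = df.explode('home_rooms').replace('dinig room', 'dining room')  # fix typo with replace
df_out['home_rooms'] = df_out['home_rooms'].map(mapper)
df_out.groupby('index').agg(list).reset_index()  # assumes df has an 'index' column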
Output:
index home_rooms
0 0 [1, 2, 4]
1 1 [3, 6, 5]
2 2 [7, 9, 8]
3 3 [1, 2, 4]
4 4 [3, 6, 5]
5 5 [7, 9, 8]
6 6 [1, 2, 4]
7 7 [3, 6, 5]
8 8 [7, 9, 8]
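A self-contained sketch of the same explode / map / groupby pattern, in case it helps to experiment with it. It is a variation, not the exact code above: the mapping is written out by hand as a Series called name_to_id (a made-up name) instead of being read from JSON, only three sample rows are used, and the grouping is done on the dataframe's own index with groupby(level=0), so no separate 'index' column is needed.

```python
import pandas as pd

# Hand-written name -> id mapping, equivalent to the JSON in the question.
name_to_id = pd.Series(
    {
        "dining room": 1,
        "living room": 2,
        "guest room": 3,
        "bathroom": 4,
        "game room": 5,
        "kitchen": 6,
        "storage room": 7,
        "bedroom": 8,
        "family room": 9,
    }
)

# First three rows of the dataframe from the question.
df = pd.DataFrame(
    {
        "home_rooms": [
            ["dining room", "living room", "bathroom"],
            ["guest room", "kitchen", "game room"],
            ["storage room", "family room", "bedroom"],
        ]
    }
)

# One row per room name; the original row label is repeated in the index.
exploded = df["home_rooms"].explode()

# Look up each name in the mapping; names not found would become NaN.
ids = exploded.map(name_to_id)

# Collect the ids back into one list per original row.
result = ids.groupby(level=0).agg(list)

print(result)
```

One thing to watch with this approach: if any name is missing from the mapping, map produces NaN and the whole column is converted to floats, so the lists would contain values such as 1.0 instead of 1.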