Polars equivalent of pandas set_index() to_dict

Question

Suppose I have a Polars dataframe similar to this:

import polars as pl
df = pl.DataFrame({'index': [1,2,3,2,1],
                   'object': [1, 1, 1, 2, 2],
                   'period': [1, 2, 4, 4, 23],
                   'value': [24, 67, 89, 5, 23]})


How do I do the following in Polars? It is easy enough in pandas:
In [2]: df.to_pandas().groupby("index").last().transpose().to_dict()
Out[2]: 
{1: {'object': 2, 'period': 23, 'value': 23},
 2: {'object': 2, 'period': 4, 'value': 5},
 3: {'object': 1, 'period': 4, 'value': 89}}

Answer 1

The algorithm

Polars has no concept of an index, but we can get the same result by using partition_by.

{
    index: frame.select(pl.exclude('index')).to_dicts()[0]
    for index, frame in
        (
            df
            .unique(subset=['index'], keep='last')
            .partition_by(groups=["index"],
                          as_dict=True,
                          maintain_order=True)
        ).items()
}

{1: {'object': 2, 'period': 23, 'value': 23},
2: {'object': 2, 'period': 4, 'value': 5},
3: {'object': 1, 'period': 4, 'value': 89}}

In steps

The heart of the algorithm is partition_by with as_dict=True.

(
    df
    .unique(subset=['index'], keep='last')
    .partition_by(groups=["index"],
                  as_dict=True,
                  maintain_order=True)
)

{1: shape: (1, 4)
┌───────┬────────┬────────┬───────┐
│ index ┆ object ┆ period ┆ value │
│ ---   ┆ ---    ┆ ---    ┆ ---   │
│ i64   ┆ i64    ┆ i64    ┆ i64   │
╞═══════╪════════╪════════╪═══════╡
│ 1     ┆ 2      ┆ 23     ┆ 23    │
└───────┴────────┴────────┴───────┘,
2: shape: (1, 4)
┌───────┬────────┬────────┬───────┐
│ index ┆ object ┆ period ┆ value │
│ ---   ┆ ---    ┆ ---    ┆ ---   │
│ i64   ┆ i64    ┆ i64    ┆ i64   │
╞═══════╪════════╪════════╪═══════╡
│ 2     ┆ 2      ┆ 4      ┆ 5     │
└───────┴────────┴────────┴───────┘,
3: shape: (1, 4)
┌───────┬────────┬────────┬───────┐
│ index ┆ object ┆ period ┆ value │
│ ---   ┆ ---    ┆ ---    ┆ ---   │
│ i64   ┆ i64    ┆ i64    ┆ i64   │
╞═══════╪════════╪════════╪═══════╡
│ 3     ┆ 1      ┆ 4      ┆ 89    │
└───────┴────────┴────────┴───────┘}

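This creates a dictionary whose keys are the index values and whose values are the one-row sub-dataframes associated with each index value.

From this dictionary, we can build the nested dictionaries with a Python dictionary comprehension: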

{
    index: frame.to_dicts()
    for index, frame in
        (
            df
            .unique(subset=['index'], keep='last')
            .partition_by(groups=["index"],
                          as_dict=True,
                          maintain_order=True)
        ).items()
}

{1: [{'index': 1, 'object': 2, 'period': 23, 'value': 23}],
2: [{'index': 2, 'object': 2, 'period': 4, 'value': 5}],
3: [{'index': 3, 'object': 1, 'period': 4, 'value': 89}]}

All that remains is to tidy up the output so that index does not appear in the nested dictionaries, and to get rid of the unneeded list.

{
    index: frame.select(pl.exclude('index')).to_dicts()[0]
    for index, frame in
        (
            df
            .unique(subset=['index'], keep='last')
            .partition_by(groups=["index"],
                          as_dict=True,
                          maintain_order=True)
        ).items()
}

{1: {'object': 2, 'period': 23, 'value': 23},
2: {'object': 2, 'period': 4, 'value': 5},
3: {'object': 1, 'period': 4, 'value': 89}}
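
As a cross-check on the result above, the same mapping can also be built by walking the rows in order and letting a later row overwrite an earlier one with the same index value. This is only a sketch: last_row_per_key is a hypothetical helper name (not part of Polars), and the rows list is what df.to_dicts() returns for the frame in the question, written out literally so the snippet runs on its own.

```python
def last_row_per_key(rows, key):
    """Map each value of `key` to the last row carrying it, without `key` itself."""
    result = {}
    for row in rows:
        row = dict(row)          # copy so the caller's rows are not mutated
        key_value = row.pop(key) # drop the key column from the nested dict
        result[key_value] = row  # a later row replaces an earlier one
    return result


# Assumption: this is the output of df.to_dicts() for the question's frame.
rows = [
    {"index": 1, "object": 1, "period": 1, "value": 24},
    {"index": 2, "object": 1, "period": 2, "value": 67},
    {"index": 3, "object": 1, "period": 4, "value": 89},
    {"index": 2, "object": 2, "period": 4, "value": 5},
    {"index": 1, "object": 2, "period": 23, "value": 23},
]

nested = last_row_per_key(rows, "index")
print(nested)
```

This does everything in plain Python, so it will likely be slower than partition_by on large frames, but it does not depend on the partition_by signature.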

Answer 2

So if we have this dict():

out = df.to_dict()

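def create_dict_from_pls(data_in, idx_key):
    # data_in maps column name -> column values (the result of df.to_dict())
    out = {}
    for item in range(len(data_in[idx_key])):
        # a later row with the same index value replaces the earlier one
        out[data_in[idx_key][item]] = {}
        for key in data_in:
            out[data_in[idx_key][item]][key] = data_in[key][item]
    return out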



In [1]: create_dict_from_pls(out, "index")
Out[1]: 
{1: {'index': 1, 'object': 2, 'period': 23, 'value': 23},
 2: {'index': 2, 'object': 2, 'period': 4, 'value': 5},
 3: {'index': 3, 'object': 1, 'period': 4, 'value': 89}}
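
Note that the output above still carries 'index' inside each nested dictionary, unlike the pandas result in the question. A minimal sketch of a variant that leaves it out is shown here; nested_from_columns is a hypothetical name, and the columns dict is assumed to be what df.to_dict(as_series=False) returns for the question's frame, written out literally so the snippet runs on its own.

```python
def nested_from_columns(columns, idx_key):
    """Build {index value: {other column: value}} from a dict of column lists."""
    value_keys = [k for k in columns if k != idx_key]  # every column except the key
    out = {}
    for i, idx in enumerate(columns[idx_key]):
        # a later row with the same index value replaces the earlier one
        out[idx] = {k: columns[k][i] for k in value_keys}
    return out


# Assumption: this is the output of df.to_dict(as_series=False).
columns = {
    "index": [1, 2, 3, 2, 1],
    "object": [1, 1, 1, 2, 2],
    "period": [1, 2, 4, 4, 23],
    "value": [24, 67, 89, 5, 23],
}

nested = nested_from_columns(columns, "index")
print(nested)
```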

Answer 1
1 Accepted 2022-07-27 15:00:20

Answer 2
0 2022-07-27 18:32:25
