Determining the best parameter combination with pandas

Question

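I ran a parameter study (image compression) that takes three parameters (x1, x2, x3) and produces a result y (the compression ratio) for each of 50 files. Now I am trying to find out which parameter combination gives the smallest average compression ratio across all files. I could iterate over all parameter combinations with a Python for loop and store the best result (as in the minimal example below). However, I suspect the pandas API offers a more efficient and more concise solution.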

import pandas as pd


df = pd.DataFrame({
    "result": [4, 3, 2, 1],
    "parameter": [1, 0, 1, 0],
    "file": ["A", "A", "B", "B"]
})

min_result = (df["result"][0], None)  # Choosing the first value as starting point
for parameter in [0, 1]:  # Iterating over [0, 1]
    result = df[df["parameter"] == parameter]["result"].mean()  # Mean value of all files
    if result <= min_result[0]:  # Choosing the smallest result
        min_result = (result, parameter)

print(min_result)  # >>> (2.0, 0)

Answer 1

It looks like you want a simple GroupBy.mean:

out = df.groupby('parameter')['result'].mean()

Note: if the parameters are spread over several columns, group by all of them: groupby(['col1', 'col2', ...])

Output:

parameter
0    2.0
1    3.0
Name: result, dtype: float64

To get the minimum:

idx = out.idxmin()
min_result = (out[idx], idx)

Output: (2.0, 0)
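
For the three-parameter case described in the question, the same idea might look like the sketch below. The column names x1, x2, x3 and y follow the question's wording, but the data values are made up purely for illustration. With several grouping columns the result has a MultiIndex, so idxmin returns the winning combination as a tuple.

```python
import pandas as pd

# Hypothetical data: three parameter columns and a result y for two files.
# The values are invented for illustration only.
df = pd.DataFrame({
    "x1": [0, 0, 1, 1, 0, 0, 1, 1],
    "x2": [0, 1, 0, 1, 0, 1, 0, 1],
    "x3": [5, 5, 5, 5, 5, 5, 5, 5],
    "file": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "y": [4.0, 3.0, 2.0, 5.0, 2.0, 3.0, 1.0, 6.0],
})

# Mean of y over all files for each (x1, x2, x3) combination.
mean_y = df.groupby(["x1", "x2", "x3"])["y"].mean()

# idxmin on a MultiIndex-ed Series returns the index label as a tuple.
best_params = mean_y.idxmin()
best_mean = mean_y.loc[best_params]

print(best_params, best_mean)
```

If a ranking of all combinations is wanted rather than only the best one, mean_y.sort_values() gives the means in ascending order.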

Solution 1
3, accepted, 2022-07-05 15:28:01