
Python Pandas - Iterate over unique columns

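I am trying to iterate over a list of unique column values to create three different keys in a dictionary, each holding a list of dictionaries. This is the code I have now: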

import pandas as pd

dataDict = {}
metrics = frontendFrame['METRIC'].unique()

for metric in metrics:
    dataDict[metric] = frontendFrame[frontendFrame['METRIC'] == metric].to_dict('records')

print(dataDict)

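This works fine for small amounts of data, but as the amount of data grows it can take nearly a second (!!!!).

I have tried groupby in pandas, which is even slower. I have also tried map, but I don't want the result returned as a list. How can I iterate and build what I want in a faster way? I am using Python 3.6.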

Update:

Input:

    DATETIME             METRIC  ANOMALY           VALUE
0   2018-02-27 17:30:32  SCORE      2.0                    -1.0
1   2018-02-27 17:30:32  VALUE      NaN                     0.0
2   2018-02-27 17:30:32  INDEX      NaN  6.6613381477499995E-16
3   2018-02-27 17:31:30  SCORE      2.0                    -1.0
4   2018-02-27 17:31:30  VALUE      NaN                     0.0
5   2018-02-27 17:31:30  INDEX      NaN  6.6613381477499995E-16
6   2018-02-27 17:32:30  SCORE      2.0                    -1.0
7   2018-02-27 17:32:30  VALUE      NaN                     0.0
8   2018-02-27 17:32:30  INDEX      NaN  6.6613381477499995E-16

Output:

{
  "INDEX": [
    {
      "DATETIME": 1519759710000,
      "METRIC": "INDEX",
      "ANOMALY": null,
      "VALUE": "6.6613381477499995E-16"
    },
    {
      "DATETIME": 1519759770000,
      "METRIC": "INDEX",
      "ANOMALY": null,
      "VALUE": "6.6613381477499995E-16"
    }
  ],
  "SCORE": [
    {
      "DATETIME": 1519759710000,
      "METRIC": "SCORE",
      "ANOMALY": 2,
      "VALUE": "-1.0"
    },
    {
      "DATETIME": 1519759770000,
      "METRIC": "SCORE",
      "ANOMALY": 2,
      "VALUE": "-1.0"
    }
  ],
  "VALUE": [
    {
      "DATETIME": 1519759710000,
      "METRIC": "VALUE",
      "ANOMALY": null,
      "VALUE": "0.0"
    },
    {
      "DATETIME": 1519759770000,
      "METRIC": "VALUE",
      "ANOMALY": null,
      "VALUE": "0.0"
    }
  ]
}

One possible solution:

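from collections import defaultdict

a = defaultdict(list)
# The dict comprehension is used only for its side effect of appending to `a`; its result is discarded.
_ = {x['METRIC']: a[x['METRIC']].append(x) for x in frontendFrame.to_dict('records')}
a = dict(a)

The same idea written as a plain loop:

from collections import defaultdict

a = defaultdict(list)
# Convert the frame to row dicts once, then bucket each row under its METRIC value.
for x in frontendFrame.to_dict('records'):
    a[x['METRIC']].append(x)
a = dict(a)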
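To try the loop above without the original data, here is a minimal self-contained sketch. The sample frame is an assumption modeled on the first six rows of the input table in the question, and `group_records` is a hypothetical helper name, not something from the original post.

```python
from collections import defaultdict

import pandas as pd

# Hypothetical sample data modeled on the question's input table.
frontendFrame = pd.DataFrame({
    "DATETIME": pd.to_datetime(
        ["2018-02-27 17:30:32"] * 3 + ["2018-02-27 17:31:30"] * 3
    ),
    "METRIC": ["SCORE", "VALUE", "INDEX"] * 2,
    "ANOMALY": [2.0, None, None] * 2,
    "VALUE": [-1.0, 0.0, 6.6613381477499995e-16] * 2,
})


def group_records(frame, key):
    """Group the row dicts of `frame` by the value found in column `key`."""
    grouped = defaultdict(list)
    # to_dict("records") runs once for the whole frame; the loop then
    # buckets each row dict in a single pass, with no per-metric filtering.
    for record in frame.to_dict("records"):
        grouped[record[key]].append(record)
    return dict(grouped)


dataDict = group_records(frontendFrame, "METRIC")
print(sorted(dataDict))        # ['INDEX', 'SCORE', 'VALUE']
print(len(dataDict["SCORE"]))  # 2
```

The likely reason this beats the code in the question is that the frame is converted to records once, instead of being filtered with a boolean mask and converted once per unique metric.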

Slow:

dataDict = frontendFrame.groupby('METRIC').apply(lambda x: x.to_dict('records')).to_dict()
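For comparison, a sketch of a groupby variant that iterates over the groups directly instead of going through apply. It is not from the original answer, and the original post gives no measurements for it, so the timing lines are there for you to run on your own data. The sample frame and the function names `by_groupby` and `by_mask` are assumptions for illustration.

```python
import timeit

import pandas as pd

# Hypothetical sample data modeled on the question's input table.
frontendFrame = pd.DataFrame({
    "DATETIME": ["2018-02-27 17:30:32"] * 3 + ["2018-02-27 17:31:30"] * 3,
    "METRIC": ["SCORE", "VALUE", "INDEX"] * 2,
    "VALUE": [-1.0, 0.0, 6.6613381477499995e-16] * 2,
})


def by_groupby(frame):
    # Iterating over a GroupBy object yields (key, sub-frame) pairs.
    return {
        metric: group.to_dict("records")
        for metric, group in frame.groupby("METRIC")
    }


def by_mask(frame):
    # The approach from the question: one boolean-mask filter per unique metric.
    return {
        metric: frame[frame["METRIC"] == metric].to_dict("records")
        for metric in frame["METRIC"].unique()
    }


result = by_groupby(frontendFrame)

# Rough timings; the numbers depend on data size and pandas version.
print("groupby:", timeit.timeit(lambda: by_groupby(frontendFrame), number=100))
print("mask:   ", timeit.timeit(lambda: by_mask(frontendFrame), number=100))
```

Both functions produce the same grouping; only the way the sub-frames are selected differs.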
