Plotly bubble chart from a pandas crosstab

Question

How can I plot a bubble chart from a DataFrame that was created as a pandas crosstab of another DataFrame?

Imports:

import plotly as py
import plotly.graph_objects as go
from plotly.subplots import make_subplots

The crosstab is created with:

df = pd.crosstab(raw_data['Speed'], raw_data['Height'].fillna('n/a'))

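df mostly contains zeros, but wherever a number appears I want a point whose size is controlled by that value. I want the index values on the x axis and the column names on the y axis.

df looks like this: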

         10    20    30    40    50
1000      0     0     0     0     5
1100      0     0     0     7     0
1200      1     0     3     0     0
1300      0     0     0     0     0
1400      5     0     0     0     0

I tried using scatter and Scatter like this:

fig.add_trace(go.Scatter(x=df.index.values, y=df.columns.values, size=df.values,
                         mode='lines'),
              row=1, col=3)

This returned a TypeError: 'module' object is not callable.
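For context, Python raises this error whenever a module object is called as if it were a function or class. The question does not show which call triggered it, so the sketch below only reproduces the message with the standard-library `math` module; one possible Plotly equivalent would be calling `go(...)` or `py(...)` instead of a class inside the module such as `go.Scatter(...)`.

```python
import math


def call_module():
    # Calling the module object itself (rather than something inside it)
    # fails, because modules are not callable.
    try:
        math(2)  # deliberate mistake; math.sqrt(2) would be fine
    except TypeError as exc:
        return str(exc)


message = call_module()
print(message)
```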

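Any help is really appreciated. Thanks.

Update

The answer below is close to what I ended up with; the main difference is that I referenced "Speed" in the melt line: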

# each step returns a new DataFrame, so assign the result back
df = df.reset_index()
df = df.melt(id_vars="Speed")
df = df.rename(columns={"variable":"Height",
                        "value":"Count"})
df = df[df["Count"] != 0]  # keep only the non-zero counts

scale=1000

fig.add_trace(go.Scatter(x=df["Speed"], y=df["Height"],mode='markers',marker_size=df["Count"]/scale),
              row=1, col=3)

This works, but my main problem now is that the dataset is large and Plotly really struggles to handle it.

Update 2

Using Scattergl lets Plotly handle the large dataset well!

Answer 1

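I suggest using the tidy format to represent your data. A data frame is tidy if and only if

Each row is an observation
Each column is a variable
Each value has its own cell

To create a tidier data frame, you can do the following: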

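df = pd.crosstab(raw_data["Speed"], raw_data["Height"])
df = df.reset_index()
# melt the Height columns into rows, then keep only the non-zero counts
df = df.melt(id_vars="Speed", var_name="Height", value_name="Counts")
df = df[df["Counts"] > 0].sort_values(["Speed", "Height"]).reset_index(drop=True)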

   Speed  Height  Counts
0   1000      10       2
1   1100      20       1
2   1200      10       1
3   1200      30       1
4   1300      40       1
5   1400      50       1
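To check the reshaping end to end, here is a self-contained sketch. The `raw_data` frame is hypothetical, with rows chosen only so that the counts match the tidy table above; the names `wide` and `tidy` are likewise just for illustration.

```python
import pandas as pd

# Hypothetical raw data, chosen so the counts match the table above.
raw_data = pd.DataFrame({
    "Speed":  [1000, 1000, 1100, 1200, 1200, 1300, 1400],
    "Height": [10,   10,   20,   10,   30,   40,   50],
})

# Wide form: one row per Speed, one column per Height.
wide = pd.crosstab(raw_data["Speed"], raw_data["Height"])

# Long (tidy) form: one row per (Speed, Height) pair.
tidy = (
    wide.reset_index()
        .melt(id_vars="Speed", var_name="Height", value_name="Counts")
)
# Drop empty cells and sort so the rows follow Speed, then Height.
tidy = (
    tidy[tidy["Counts"] > 0]
        .sort_values(["Speed", "Height"])
        .reset_index(drop=True)
)
print(tidy)
```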

The next step is the actual plotting.

import plotly.graph_objects as go

# increasing scale makes the bubbles larger
scale = 10
# create the scatter plot
scatter = go.Scatter(
    x=df.Speed,
    y=df.Height,
    marker_size=df.Counts*scale,  # the column is named "Counts"
    mode='markers')
fig = go.Figure(scatter)
fig.show()

This creates a plot like the one shown below.

Answer 2

If that is the case, you can use plotly.express. This is very similar to @Erik's answer, but it should not return an error.

import pandas as pd
import plotly.express as px
from io import StringIO

txt = """
        10    20    30    40    50
1000     0     0    0      0     5
1100     0     0    0      7     0
1200     1     0    3      0     0
1300     0     0    0      0     0
1400     5     0    0      0     0
"""

df = pd.read_csv(StringIO(txt), sep=r"\s+")  # delim_whitespace is deprecated in pandas 2.2+

df = df.reset_index()\
       .melt(id_vars="index")\
       .rename(columns={"index":"Speed",
                        "variable":"Height",
                        "value":"Count"})

fig = px.scatter(df, x="Speed", y="Height",size="Count")
fig.show()
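One detail worth checking in the snippet above: the heights are read in as column headers, so they end up as strings after `melt`, and cells with a zero count are kept as rows. A possible follow-up, assuming a numeric y axis and only non-zero bubbles are wanted, is sketched here with the same sample text.

```python
from io import StringIO

import pandas as pd

txt = """
        10    20    30    40    50
1000     0     0    0      0     5
1100     0     0    0      7     0
1200     1     0    3      0     0
1300     0     0    0      0     0
1400     5     0    0      0     0
"""

df = (
    pd.read_csv(StringIO(txt), sep=r"\s+")
      .reset_index()
      .melt(id_vars="index")
      .rename(columns={"index": "Speed", "variable": "Height", "value": "Count"})
)
df["Height"] = df["Height"].astype(int)          # column labels arrive as strings
df = df[df["Count"] > 0].reset_index(drop=True)  # keep only the non-zero cells
print(df)
```

The resulting frame can be passed to `px.scatter(df, x="Speed", y="Height", size="Count")` exactly as before.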

Update: if you get an error, check your pandas version with pd.__version__ and try running the code line by line:

df = pd.read_csv(StringIO(txt), sep=r"\s+")

df = df.reset_index()

df = df.melt(id_vars="index")

df = df.rename(columns={"index":"Speed",
                        "variable":"Height",
                        "value":"Count"})

and report which line it breaks on.
