
Plotly: How to insert a categorical variable into a parallel coordinates plot?

So far, I have tried this:

import pandas as pd
import plotly.graph_objects as go

df = pd.read_csv('https://raw.githubusercontent.com/vyaduvanshi/helper-files/master/parallel_coordinates.csv')

dimensions = list([dict(range=[df['gm_Retail & Recreation'].min(),df['gm_Retail & Recreation'].max()],
                        label='Retail & Recreation', values=df['gm_Retail & Recreation']),
                  dict(range=[df['gm_Grocery & Pharmacy'].min(),df['gm_Grocery & Pharmacy'].max()],
                       label='Grocery & Pharmacy', values=df['gm_Grocery & Pharmacy']),
                  dict(range=[df['gm_Parks'].min(),df['gm_Parks'].max()],
                       label='Parks', values=df['gm_Parks']),
                  dict(range=[df['gm_Transit Stations'].min(),df['gm_Transit Stations'].max()],
                       label='Transit Stations', values=df['gm_Transit Stations']),
                  dict(range=[df['gm_Workplaces'].min(),df['gm_Workplaces'].max()],
                       label='Workplaces', values=df['gm_Workplaces']),
                  dict(range=[df['gm_Residential'].min(),df['gm_Residential'].max()],
                       label='Residential', values=df['gm_Residential']),])
#                   dict(range=[0,len(df)], values=df['country'],
#                       label='Country')])

fig = go.Figure(data=go.Parcoords(line = dict(color = '#ff0000',
                   colorscale = 'Electric',
                   showscale = True,
                   cmin = -4000,
                   cmax = -100), dimensions=dimensions))
fig.show()

It returns this:

[Image: the resulting parallel coordinates plot]

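What I want to do is tie these lines to a final column, the country column (categorical). My attempt is commented out in the code snippet. I am trying to work out how to link these values to the categorical countries. Could indexing be one way? I would also like to color-code the lines by country; I guess a list of distinct colors could help. I'm stuck and could use some help.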

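In your case, you can do this by letting a dummy variable represent each unique element of df['country']. Your dataset is in long format, so the same dummy value will repeat across rows. Don't worry, though: the code below takes care of that. You can then specify the last dimension as: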

dict(range=[0,df['dummy'].max()],
                   tickvals = dfg['dummy'], ticktext = dfg['country'],
                   label='Country', values=df['dummy']),
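As an alternative sketch, the integer codes can also be produced with pd.factorize instead of a lookup table plus merge. The toy DataFrame and its country names below are made up for illustration; only the 'country' and 'dummy' column names follow the answer.

```python
import pandas as pd

# Toy long-format data; the country names are made up for illustration.
toy = pd.DataFrame({'country': ['India', 'Brazil', 'India', 'Japan', 'Brazil'],
                    'gm_Parks': [-10, 5, -20, 0, 15]})

# codes: one integer per row; uniques: the label behind each integer,
# in order of first appearance.
codes, uniques = pd.factorize(toy['country'])
toy['dummy'] = codes

# Same shape as the 'Country' dimension above, built from the factorized codes.
country_dim = dict(range=[0, toy['dummy'].max()],
                   tickvals=list(range(len(uniques))),
                   ticktext=list(uniques),
                   label='Country', values=toy['dummy'])
```

Repeated countries share one code, so a long-format table needs no extra deduplication step.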

Finally, assign a colorscale to the lines, for example:

line = dict(color = df['dummy'],
                   colorscale = [[0,'rgba(200,0,0,0.1)'],[0.5,'rgba(0,200,0,0.1)'],[1,'rgba(0,0,200,0.1)']])
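The three-stop colorscale above blends smoothly between its colors, so countries with neighboring codes get similar hues. If one flat color per country is preferred, a stepped colorscale may work better. The following is a minimal sketch: discrete_colorscale is a hypothetical helper, not part of Plotly, and the three-color palette is an assumption.

```python
def discrete_colorscale(colors):
    """Build a stepped Plotly-style colorscale: one flat band per color."""
    n = len(colors)
    scale = []
    for i, color in enumerate(colors):
        # Repeat the color at both ends of its band so there is no gradient.
        scale.append([i / n, color])
        scale.append([(i + 1) / n, color])
    return scale


# Hypothetical palette for three countries (one entry per country is needed).
palette = ['rgba(200,0,0,0.3)', 'rgba(0,200,0,0.3)', 'rgba(0,0,200,0.3)']
colorscale = discrete_colorscale(palette)
n = len(palette)

# With cmin=-0.5 and cmax=n-0.5, integer code k lands in the middle of band k.
line = dict(color=[0, 1, 2, 1, 0], colorscale=colorscale,
            cmin=-0.5, cmax=n - 0.5)
```

The resulting dict is meant to be passed as the line argument of go.Parcoords, with color replaced by df['dummy'] and the palette extended to the real number of countries.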

Plot:

[Image: parallel coordinates plot with a Country axis and lines colored by country]

Complete code:

import pandas as pd
import plotly.graph_objects as go

df = pd.read_csv('https://raw.githubusercontent.com/vyaduvanshi/helper-files/master/parallel_coordinates.csv')
# Map each unique country to an integer code (the "dummy" variable)
dfg = pd.DataFrame({'country': df['country'].unique()})
dfg['dummy'] = dfg.index
# Attach the code to every row of the long-format data
df = pd.merge(df, dfg, on='country', how='left')


dimensions = list([dict(range=[df['gm_Retail & Recreation'].min(),df['gm_Retail & Recreation'].max()],
                        label='Retail & Recreation', values=df['gm_Retail & Recreation']),
                  dict(range=[df['gm_Grocery & Pharmacy'].min(),df['gm_Grocery & Pharmacy'].max()],
                       label='Grocery & Pharmacy', values=df['gm_Grocery & Pharmacy']),
                  dict(range=[df['gm_Parks'].min(),df['gm_Parks'].max()],
                       label='Parks', values=df['gm_Parks']),
                  dict(range=[df['gm_Transit Stations'].min(),df['gm_Transit Stations'].max()],
                       label='Transit Stations', values=df['gm_Transit Stations']),
                  dict(range=[df['gm_Workplaces'].min(),df['gm_Workplaces'].max()],
                       label='Workplaces', values=df['gm_Workplaces']),
                  dict(range=[df['gm_Residential'].min(),df['gm_Residential'].max()],
                       label='Residential', values=df['gm_Residential']),
                   
                  dict(range=[0,df['dummy'].max()],
                       tickvals = dfg['dummy'], ticktext = dfg['country'],
                       label='Country', values=df['dummy']),
                  
                  ])

fig = go.Figure(data=go.Parcoords(line = dict(color = df['dummy'],
                   colorscale = [[0,'rgba(200,0,0,0.1)'],[0.5,'rgba(0,200,0,0.1)'],[1,'rgba(0,0,200,0.1)']]), dimensions=dimensions))
fig.show()

Use df.infer_objects() to automatically infer the data type of each column.
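A small sketch of what infer_objects does, using a made-up column rather than the question's data: it only tightens object-dtype columns whose values are already Python numbers.

```python
import pandas as pd

# A mixed column forces the generic object dtype.
raw = pd.DataFrame({'A': ['a', 1, 2, 3]})

# After dropping the string row, the remaining values are all integers,
# but the column keeps its object dtype.
trimmed = raw.iloc[1:]
before = trimmed['A'].dtype

# infer_objects re-examines object columns and picks a tighter dtype.
inferred = trimmed.infer_objects()
after = inferred['A'].dtype

print(before, after)
```

It does not parse numeric strings such as '1'; pd.to_numeric is the usual tool for that.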

