Python Plotly: an unfixed number of traces

Question

My code reads data from an .xlsx file and plots a bubble diagram using plotly. The task is easy when I know how many traces need to be plotted. However, I am stuck when the number of traces is not fixed, because the number of rows in the file is variable.


       1991  1992  1993  1994  1995  1996  1997
US       10    14    16    18    20    42    64
JAPAN   100    30    70    85    30    42    64
CN       50    22    30    65    70    66    60

Here is my incomplete code:

# Version 2 could read data from .xlsx file.
import plotly as py
import plotly.graph_objs as go
import openpyxl

wb = openpyxl.load_workbook('grape output.xlsx')
sheet = wb['Sheet1']       
row_max = sheet.max_row
col_max = sheet.max_column
l=[]

for row_n in range(row_max-1):
    l.append([])
    for col_n in range(col_max-1):
        l[row_n].append(sheet.cell(row=row_n+2, column=col_n+2).value)

trace0 = go.Scatter(
    x=[1991, 1992, 1993, 1994, 1995, 1996, 1997],
    y=['US', 'US', 'US', 'US', 'US', 'US', 'US'],
    mode='markers+text',
    marker=dict(
        color='rgb(150,204,90)',
        size= l[0],
        showscale = False,
        ),
    text=list(map(str, l[0])),     
    textposition='middle center',   
)

trace1 = go.Scatter(
    x=[1991, 1992, 1993, 1994, 1995, 1996, 1997],
    y=['JAPAN', 'JAPAN', 'JAPAN', 'JAPAN', 'JAPAN', 'JAPAN', 'JAPAN'],
    mode='markers+text',

    marker=dict(
        color='rgb(255, 130, 71)',
        size=l[1],
        showscale=False,
    ),
    text=list(map(str,l[1])),
    textposition='middle center',
)

trace2 = go.Scatter(
    x=[1991, 1992, 1993, 1994, 1995, 1996, 1997],
    y=['CN', 'CN', 'CN', 'CN', 'CN', 'CN', 'CN'],
    mode='markers+text',

    marker=dict(
        color='rgb(255, 193, 37)',
        size=l[2],
        showscale=False,
    ),
    text=list(map(str,l[2])),
    textposition='middle center',
)

layout = go.Layout(plot_bgcolor='rgb(10, 10, 10)',  
                   paper_bgcolor='rgb(20, 55, 100)',  
                   font={               
                       'size': 15,
                       'family': 'sans-serif',
                       'color': 'rgb(255, 255, 255)'  
                   },
                   width=1000,
                   height=500,
                   xaxis=dict(title='Output of grapes per year in US, JAPAN and CN', ),  
                   showlegend=False,
                   margin=dict(l=100, r=100, t=100, b=100),
                   hovermode = False,       
                   )

data = [trace0, trace1, trace2]
fig = go.Figure(data=data, layout=layout)


py.offline.init_notebook_mode()
py.offline.plot(fig, filename='basic-scatter.html')

Could you please show me how to draw them? Thanks.

Answer 1

Derek O.'s answer is perfect, but I think there is a more flexible way to do it using plotly.express, particularly if you don't want to define the colors yourself.

The idea is to reshape the data properly first.

Data

import pandas as pd
df = pd.DataFrame({1991:[10,100,50], 1992:[14,30,22], 1993:[16,70,30], 1994:[18,85,65], 1995:[20,30,70], 1996:[42,42,66], 1997:[64,64,60]})
df.index = ['US','JAPAN','CN']
df = df.T.unstack()\
      .reset_index()\
      .rename(columns={"level_0": "country",
                       "level_1": "year",
                       0: "n"})
print(df)

   country  year    n
0       US  1991   10
1       US  1992   14
2       US  1993   16
3       US  1994   18
4       US  1995   20
5       US  1996   42
6       US  1997   64
7    JAPAN  1991  100
8    JAPAN  1992   30
9    JAPAN  1993   70
10   JAPAN  1994   85
11   JAPAN  1995   30
12   JAPAN  1996   42
13   JAPAN  1997   64
14      CN  1991   50
15      CN  1992   22
16      CN  1993   30
17      CN  1994   65
18      CN  1995   70
19      CN  1996   66
20      CN  1997   60
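
As a side note, the same wide-to-long reshape can probably be written with `melt`, which some readers find easier to follow than `T.unstack()`. This is only a sketch on the same sample numbers; the variable names `wide` and `long_df` are mine, and the rows come out ordered by year rather than by country, which should not matter for plotting.

```python
import pandas as pd

# Same wide table as in the answer: one row per country, one column per year.
wide = pd.DataFrame(
    {1991: [10, 100, 50], 1992: [14, 30, 22], 1993: [16, 70, 30],
     1994: [18, 85, 65], 1995: [20, 30, 70], 1996: [42, 42, 66],
     1997: [64, 64, 60]},
    index=["US", "JAPAN", "CN"],
)

# Move the index into a "country" column, then turn the year columns into rows.
long_df = (
    wide.rename_axis("country")
        .reset_index()
        .melt(id_vars="country", var_name="year", value_name="n")
)

print(long_df.head())
```

The resulting frame has the same three columns (`country`, `year`, `n`), so it can be passed to `px.scatter` in the same way.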

Using `plotly.express`

Now that your data is in long format, you can use plotly.express as follows:

import plotly.express as px
fig = px.scatter(df,
                 x="year",
                 y="country",
                 size="n",
                 color="country",
                 text="n",
                 size_max=50 # needed, otherwise the bubbles are too small
                )

fig.update_layout(plot_bgcolor='rgb(10, 10, 10)',  
                  paper_bgcolor='rgb(20, 55, 100)',  
                  font={'size': 15,
                        'family': 'sans-serif',
                        'color': 'rgb(255, 255, 255)'
                       },
                  width=1000,
                  height=500,
                  xaxis=dict(title='Output of grapes per year in selected countries', ),  
                  showlegend=False,
                  margin=dict(l=100, r=100, t=100, b=100),
                  hovermode = False,)
# Uncomment this if you don't want "country" as the y-axis title
# fig.layout.yaxis.title.text = None
fig.show()
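
If you stay with `go.Scatter` instead of plotly.express, there is no `size_max`, and the values in `marker.size` are used as pixel diameters by default. A rough equivalent, as far as I know, is `sizemode='area'` together with a `sizeref` computed with the formula from Plotly's bubble-chart documentation. The helper name `area_sizeref` and the 50 px target are my own choices for illustration.

```python
def area_sizeref(sizes, max_diameter_px):
    """Return a marker.sizeref so the largest value maps to max_diameter_px.

    Uses the formula suggested in Plotly's bubble-chart documentation for
    sizemode='area': sizeref = 2 * max(sizes) / max_diameter_px ** 2.
    """
    return 2.0 * max(sizes) / (max_diameter_px ** 2)


values = [10, 100, 50, 14, 30, 22]  # a few numbers from the question's table
sizeref = area_sizeref(values, max_diameter_px=50)

# Marker settings meant to be passed as go.Scatter(marker=marker);
# sizemin keeps very small values visible (4 px is an arbitrary choice).
marker = dict(size=values, sizemode="area", sizeref=sizeref, sizemin=4)
print(marker)
```

For one shared scale across all traces, the same `sizeref` would have to be computed from all values in the table and reused for every trace.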

Answer 2

I should point out that your code would be more reproducible if you attached your raw data as text, or in some other form that is easy to copy and paste. Regardless, I can still answer your question and point you in the right direction.

What you should do is use a loop. Start by looking at the line data = [trace0, trace1, trace2]. As you noticed, this approach won't scale if you have 100 countries instead of 3.

Instead, you can create data as a list using a list comprehension, changing only the parts of each trace that differ. trace0, trace1, and trace2 aren't much different except for the country, values, and colors. To show you what I mean, I recreated your data using a DataFrame, then created individual lists containing your countries and colors.

# Version 2 could read data from .xlsx file.
import plotly as py
import plotly.graph_objs as go
import openpyxl

# wb = openpyxl.load_workbook(('grape output.xlsx'))     
# sheet = wb['Sheet1']       
# row_max = sheet.max_row
# col_max = sheet.max_column
# l=[]

# for row_n in range(row_max-1):
#     l.append([])
#     for col_n in range(col_max-1):
#         l[row_n].append(sheet.cell(row=row_n+2, column=col_n+2).value)

import pandas as pd

df = pd.DataFrame({1991:[10,100,50], 1992:[14,30,22], 1993:[16,70,30], 1994:[18,85,65], 1995:[20,30,70], 1996:[42,42,66], 1997:[64,64,60]})
df.index = ['US','JAPAN','CN']
colors = ['rgb(150,204,90)','rgb(255, 130, 71)','rgb(255, 193, 37)']

data = [go.Scatter(
    x=df.columns,
    y=[country]*len(df.columns),
    mode='markers+text',
    marker=dict(
        color=colors[num],
        size= df.loc[country],
        showscale = False,
        ),
    text=list(map(str, df.loc[country])),     
    textposition='middle center',   
    )
    for num, country in enumerate(df.index)
]

layout = go.Layout(plot_bgcolor='rgb(10, 10, 10)',  
                   paper_bgcolor='rgb(20, 55, 100)',  
                   font={               
                       'size': 15,
                       'family': 'sans-serif',
                       'color': 'rgb(255, 255, 255)'  
                   },
                   width=1000,
                   height=500,
                   xaxis=dict(title='Output of grapes per year in US, JAPAN and CN', ),  
                   showlegend=False,
                   margin=dict(l=100, r=100, t=100, b=100),
                   hovermode = False,       
                   )

# data = [trace0, trace1, trace2]
fig = go.Figure(data=data, layout=layout)
fig.show()

# py.offline.init_notebook_mode()
# py.offline.plot(fig, filename='basic-scatter.html')

If I then add a test country to the DataFrame with values for 1991-1997, I don't need to change the rest of the code, and the bubble plot updates accordingly.

# I added a test country with data
df = pd.DataFrame({1991:[10,100,50,10], 1992:[14,30,22,20], 1993:[16,70,30,30], 1994:[18,85,65,40], 1995:[20,30,70,50], 1996:[42,42,66,60], 1997:[64,64,60,70]})
df.index = ['US','JAPAN','CN','TEST']
colors = ['rgb(150,204,90)','rgb(255, 130, 71)','rgb(255, 193, 37)','rgb(100, 100, 100)']
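
One thing to watch with this approach is that `colors` needs at least as many entries as there are countries, otherwise `colors[num]` raises an IndexError. A possible way around that, sketched below, is to reuse the palette once it runs out. The function name `bubble_traces` and the plain dict-based table are my own simplifications; the traces are built as plain dicts with a `type` key, which `go.Figure(data=...)` also accepts.

```python
def bubble_traces(years, table, palette):
    """Build one scatter-trace dict per row of `table`.

    `table` maps a row label (country) to its list of values, one per year.
    Colors are taken from `palette` in order and reused once it is exhausted.
    """
    traces = []
    for num, (country, values) in enumerate(table.items()):
        traces.append(dict(
            type="scatter",
            x=list(years),
            y=[country] * len(years),
            mode="markers+text",
            marker=dict(color=palette[num % len(palette)], size=list(values)),
            text=[str(v) for v in values],
            textposition="middle center",
        ))
    return traces


years = [1991, 1992, 1993]
table = {
    "US": [10, 14, 16],
    "JAPAN": [100, 30, 70],
    "CN": [50, 22, 30],
    "TEST": [10, 20, 30],  # a fourth row, one more than the palette has colors
}
palette = ["rgb(150,204,90)", "rgb(255, 130, 71)", "rgb(255, 193, 37)"]

traces = bubble_traces(years, table, palette)
print(len(traces), traces[3]["marker"]["color"])
```

The returned list can be used in place of `data` in `go.Figure(data=traces, layout=layout)`. Reusing colors means two countries may look alike, so a longer palette is still preferable when the number of rows is known to be large.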

Answer 3

The code has been updated to Version 2, which reads data from an .xlsx file and plots a bubble diagram. The raw data file 'grape output.xlsx' now contains more entries than the previous one:

             1991  1992  1993  1994  1995  1996  1997  1998  1999
         US    10    14    16    18    20    42    64   100    50
      JAPAN   100    30    70    85    30    42    64    98    24
         CN    50    22    30    65    70    66    60    45    45
      INDIA    90    88    35    50    90    60    40    66    76
         UK    40    50    70    50    25    30    22    40    60

Here is the code:

# Version 2 
import plotly as py
import plotly.graph_objs as go
import openpyxl
import pandas as pd


wb = openpyxl.load_workbook('grape output.xlsx')
sheet = wb['Sheet1']
row_max = sheet.max_row
col_max = sheet.max_column
first_row_list = []
first_col_list = []
for col_n in range(2, col_max+1):
    first_row_list.append(sheet.cell(row=1, column=col_n).value)
for row_n in range(2,row_max+1):
    first_col_list.append(sheet.cell(row=row_n, column=1).value)

data_all = pd.read_excel('grape output.xlsx')
data = data_all.loc[:,first_row_list]

df = pd.DataFrame(data)
df.index = first_col_list
colors = ['rgb(150,204,90)','rgb(255, 130, 71)','rgb(255, 193, 37)','rgb(180,240,190)','rgb(255, 10, 1)',
          'rgb(25, 19, 3)','rgb(100, 100, 100)','rgb(45,24,200)','rgb(33, 58, 108)','rgb(35, 208, 232)']

data = [go.Scatter(
    x=df.columns,
    y=[country]*len(df.columns),
    mode='markers+text',
    marker=dict(
        color=colors[num],
        size= df.loc[country],
        showscale = False,
        ),
    text=list(map(str, df.loc[country])),
    textposition='middle center',
    )
    for num, country in enumerate(reversed(df.index))
]

layout = go.Layout(plot_bgcolor='rgb(10, 10, 10)',
                   paper_bgcolor='rgb(20, 55, 100)',
                   font={
                       'size': 15,
                       'family': 'sans-serif',
                       'color': 'rgb(255, 255, 255)'
                   },
                   width=1000,
                   height=800,
                   xaxis=dict(title='Output of grapes per year in US, JAPAN and CN'),
                   showlegend=False,
                   margin=dict(l=100, r=100, t=100, b=100),
                   hovermode = False,
                   )

fig = go.Figure(data=data, layout=layout)
py.offline.plot(fig, filename='basic-scatter.html')
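
The file is read twice above, once with openpyxl for the labels and once with pandas for the values. Assuming the sheet has the years in the first row and the countries in the first column, as in the table shown, a single pandas call with `index_col=0` should be enough. Since a spreadsheet cannot be embedded here, the sketch below parses the same layout from CSV text as a stand-in; `pd.read_excel` takes the same `index_col` argument.

```python
import io

import pandas as pd

# Stand-in for the spreadsheet: years in the header row, countries in the
# first column. With the real file this would presumably be:
#   df = pd.read_excel('grape output.xlsx', index_col=0)
csv_text = """\
,1991,1992,1993
US,10,14,16
JAPAN,100,30,70
CN,50,22,30
"""

df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# CSV headers are read as strings; convert them so the x axis is numeric.
df.columns = df.columns.astype(int)

print(df)
```

With the labels already in `df.index` and `df.columns`, the `first_row_list` and `first_col_list` loops would no longer be needed.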

The plot now works, but a few small problems remain:

How can I get rid of the two tick labels 1990 and 2000, as well as the white vertical lines for 1990 and 2000?
How can I draw white lines for 1991, 1993, 1995, 1997, and 1999, and display all of these years on the x axis?

Please suggest corrections to the Version 2 code to improve it. Thank you!
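
For the two axis questions, one option that might work is to set the ticks explicitly instead of letting Plotly choose them. The 1990 and 2000 labels come from automatic tick placement, and grid lines are drawn at tick positions, so listing the wanted years in `tickvals` should address both points. The helper `year_axis` and its `step` and `pad` parameters are hypothetical names; `tickmode`, `tickvals`, `range`, `showgrid`, `gridcolor`, and `zeroline` are standard x-axis layout attributes.

```python
def year_axis(years, step=2, pad=1):
    """Return x-axis settings with ticks and grid lines only at chosen years.

    Every `step`-th year gets a tick; `pad` widens the visible range so that
    bubbles at the first and last year are less likely to be cut off.
    """
    ticks = list(years)[::step]
    return dict(
        tickmode="array",
        tickvals=ticks,
        range=[min(years) - pad, max(years) + pad],
        showgrid=True,
        gridcolor="white",
        zeroline=False,
    )


years = list(range(1991, 2000))  # 1991..1999, as in the Version 2 data
xaxis = year_axis(years)
print(xaxis["tickvals"])
```

The dict could then be merged into the layout, for example `go.Layout(xaxis=dict(title='...', **xaxis), ...)` or `fig.update_xaxes(**xaxis)`. With `step=1` every year would get a label and a grid line.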

3 answers

Answer 1: 3 votes, 2020-09-10 01:34:20
Answer 2: 2 votes, accepted, 2020-09-10 00:32:17
Answer 3: 0 votes, 2020-09-11 16:21:08
