Duplicate labels on the y-axis in Matplotlib

Question

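I made a barh chart with a scatter plot on top of it. The data is about 100 books, each with its publication date and the author's years of birth and death. The bars show the period during which each author was alive, and the scatter points show the year each book was published.

The problem I'm facing is how to plot multiple books on a single bar. At the moment the bar is repeated for every book by the same author, because I build the y-axis from the position in the array and add the labels afterwards.

My relevant code: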

import numpy as np
import matplotlib.pyplot as plt

# dataframe columns to arrays. (dataset is my pandas dataframe)
begin = np.array(dataset.BORN)
end = np.array(dataset.DIED)
book = np.array(dataset['YEAR (BOOK)'])

# Data to a barh graph (sideways bar)
plt.barh(range(len(begin)), end-begin, left=begin, zorder=2,
         color='#007acc', alpha=0.8, linewidth=5)

# Plots the books in a scatterplot. Changes marker color and shape.
plt.scatter(book, range(len(begin)), color='purple', s=30, marker='D', zorder=3)

# Sets the titles of the y-axis.
plt.yticks(range(len(begin)), dataset.AUTHOR)

# Sets start and end of the x-axis.
plt.xlim([1835, 2019])

# Shows the plt
plt.show()

This image shows part of my current graph:

Answer 1

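I'd aggregate your dataset with groupby to get one row per author and use that to plot the bars, then join it back to the original data to get the values for plotting the books, e.g.: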

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame([
    ['foo', 1950, 1990, 1980],
    ['foo', 1950, 1990, 1985],
    ['bar', 1930, 2000, 1970],
], columns=['author', 'born', 'died', 'published'])

This pulls in the packages and creates a dummy dataset. Next we reduce it to a single row per author, to get their years of birth and death:

# select the columns with a list; tuple-style ['born', 'died'] fails on recent pandas
agg = df.groupby('author')[['born', 'died']].agg('min').reset_index()
agg['auth_num'] = range(len(agg))

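The reset_index turns author back into a normal column, and we create an arbitrary auth_num column. You might want to put a sort_values in there if you want to order the authors by something other than their name (which I'd suggest, since sorting authors alphabetically is generally not the most useful order).

Next, we can join this back to the original dataset to get the author number for each book: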
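As a minimal sketch of that sort_values suggestion, the following orders authors by birth year before numbering them. The third author, 'baz', and the choice of 'born' as the sort key are assumptions added here for illustration; they are not part of the original answer.

```python
import pandas as pd

# Dummy data; the 'baz' row is a hypothetical addition so that the
# birth-year order differs from the alphabetical order.
df = pd.DataFrame([
    ['foo', 1950, 1990, 1980],
    ['foo', 1950, 1990, 1985],
    ['bar', 1930, 2000, 1970],
    ['baz', 1900, 1960, 1940],
], columns=['author', 'born', 'died', 'published'])

# One row per author, ordered by year of birth instead of by name.
agg = (
    df.groupby('author')[['born', 'died']].min()
      .reset_index()
      .sort_values('born')
      .reset_index(drop=True)
)

# Number the authors in that order; this becomes the y position of each bar.
agg['auth_num'] = range(len(agg))
print(agg)
```

Any other column (or a custom key) could be used in sort_values; the only requirement is that auth_num is assigned after sorting.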

df2 = pd.merge(df, agg[['author', 'auth_num']], on='author')
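To check that the join does what is intended, here is a small self-contained sketch using the same dummy data. The validate argument is an addition that does not appear in the original answer; it asks pandas to raise an error if an author were to appear more than once in agg.

```python
import pandas as pd

df = pd.DataFrame([
    ['foo', 1950, 1990, 1980],
    ['foo', 1950, 1990, 1985],
    ['bar', 1930, 2000, 1970],
], columns=['author', 'born', 'died', 'published'])

# One row per author, numbered in groupby (alphabetical) order.
agg = df.groupby('author')[['born', 'died']].min().reset_index()
agg['auth_num'] = range(len(agg))

# Many book rows map onto one author row; validate guards against
# duplicate authors on the right-hand side.
df2 = pd.merge(df, agg[['author', 'auth_num']], on='author',
               validate='many_to_one')

# Both 'foo' books now share the same auth_num, so they land on one bar.
print(df2)
```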

Finally, plot everything:

plt.barh(agg.auth_num, agg.died - agg.born, left=agg.born, zorder=-1, alpha=0.5)
plt.yticks(agg.auth_num, agg.author)

plt.scatter(df2.published, df2.auth_num)

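which gives something like:

Note: if you set use_sticky_edges to False on the Axes before calling barh (for example plt.gca().use_sticky_edges = False), the x-axis autoscales with its usual margins, so the bar of the earliest-born author doesn't "stick" to the left edge of the plot.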
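The effect of that setting can be seen by drawing the same bars on two Axes side by side. This is only a sketch: the two-panel layout and the variable names are assumptions made for the comparison, and the Agg backend is selected just so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

agg = pd.DataFrame({
    'author': ['bar', 'foo'],
    'born': [1930, 1950],
    'died': [2000, 1990],
    'auth_num': [0, 1],
})

fig, (ax_sticky, ax_free) = plt.subplots(1, 2, figsize=(10, 3))

# Must be set before the limits are autoscaled; doing it before barh is simplest.
ax_free.use_sticky_edges = False

for ax in (ax_sticky, ax_free):
    ax.barh(agg.auth_num, agg.died - agg.born, left=agg.born, alpha=0.5)
    ax.set_yticks(agg.auth_num)
    ax.set_yticklabels(agg.author)

# Default: the left limit sits exactly on the earliest birth year.
print(ax_sticky.get_xlim())
# Sticky edges off: the usual margin is added on the left as well.
print(ax_free.get_xlim())
```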

Answer 2

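Sure, there are several options. You can create a separate array for each author's first, second and third book. Alternatively, you can create a dictionary or a list of arrays and plot the books for each author from that.

I've reproduced some examples below using dummy data.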

import matplotlib.pyplot as plt
import numpy as np

fig,axs = plt.subplots(1,1,figsize=(10,10))

# dummy birth and death years (stand-ins for the dataframe columns)
begin = np.arange(1900,1950)
end = np.arange(1975,2025)

# create two random arrays for your book dates
book1 = np.array(np.random.randint(low=1950, high=1970, size=50))
book2 = np.array(np.random.randint(low=1950, high=1970, size=50))

# add some author names
author_names = [f'Author_{x+1}' for x in range(50)]

# Data to a barh graph (sideways bar)
axs.barh(range(len(begin)), end-begin, left=begin, zorder=2,
         color='#007acc', alpha=0.8, linewidth=5)

# Plots the books in a scatterplot. Changes marker color and shape.
axs.scatter(book1, range(len(begin)), color='purple', s=30, marker='D', zorder=3, label='1st Book')

# second array of books
axs.scatter(book2, range(len(begin)), color='yellow', s=30, marker='D', zorder=3, label='2nd Book')

# or plot a custom array of books
# you could do this in a for loop for all authors
axs.scatter(x=[1980,2005], y=[10,45], color='red', s=50, marker='X', zorder=3, label='3rd Book')

# Sets the titles of the y-axis.
axs.set_yticks(range(len(begin)))
axs.set_yticklabels(author_names)

# Add legend
axs.legend()

# Sets start and end of the x-axis.
axs.set_xlim([1895, 2025])
axs.set_ylim([-1, 50])

plt.show()
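The dictionary option mentioned at the start of this answer could look roughly like the sketch below. The lifespans and books dictionaries and their contents are hypothetical; with real data they would be built from the dataframe. The Agg backend is selected only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

# Hypothetical data: one entry per author, any number of books each.
lifespans = {'Author_1': (1900, 1975), 'Author_2': (1910, 1990)}
books = {'Author_1': [1950, 1962, 1970], 'Author_2': [1955]}

fig, ax = plt.subplots()
names = list(lifespans)

for y, name in enumerate(names):
    born, died = lifespans[name]
    # One bar per author ...
    ax.barh(y, died - born, left=born, color='#007acc', alpha=0.8, zorder=2)
    # ... and all of that author's books on the same y position.
    years = books.get(name, [])
    ax.scatter(years, [y] * len(years), color='purple', s=30, marker='D', zorder=3)

ax.set_yticks(range(len(names)))
ax.set_yticklabels(names)
```

Because each author gets exactly one y position, each label appears only once no matter how many books the author has.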

Answer 3

(Please provide a sample dataframe next time!)

I'll use the great numpy.unique method to perform the grouping operation.
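As a quick illustration of what the two extra return values of numpy.unique contain, here is a small sketch on the same three author names used in the example that follows:

```python
import numpy as np

authors = np.array(['foo', 'bar', 'foo'])

unique_authors, index, inverse = np.unique(
    authors, return_index=True, return_inverse=True
)

# unique_authors: the sorted distinct names          -> ['bar', 'foo']
# index: position of each name's first occurrence    -> [1, 0]
# inverse: for every row, its position in unique_authors -> [1, 0, 1]
print(unique_authors, index, inverse)

# The inverse array rebuilds the original column, which is why it can
# serve directly as the y coordinate of each book.
assert (unique_authors[inverse] == authors).all()
```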

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


dataset = pd.DataFrame({'BORN': [1900, 1920, 1900],
                        'DIED': [1980, 1978, 1980],
                        'AUTHOR': ['foo', 'bar', 'foo'],
                        'YEAR (BOOK)': [1950, 1972, 1961]})

# --group by author
unique_authors, index, reverse_index = np.unique(dataset.AUTHOR.values, return_index=True, return_inverse=True)
authors_df = dataset.loc[index, ['AUTHOR', 'BORN', 'DIED']]
dataset['AUTHOR_IDX'] = reverse_index  # remember the index

# dataframe columns to arrays.
begin = authors_df.BORN.values
end = authors_df.DIED.values
authors = authors_df.AUTHOR.values

# --Author data to a barh graph (sideways bar)
plt.barh(range(len(begin)), end-begin, left=begin, zorder=2, color='#007acc', alpha=0.8, linewidth=5)

# Sets the titles of the y-axis.
plt.yticks(range(len(begin)), authors)

# Sets start and end of the x-axis.
plt.xlim([1835, 2019])

# --Overlay book information
# dataframe columns to arrays
book = dataset['YEAR (BOOK)'].values

# Plots the books in a scatterplot. Changes marker color and shape.
plt.scatter(book, reverse_index, color='purple', s=30, marker='D', zorder=3)

# Shows the plt
plt.show()

This yields:

Answer 1
1 accepted 2019-09-03 12:23:47

Answer 2
0 2019-09-03 12:00:01

Answer 3
0 2019-09-03 12:41:50