Plotting a pandas DataFrame as a 3D bar chart

There is one existing answer to this question here, but it is wrong.

In the example dataframe from the previous question, the USA has the highest number of Python users (10,110), yet in the graph it looks as though France has the highest instead.

Can someone help me fix the solution's code?

Data Frame

Resulting Graph (incorrect)

EXAMPLE DATAFRAME:

EG 

Language     C    C++   Java   Python   Perl
Country
USA       3222    343   2112    10110     89
France    5432    323   1019      678    789
Japan     7878    467    767     8788     40

INCORRECT CODE:

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

# thickness of the bars
dx, dy = .8, .8

# prepare 3d axes
fig = plt.figure(figsize=(10,6))
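# add_subplot attaches the 3D axes to the figure; recent matplotlib
# versions no longer do this automatically for Axes3D(fig)
ax = fig.add_subplot(projection='3d')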

# set up positions for the bars 
xpos=np.arange(eg.shape[0])
ypos=np.arange(eg.shape[1])

# set the ticks in the middle of the bars
ax.set_xticks(xpos + dx/2)
ax.set_yticks(ypos + dy/2)

# create meshgrid 
# print xpos before and after this block if not clear
xpos, ypos = np.meshgrid(xpos, ypos)
xpos = xpos.flatten()
ypos = ypos.flatten()

# the bars start from height 0
zpos=np.zeros(eg.shape).flatten()

# the bars' heights
dz = eg.values.ravel()

# plot 
ax.bar3d(xpos,ypos,zpos,dx,dy,dz)

# put the column / index labels
# (the w_xaxis / w_yaxis aliases were removed in recent matplotlib versions)
ax.set_yticklabels(eg.columns)
ax.set_xticklabels(eg.index)

# name the axes
ax.set_xlabel('Country')
ax.set_ylabel('Language')
ax.set_zlabel('Count')

plt.show()

To solve it, just change the ravel part of the code:

# the bars' heights
dz = eg.values.ravel(order='F')

Passing order='F' reads the data in the right order for your problem:

'F' means to index the elements in column-major, Fortran-style order, with the first index changing fastest, and the last index changing slowest.
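To see the difference between the two orders, here is a minimal sketch using NumPy only. The `values` array is a hand-typed stand-in for `eg.values` from the question, so treat it as an assumption rather than the asker's actual object:

```python
import numpy as np

# Stand-in for eg.values (assumed): rows are USA, France, Japan;
# columns are C, C++, Java, Python, Perl
values = np.array([
    [3222, 343, 2112, 10110, 89],
    [5432, 323, 1019, 678, 789],
    [7878, 467, 767, 8788, 40],
])

row_major = values.ravel()           # default order='C': walks along each row
col_major = values.ravel(order='F')  # order='F': walks down each column

print(row_major[:4])  # start of the USA row
print(col_major[:4])  # the whole C column, then USA's C++ count
```

The column-major result starts with the three C counts (USA, France, Japan), which is the order in which the flattened meshgrid visits the bars.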

The code you provided does not work as you expected because the xpos and ypos position arrays are not ordered the same way as the dz array obtained via eg.values.ravel():

eg.values.ravel()
>> array([ 3222,   343,  2112, 10110,    89,  5432,   323,  1019,   678,
         789,  7878,   467,   767,  8788,    40], dtype=int64)

This array (the 'heights' of the chart) concatenates the values of eg's rows. In other words, dz takes the entries of eg in the following order:

(0,0), (0,1), (0,2), (0,3), (0,4), (1,0)...

xpos and ypos, however, list positions down the columns:

list(zip(xpos, ypos))
>>[(0, 0),(1, 0),(2, 0),(0, 1),(1, 1),(2, 1),(0, 2),...]

So your values get assigned to the wrong bars. For instance, the bar at (1,0), that is, France / C, received the value from (0,1), which is USA / C++. That's why the values on the chart are messed up.
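The mismatch can be checked directly. The sketch below again uses a hand-typed `values` array as an assumed stand-in for `eg.values`. It also shows an alternative that is not part of the original code: building the grid with `indexing='ij'`, so that the default `ravel()` lines up without `order='F'`.

```python
import numpy as np

# Stand-in for eg.values (assumed): rows are countries, columns are languages
values = np.array([
    [3222, 343, 2112, 10110, 89],
    [5432, 323, 1019, 678, 789],
    [7878, 467, 767, 8788, 40],
])
n_rows, n_cols = values.shape

# Same grid construction as in the question (default indexing='xy')
xpos, ypos = np.meshgrid(np.arange(n_rows), np.arange(n_cols))
xpos, ypos = xpos.flatten(), ypos.flatten()

# The height each bar should have, looked up by its (row, column) position
expected = values[xpos, ypos]

wrong_matches = np.array_equal(values.ravel(), expected)
fixed_matches = np.array_equal(values.ravel(order='F'), expected)
print(wrong_matches, fixed_matches)

# Alternative: 'ij' indexing makes the grid row-major, so plain ravel() lines up
xi, yi = np.meshgrid(np.arange(n_rows), np.arange(n_cols), indexing='ij')
alt_matches = np.array_equal(values.ravel(), values[xi.ravel(), yi.ravel()])
print(alt_matches)
```

Either fix works; what matters is that the positions and the heights are flattened in the same order.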

Hope it helps!
