
Understanding the diagonal in Pandas' scatter matrix plot

I'm plotting a scatter matrix with Pandas. I understand the plot, except for the curves in the diagonal plots. Can someone explain to me what they mean?

Image: (scatter matrix plot; image not available)

Code:

import pylab
import numpy as np
import pandas as pd
# scatter_matrix lives in pandas.plotting in current pandas
# (pandas.tools.plotting is the old location)
from pandas.plotting import scatter_matrix

def make_scatter_plot(X, name):
    """
    Make a scatter matrix plot and save it as a PNG file.

    Parameters:
    -----------
    X: a design matrix where each column is a feature and each row is an observation.
    name: the name of the plot (used as the output file name, without extension).
    """
    pylab.clf()
    df = pd.DataFrame(X)
    axs = scatter_matrix(df, alpha=0.2, diagonal='kde')

    for ax in axs[:, 0]:  # the left boundary
        # pass False rather than the string 'off' to hide the grid
        ax.grid(False, axis='both')
        ax.set_yticks([0, .5])

    for ax in axs[-1, :]:  # the lower boundary
        ax.grid(False, axis='both')
        ax.set_xticks([0, .5])

    pylab.savefig(name + ".png")

As you can tell, the scatter matrix plots each of the specified columns against every other column.

However, in this format, when you get to the diagonal, you would see a plot of a column against itself. Since this would always be a straight line, Pandas decides it can give you more useful information and instead plots the density of just that column of data.

See http://pandas.pydata.org/pandas-docs/stable/visualization.html#density-plot.
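To make the meaning of those diagonal curves concrete, here is a minimal sketch that computes a kernel density estimate for a single column by hand. It assumes SciPy is installed (pandas needs it for density plots, and to my understanding uses `gaussian_kde` for them) and it runs on made-up random data; the variable names are illustrative and not taken from the question.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical data standing in for one column of the design matrix X.
rng = np.random.default_rng(0)
column = rng.normal(loc=0.0, scale=1.0, size=500)

# Fit a Gaussian kernel density estimate: roughly a smoothed histogram.
kde = gaussian_kde(column)

# Evaluate the estimated density on a grid that extends past the data range.
grid = np.linspace(column.min() - 3, column.max() + 3, 1000)
density = kde(grid)

# A density curve is non-negative and encloses an area of about 1
# (trapezoidal rule, written out by hand).
area = float(np.sum((density[1:] + density[:-1]) / 2 * np.diff(grid)))
print(f"area under the curve: {area:.3f}")
```

Peaks in the curve mark where the values of that column are concentrated. The height is a density rather than a count, which is why the y-axis values on the diagonal do not match the number of observations.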

If you would rather have a histogram, you could change your plotting code to:

axs = scatter_matrix(df, alpha=0.2, diagonal='hist')
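As a rough, self-contained illustration of the two options, the sketch below builds a small DataFrame from random numbers (hypothetical data, not the asker's) and draws the scatter matrix once with each `diagonal` setting. The `Agg` backend is chosen only so the script runs without a display; the `'kde'` variant assumes SciPy is installed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Hypothetical data: 200 observations of 3 features.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((200, 3)), columns=["a", "b", "c"])

# Diagonal panels drawn as histograms.
axs_hist = scatter_matrix(df, alpha=0.2, diagonal="hist")

# Diagonal panels drawn as density curves (requires SciPy).
axs_kde = scatter_matrix(df, alpha=0.2, diagonal="kde")

# Either call returns an n-by-n array of Axes, one per pair of columns.
print(axs_hist.shape, axs_kde.shape)
```

In both cases the off-diagonal panels are the same scatter plots; only the panels at `axs[i, i]` change.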

Plotting methods allow for a handful of plot styles other than the default line plot. These methods can be provided as the kind keyword argument to plot(). These include:

  • 'bar' or 'barh' for bar plots
  • 'hist' for histograms
  • 'box' for boxplots
  • 'kde' or 'density' for density plots
  • 'area' for area plots
  • 'scatter' for scatter plots
  • 'hexbin' for hexagonal bin plots
  • 'pie' for pie plots

https://pandas.pydata.org/pandas-docs/stable/visualization.html
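For example, the `kind` keyword from the list above could be used like this to draw the same column as a histogram and as a density curve side by side. This is a sketch on made-up data with illustrative names. Note that, as far as I can tell, the `diagonal` argument of `scatter_matrix` only understands `'hist'` and `'kde'` (or `'density'`), not the full list of kinds.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical single feature with 300 observations.
rng = np.random.default_rng(0)
series = pd.Series(rng.normal(size=300), name="feature")

fig, (ax_hist, ax_kde) = plt.subplots(1, 2, figsize=(8, 3))

# kind="hist" draws one bar per bin.
series.plot(kind="hist", bins=20, ax=ax_hist, title="hist")

# kind="kde" draws a single smooth density curve (requires SciPy).
series.plot(kind="kde", ax=ax_kde, title="kde")
```

Calling `fig.savefig("kinds.png")` afterwards would write both panels to a file; the file name here is just a placeholder.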
