
pandas.plotting scatter_matrix confusion about main diagonal plots

I am a bit confused about how scatter_matrix in the pandas.plotting module works. For example, see the plot here: https://www.geeksforgeeks.org/pair-plots-using-scatter-matrix-in-pandas/

The 3 plots along the main diagonal look like distributions. But the x- and y-axis labels indicate that each one plots a variable against itself, so shouldn't it be a straight line? Where did the distribution come from?

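By default, pandas.plotting.scatter_matrix plots histograms on the diagonal. Each histogram shows the distribution (bin counts) of just that one column of data. The row and column labels belong to the grid as a whole, so the diagonal panels inherit them even though they do not plot the variable against itself. Otherwise, as you mentioned, we'd only have (useless) straight lines on the diagonal.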
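One way to check this is to look at what each panel contains. The sketch below is only an illustration: the random DataFrame and the column names a, b and c are made up, and the Agg backend is selected just so the script runs without a display. It counts the histogram bars on a diagonal panel and the scatter collections on an off-diagonal one.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove for interactive use

import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Made-up data: 200 rows, three numeric columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(200, 3)), columns=["a", "b", "c"])

# scatter_matrix returns a 2-D array of matplotlib Axes, one per panel.
axes = scatter_matrix(df, diagonal="hist")

# Diagonal panel: histogram bars are stored as patches, no scatter points.
diag_bars = len(axes[0, 0].patches)
diag_scatter = len(axes[0, 0].collections)

# Off-diagonal panel: one scatter collection (column "b" on x, "a" on y).
offdiag_scatter = len(axes[0, 1].collections)

print(axes.shape, diag_bars, diag_scatter, offdiag_scatter)
```

The diagonal panel holds bars and no scatter points, so it is a histogram of a single column rather than a plot of the column against itself.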

There is a diagonal parameter to choose between a histogram and a kernel density estimate:

pandas.plotting.scatter_matrix(frame, alpha=0.5, figsize=None, ax=None, grid=False, diagonal='hist', marker='.', density_kwds=None, hist_kwds=None, range_padding=0.05, **kwargs)

...

diagonal : {'hist', 'kde'}
    Pick between 'kde' and 'hist' for either Kernel Density Estimation or Histogram plot in the diagonal.
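As a rough sketch of switching the diagonal to a density curve: the data and column names below are invented for illustration, and the SciPy check is an assumption about the environment, because the 'kde' option relies on SciPy. If SciPy is not installed, the sketch falls back to 'hist'.

```python
import importlib.util

import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove for interactive use

import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Made-up data: 200 rows, three numeric columns.
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(200, 3)), columns=["a", "b", "c"])

# 'kde' needs SciPy; fall back to the default histogram if it is missing.
diagonal = "kde" if importlib.util.find_spec("scipy") else "hist"

axes = scatter_matrix(df, diagonal=diagonal, alpha=0.5, figsize=(6, 6))

# Save the figure that owns the grid of panels (file name is arbitrary).
axes[0, 0].get_figure().savefig("scatter_matrix_diagonal.png")
```

With diagonal='kde' each diagonal panel shows a smooth density curve for its column in place of the bars; the off-diagonal scatter plots are unchanged.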

