Python scatter plot of 4D data

Question

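I have an array of 4D data that I want to show as a scatter plot. The data can be viewed as x and y coordinates for each pair of values of two additional parameters.

I would like to "flatten" the plot into a 2D scatter plot in which the two extra parameters are represented by different colors, e.g. one color for each pair of the two parameters. Alternatively, I would like points that are plotted for only a few parameter pairs to look light, while points plotted for many parameter pairs look heavier/darker. Perhaps this could be achieved by "stacking" somewhat translucent points on top of each other?

Is there a standard way to do this in Python, e.g. using matplotlib?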

Answer 1

I tried the approach I suggested of "stacking" translucent scatter plots on top of each other:

import numpy as np
import matplotlib.pyplot as plt

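# data1 and data2 are the 4D arrays; param1 and param2 hold the parameter values
for ii in range(len(param1)):
    for jj in range(len(param2)):
        # Indices along the first two axes where data1 < data2 for this parameter pair
        delta_idx, rho_idx = np.where(data1[:, :, ii, jj] < data2[:, :, ii, jj])
        plt.scatter(delta_idx, rho_idx, marker='o', c='k', alpha=0.01)
plt.xlabel(r'$\delta$')  # raw strings stop '\r' in '\rho' being read as a carriage return
plt.ylabel(r'$\rho$')
plt.show()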

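The 2D points I described in the question are really the locations where the value in data1 is smaller than the corresponding value in data2. This produced the following plot:

[Figure: stacked scatter plot]

A lot could be done to improve the plot, but I was not really satisfied with how it looked, so I tried a different approach. I am posting this here anyway in case someone finds it useful.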
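For anyone who wants to try the stacking idea without the original data, here is a minimal sketch. It assumes random stand-in arrays (the names and shapes below are made up for illustration) and collects the indices for all parameter pairs in one go with np.nonzero, so a single scatter call replaces the double loop. The Agg backend line is only there so the snippet also runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt


def stacked_indices(data1, data2):
    """Return the first-two-axis indices of every entry where data1 < data2.

    An (x, y) location appears once per parameter pair that satisfies the
    condition, so frequently occurring locations are drawn many times.
    """
    x_idx, y_idx, _, _ = np.nonzero(data1 < data2)
    return x_idx, y_idx


# Hypothetical stand-in data: 20 x 20 grid, 5 x 4 parameter pairs
rng = np.random.default_rng(0)
data1 = rng.random((20, 20, 5, 4))
data2 = rng.random((20, 20, 5, 4))

x_idx, y_idx = stacked_indices(data1, data2)

fig, ax = plt.subplots()
# Low alpha: overlapping markers accumulate and look darker
ax.scatter(x_idx, y_idx, marker="o", c="k", alpha=0.05)
ax.set_xlabel(r"$\delta$")
ax.set_ylabel(r"$\rho$")
# Call plt.show() or fig.savefig(...) to look at the result
```

The alpha value will probably need tuning to the number of parameter pairs: if it is too high, the plot saturates to black after only a few overlaps.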

Answer 2

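As an alternative to the "stacked" scatter plot, I tried first accumulating the occurrences of data1 < data2 in a 2D "occurrence map". I then plotted this map using pcolormesh (imported from prettyplotlib to make it look nicer):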

import prettyplotlib as ppl
import numpy as np

# Fraction of parameter pairs (last two axes) for which data1 < data2
occurrence_map = np.sum(data1 < data2, axis=(2,3), dtype=float) / np.prod(data1.shape[2:])
ppl.pcolormesh(occurrence_map, vmin=0, vmax=1)

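The normalization produces a relative measure of occurrence, i.e. in what fraction of the parameter pairs (the last two dimensions of data1 and data2) is data1 < data2? The plot is then configured with a color range from 0 to 1. This produces the following plot, which I am much happier with:

[Figure: pcolormesh plot of relative occurrence]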
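If prettyplotlib is not available, the same occurrence map can be sketched with plain matplotlib. This assumes random stand-in data (names and shapes are made up for illustration) and uses np.mean over the last two axes, which gives the same fraction as the sum divided by the number of parameter pairs.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt


def occurrence_map(data1, data2):
    """Fraction of parameter pairs (last two axes) for which data1 < data2."""
    return np.mean(data1 < data2, axis=(2, 3))


# Hypothetical stand-in data: 30 x 40 grid, 6 x 5 parameter pairs
rng = np.random.default_rng(1)
data1 = rng.random((30, 40, 6, 5))
data2 = rng.random((30, 40, 6, 5))

occ = occurrence_map(data1, data2)

fig, ax = plt.subplots()
# pcolormesh draws axis 0 of the array along y; pass occ.T instead if the
# first axis should run along x, as in the scatter plot version
mesh = ax.pcolormesh(occ, vmin=0, vmax=1)
fig.colorbar(mesh, ax=ax, label="fraction of parameter pairs with data1 < data2")
# Call plt.show() or fig.savefig(...) to look at the result
```

Fixing vmin and vmax to 0 and 1 keeps the color scale comparable between data sets, rather than stretching it to whatever range happens to occur.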

Answer 3

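The comment about scatterplot matrices also inspired me to try something similar. A scatterplot matrix is not exactly what I was looking for, but I took the code from @tisimst's answer, suggested by @lbn-plus-1, and adapted it as follows: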

import itertools
import numpy as np
import matplotlib.pyplot as plt

def scatterplot_matrix(data, names=[], **kwargs):
    """Plots a pcolormesh matrix of subplots. The first two dimensions of
    data are plotted as a mesh of values, one for each of the last two
    dimensions of data. Data must thus be four-dimensional and results
    in a matrix of pcolormesh plots with the number of rows equal to
    the size of the third dimension of data and number of columns
    equal to the size of the fourth dimension of data. Additional
    keyword arguments are passed on to matplotlib's "pcolormesh"
    command. Returns the matplotlib figure object containing the subplot
    grid.
    """
    assert data.ndim == 4, 'data must be 4-dimensional.'
    datashape = data.shape
    fig, axes = plt.subplots(nrows=datashape[2], ncols=datashape[3], figsize=(8,8))
    fig.subplots_adjust(hspace=0.0, wspace=0.0)

    for ax in axes.flat:
        # Hide all ticks and labels
        ax.xaxis.set_visible(False)
        ax.yaxis.set_visible(False)

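        # Set up ticks only on one side for the "edge" subplots...
        # (Axes.is_first_col() and friends were removed in matplotlib 3.6;
        # the SubplotSpec offers the same queries.)
        spec = ax.get_subplotspec()
        if spec.is_first_col():
            ax.yaxis.set_ticks_position('left')
        if spec.is_last_col():
            ax.yaxis.set_ticks_position('right')
        if spec.is_first_row():
            ax.xaxis.set_ticks_position('top')
        if spec.is_last_row():
            ax.xaxis.set_ticks_position('bottom')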

    # Plot the data: one pcolormesh per parameter pair.
    for ii in range(datashape[2]):
        for jj in range(datashape[3]):
            axes[ii,jj].pcolormesh(data[:,:,ii,jj], **kwargs)

    # Label the diagonal subplots...
    #if not names:
    #    names = ['x'+str(i) for i in range(numvars)]
    # 
    #for i, label in enumerate(names):
    #    axes[i,i].annotate(label, (0.5, 0.5), xycoords='axes fraction',
    #            ha='center', va='center')

    # Turn on the proper x or y axes ticks.
    #for i, j in zip(range(numvars), itertools.cycle((-1, 0))):
    #    axes[j,i].xaxis.set_visible(True)
    #    axes[i,j].yaxis.set_visible(True)

    # FIX #2: if numvars is odd, the bottom right corner plot doesn't have the
    # correct axes limits, so we pull them from other axes
    #if numvars%2:
    #    xlimits = axes[0,-1].get_xlim()
    #    ylimits = axes[-1,0].get_ylim()
    #    axes[-1,-1].set_xlim(xlimits)
    #    axes[-1,-1].set_ylim(ylimits)

    return fig

if __name__=='__main__':
    np.random.seed(1977)
    data = np.random.random([10] * 4)
    # The marker/linestyle keywords of the original scatterplot demo are
    # line properties that pcolormesh does not accept, so pass a colormap.
    fig = scatterplot_matrix(data, cmap='viridis')
    fig.suptitle('Simple pcolormesh Matrix')
    plt.show()

I saved the above module as datamatrix.py and use it as follows:

import numpy as np
import matplotlib.pyplot as plt
import brewer2mpl
import datamatrix

colors = brewer2mpl.get_map('RdBu', 'Diverging', 11).mpl_colormap
# Negated because the 'RdBu' colormap is the wrong way around
indicator = np.ma.masked_invalid(-np.sign(data1 - data2))
fig = datamatrix.scatterplot_matrix(indicator, cmap = colors)
plt.show()

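The brewer2mpl and colormap parts can be omitted; that was just some coloring I was experimenting with. The result is the following plot:

[Figure: matrix of pcolormesh plots of occurrence for individual parameter values]

The "outer" dimensions of the matrix are the two parameters (the last two dimensions of data1 and data2). Each pcolormesh plot inside the matrix is an "occurrence map" similar to the one in the pcolormesh answer above, but binary for each parameter pair. The white lines at the bottom of some of the plots are regions of equality. The white dots in each upper right corner are nan values in the data.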
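To make the meaning of the indicator values concrete, here is a tiny sketch on made-up 2x2 arrays (not the real data). It shows what -np.sign(data1 - data2) wrapped in np.ma.masked_invalid gives in each of the four cases.

```python
import numpy as np

# Made-up values covering the four cases: less than, equal, greater than, nan
data1 = np.array([[1.0, 2.0],
                  [3.0, np.nan]])
data2 = np.array([[2.0, 2.0],
                  [1.0, 0.0]])

# +1 where data1 < data2, 0 where they are equal, -1 where data1 > data2;
# nan entries are masked so that pcolormesh leaves those cells undrawn
indicator = np.ma.masked_invalid(-np.sign(data1 - data2))

print(indicator)
```

With a diverging colormap such as 'RdBu', the value 0 falls on the pale midpoint of the map, which would explain why regions of equality look white just like the masked nan cells, even though they come from two different mechanisms.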
