
How to plot a scatter plot that also represents the histogram of y values for each value of x

I have a set of X and Y data points (about 20k) that I would like to represent using a scatter plot.

The data set looks something like this:

x = [1, 1, 2, 1, 2, 1, 1, 2]

y = [3.1, 3.1, 3.1, 1, 2, 3.1, 1, 2]

(Not all values in the actual data set are integers.)

I would like to make a scatter plot in which the color indicates the frequency of a particular value of y for a particular x.

For this, I tried to calculate the histogram of y for each x value, but I always end up with a plot that is wrong. The code I use is shown below:

x = [1, 1, 2, 1, 2, 1, 1, 2]
y = [3.1, 3.1, 3.1, 1, 2, 3.1, 1, 2]
I = []
Y = []
C = []

for i in range(0, len(x)):
    if x[i] not in I:
        I.append(x[i])
        for j in range(0, len(x)):
            if x[i] == x[j]:
                Y.append(y[j])
                u, c = np.unique(Y, return_counts=True)
                C.append(c)
                Y = []

plt.scatter(x, y, s=70, c=C, cmap='RdYlBu', marker='o', edgecolors='black', linewidth=1, alpha=7)
plt.xlabel('x')
plt.ylabel('y')
plt.colorbar()

The final plot looks like this:

[Image: final plot]

It would be really helpful if someone could tell me where I'm making a mistake or how I could achieve this. I'm very new to Python, so more explanation is appreciated.

Thank you in advance. (Also, would it be possible to make dots that have the same value appear repeatedly with the same color?)

Here is code that works for you:

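import numpy as np
import matplotlib.pyplot as plt

x = np.array([1, 1, 2, 1, 2, 1, 1, 2])
y = np.array([3.1, 3.1, 3.1, 1, 2, 3.1, 1, 2])

# One entry per distinct (x, y) pair: its coordinates and how often it occurs
X = []
Y = []
C = []

for i in np.unique(x):
    # All y values that belong to the current x value
    new_y = y[np.where(x == i)]
    # Distinct y values for this x and the number of times each appears
    unique, count = np.unique(new_y, return_counts=True)

    for j in range(len(unique)):
        X.append(i)
        Y.append(unique[j])
        C.append(count[j])

# Color each distinct point by its count
plt.scatter(X, Y, c=C)
plt.colorbar()
plt.show()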

What I do is, for each unique value of x, select the matching y values using NumPy's built-in where function (np.where). After that, my counting is not much different from yours.

Here is the result:

[Image: result]
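
On the side question about drawing repeated points with the same color: one possible sketch, not the only way to do it, is to count each (x, y) pair and give every original point the count of its pair. This assumes that exact equality is the right notion of "same value" for your data (with measured floats you may want to round first). The names point_frequencies and plot_colored_by_frequency are made up for this illustration.

```python
from collections import Counter


def point_frequencies(x, y):
    """Return, for every point, how many times its (x, y) pair occurs."""
    # Count each distinct (x, y) pair once...
    pair_counts = Counter(zip(x, y))
    # ...then look the count up again for every original point,
    # so the result has the same length as x and y.
    return [pair_counts[(xi, yi)] for xi, yi in zip(x, y)]


def plot_colored_by_frequency(x, y):
    """Scatter all points, colored by how often each pair occurs."""
    # Imported here so the counting helper can be used without matplotlib.
    import matplotlib.pyplot as plt

    freq = point_frequencies(x, y)
    fig, ax = plt.subplots()
    points = ax.scatter(x, y, c=freq, s=70, cmap='RdYlBu', edgecolors='black')
    fig.colorbar(points, ax=ax, label='count')
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    plt.show()


x = [1, 1, 2, 1, 2, 1, 1, 2]
y = [3.1, 3.1, 3.1, 1, 2, 3.1, 1, 2]
freq = point_frequencies(x, y)
```

Calling plot_colored_by_frequency(x, y) draws all eight points. Duplicates land on top of each other and share one color, because the list passed to c has exactly one entry per point. The code in the question fails on this requirement: its C list is not the same length as x and y.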
