Python matplotlib: plot xy data pairs as an image

I have a large data set with the format x,y,value1,value2,... where each value# is the value of that variable at the position x, y. The data is read in from a CSV file, with the xy values in semi-random order. The xy values are not on a rectilinear grid. I have on the order of millions of data points.

What I would like to do is create an image of a value# variable.

Is there a built-in mechanism for doing this? If there is not, how do I build a 2D array of the value# with the correct ordering?

Do you only have single instances of each x and y pair? Are all your value#'s of equal length? If so, it will be a lot easier for you. As far as I know, there is no simple way to tell imshow to do this, but hopefully someone else here knows more about this than I do.

You might need to restructure the data. If you want to work with large datasets, I would learn as much as you can about Python's Pandas package. Like R, it lets you create data frames. I think imshow needs your data shaped as a y-by-x array with your value#'s as the cell values. Here is an example that uses Pandas. There's probably a much more graceful way to go about this, but you should get the point.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame(columns=['x','y','data_value'])
df['x'] = [1,2,1,2]
df['y'] = [1,1,2,2]
df['data_value'] = [1,2,3,4]

print(df) # so you see what's going on

df2 = pd.DataFrame(columns=df['x'].unique(), index = df['y'].unique())

print(df2) # so you see what's going on

# making x columns and y rows
for i in df2.index:
    for j in df2.columns:
        # .ix was removed from pandas; .loc does the same label-based assignment
        df2.loc[i, j] = df[(df['y'] == i) & (df['x'] == j)]['data_value'].values[0]

print(df2)
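
The nested loop above scans the whole frame once per cell, which gets slow with millions of rows. A possibly more graceful route is pandas' `pivot`, sketched below on the same toy data. It assumes every (x, y) pair occurs at most once; the names `grid` and `image` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "x": [1, 2, 1, 2],
    "y": [1, 1, 2, 2],
    "data_value": [1, 2, 3, 4],
})

# Rows are y values, columns are x values, cells hold data_value.
# pivot() raises ValueError if any (x, y) pair appears more than once.
grid = df.pivot(index="y", columns="x", values="data_value")

# A plain float array that can be handed to imshow.
image = grid.to_numpy(dtype=float)
print(image)
```

If duplicate (x, y) pairs are possible, `pivot_table` with an aggregation such as the mean might be the better fit.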

Oh, and to plot this (imshow didn't like the ints here, hence the cast to float):

plt.imshow(np.array(df2.astype(float)))
plt.show()
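
The example above only works when the points already sit on a grid. Since the question says the xy values are not on a rectilinear grid, one option (not the approach used above, just a sketch) is to bin the scattered points into a regular grid with `numpy.histogram2d` and average the values in each cell. The sample data, the grid size `nx`, `ny`, and all variable names here are assumptions for illustration.

```python
import numpy as np

# Hypothetical scattered samples standing in for the CSV columns.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=100_000)
y = rng.uniform(0.0, 5.0, size=100_000)
value = x + 2.0 * y

nx, ny = 200, 100  # assumed image resolution

# Sum of the values and number of samples falling in each cell.
sums, xedges, yedges = np.histogram2d(x, y, bins=(nx, ny), weights=value)
counts, _, _ = np.histogram2d(x, y, bins=(xedges, yedges))

# Mean value per cell; cells with no samples become NaN.
with np.errstate(invalid="ignore", divide="ignore"):
    mean = sums / counts

# histogram2d puts x along axis 0, but imshow expects rows to be y.
image = mean.T
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
```

The result can then be drawn with `plt.imshow(image, origin="lower", extent=extent, aspect="auto")`. Empty cells show up as gaps; if those matter, interpolation (for example `scipy.interpolate.griddata`) would be another route to consider.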
