How do I make a scatter plot with these data?

I am trying to make a 2D representation of 3D data in matplotlib.

I have some data files, for example:

a_1.dat
a_2.dat
a_3.dat

b_1.dat
b_2.dat
b_3.dat

From each data file I can extract the letter, the number, and a parameter associated with the letter-number pair.

I am trying to make a scatter plot where one axis is the range of letters, the other axis is the range of numbers, and the scattered points represent the magnitude of the parameter associated with each letter-number pair. I would prefer if this were a 2D plot with a colorbar of some kind, as opposed to a 3D plot.

At this point, I can make a stack of 2d numpy arrays, where each 2d array looks something like

[a 1 val_a1
 a 2 val_a2
 a 3 val_a3]

[b 1 val_b1
 b 2 val_b2
 b 3 val_b3]
  • First question: Is this the best way to store the data for the plot I am trying to make?

  • Second question: How do I make the plot using Python (I am most familiar with matplotlib's pyplot)?

To determine whether your way of storing the data is suitable, you should consider how you will use it. If you only want to use it for plotting as described here, then for the sake of simplicity you can just use three 1D arrays. If, however, you want a tighter structure, you might consider using a 2D array with a custom dtype.
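As a rough illustration of the custom dtype idea, here is a minimal sketch using a NumPy structured array. The field names (`letter`, `number`, `value`) and their types are assumptions made for this example, and the records are stored as a 1D array for brevity.

```python
import numpy as np

# Hypothetical record layout: field names and types are illustrative only
record = np.dtype([("letter", "U1"), ("number", "i4"), ("value", "f8")])

data = np.array(
    [("a", 1, 1.0), ("a", 2, 2.0), ("a", 3, 3.0),
     ("b", 1, 1.5), ("b", 2, 3.5), ("b", 3, 4.5)],
    dtype=record,
)

# Each field can be pulled out as an ordinary 1D array
print(data["letter"])
print(data["value"].max())

# A boolean mask on one field selects whole records
b_rows = data[data["letter"] == "b"]
print(b_rows["number"])
```

The three fields can be passed to the plotting calls in the same way as three separate 1D arrays, so this mainly helps when the records need to be filtered or sorted together.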

With this in mind, you can create a 2D scatter plot in which the color of each point is determined by the value associated with its (letter, number) pair.

import numpy as np
from matplotlib import pyplot as plt
from matplotlib import cm

# You might note that in this simple case using numpy for creating array
# was actually unnecessary as simple lists would suffice
letters = np.array(['a', 'a', 'a', 'b', 'b', 'b'])
numbers = np.array([1, 2, 3, 1, 2, 3])
values = np.array([1, 2, 3, 1.5, 3.5, 4.5])

items = len(letters)

# x and y must be numeric, so plot against integer positions first.
# Parameter c sets the color values and cmap sets the color mapping.
plt.scatter(range(items), numbers, c=values, cmap=cm.jet)

# Now that the points are plotted, relabel the x ticks with the letters
plt.xticks(range(items), letters)

# Add a colorbar for the values and display the figure
plt.colorbar()
plt.show()

(Image: result of the code)
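The code above gives every point its own x position, so each letter appears on the axis once per point. If you would rather have one x position per distinct letter, a possible variant is sketched below. The helper name `scatter_by_letter` and the `viridis` colormap are choices made for this example, not part of the answer above.

```python
import numpy as np
from matplotlib import pyplot as plt


def scatter_by_letter(letters, numbers, values, cmap="viridis"):
    """Scatter plot with one x position per distinct letter, colored by value."""
    letters = np.asarray(letters)
    # unique_letters is sorted; x_pos[i] is the index of letters[i] within it
    unique_letters, x_pos = np.unique(letters, return_inverse=True)

    fig, ax = plt.subplots()
    points = ax.scatter(x_pos, numbers, c=values, cmap=cmap)

    # Label each integer x position with its letter
    ax.set_xticks(np.arange(len(unique_letters)))
    ax.set_xticklabels(unique_letters)
    ax.set_yticks(np.unique(numbers))

    # The colorbar shows how colors map to the parameter values
    fig.colorbar(points, ax=ax, label="parameter value")
    return fig, ax


fig, ax = scatter_by_letter(
    ["a", "a", "a", "b", "b", "b"],
    [1, 2, 3, 1, 2, 3],
    [1, 2, 3, 1.5, 3.5, 4.5],
)
```

Call `plt.show()` afterwards to display the figure, or `fig.savefig(...)` to write it to a file.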

Hopefully, this should be enough for a good start.
