Cumulative histogram for 2D data in Python

Question

My data consists of a 2-D array of masses and distances. I want to produce a plot where the x-axis is distance and the y-axis is the number of data elements with distance <= x (i.e. a cumulative histogram plot). What is the most efficient way to do this with Python?

PS: the masses are irrelevant since I already have filtered by mass, so all I am trying to produce is a plot using the distance data.

Example plot: (image not included)

Answer 1

You can combine numpy.cumsum() and plt.step():

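import matplotlib.pyplot as plt
import numpy as np

N = 15
# Sample data: increasing distances and a count for each distance
distances = np.random.uniform(1, 4, N).cumsum()
counts = np.random.uniform(0.5, 3, N)
# Plot the running total of the counts against distance
plt.step(distances, counts.cumsum())
plt.show()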
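
If you start from raw distance values rather than per-distance counts, one possible way to get the counts is np.histogram followed by cumsum. This is only a sketch: the sample data, the variable names and the choice of 20 bins are assumptions for illustration, not part of the original question.

```python
import numpy as np

# Hypothetical sample data standing in for the real distance column
rng = np.random.default_rng(0)
raw_distances = rng.uniform(0.0, 50.0, size=200)

# Bin the raw values; the bin count of 20 is an arbitrary choice
counts, bin_edges = np.histogram(raw_distances, bins=20)

# Running total: cumulative[i] is the number of values in bins 0..i
cumulative = counts.cumsum()
```

The result could then be drawn with something like plt.step(bin_edges[1:], cumulative), which pairs each running total with the upper edge of its bin.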

Alternatively, plt.bar can be used to draw a histogram, with the widths defined by the difference between successive distances. Because np.diff returns one fewer value than its input, an extra distance has to be appended to give the last bar a width.

plt.bar(distances, counts.cumsum(), width=np.diff(distances, append=distances[-1]+1), align='edge')
plt.autoscale(enable=True, axis='x', tight=True)  # make x-axis tight

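Instead of appending a value, a value such as zero could be prepended, depending on the exact interpretation of the data. The widths are then negated so that, with align='edge', each bar extends to the left of its distance.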

plt.bar(distances, counts.cumsum(), width=-np.diff(distances, prepend=0), align='edge')
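
To see what the two width calculations produce, here is a small worked example. The four distance values are made up for illustration.

```python
import numpy as np

distances = np.array([2.0, 5.0, 6.0, 10.0])  # made-up values for illustration

# Appending distances[-1] + 1 gives the last bar an arbitrary width of 1
widths_append = np.diff(distances, append=distances[-1] + 1)
print(widths_append)   # [3. 1. 4. 1.]

# Prepending 0 gives the first bar a width equal to the first distance
widths_prepend = np.diff(distances, prepend=0)
print(widths_prepend)  # [2. 3. 1. 4.]
```

In the appended case each bar spans from its distance to the next one; in the prepended case each width is the gap back to the previous distance, which is why the bar call above negates it.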

Answer 2

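This is what I figured out I can do, given a 1D array of data:

import matplotlib.pyplot as plt
import numpy as np

data = np.random.uniform(0, 10, 100)  # placeholder: replace with your 1D distances
plt.figure()
counts = np.ones(len(data))  # one count per element
plt.step(np.sort(data), counts.cumsum())
plt.show()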

This also works with duplicate elements, since each occurrence adds 1 to the running total at the same x.
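
As a cross-check, the quantity from the question (the number of elements with distance <= x) can also be evaluated directly with np.searchsorted on the sorted array. The sample values and the helper name count_at_or_below are hypothetical, so treat this as a sketch rather than part of the answer above.

```python
import numpy as np

data = np.array([3.0, 1.0, 2.0, 2.0, 5.0])  # made-up distances with one duplicate
sorted_data = np.sort(data)

def count_at_or_below(x):
    """Number of elements with distance <= x (hypothetical helper)."""
    # side='right' returns the insertion index after any entries equal to x
    return np.searchsorted(sorted_data, x, side='right')

print(count_at_or_below(2.0))  # 3
print(count_at_or_below(4.9))  # 4
```

When plotting, it may be worth passing where='post' to plt.step, since the default where='pre' draws each level to the left of its x value, so the curve reaches a count before the corresponding distance.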

solution1
2 2020-05-06 13:09:19

solution2
0 ACCEPTED 2020-05-06 15:48:52
