
Matplotlib - cumulative density plot with Y-axis as cumulative fraction

I can make a cumulative distribution plot (following the question "cumulative distribution plots python"):

import numpy as np
import matplotlib.pyplot as plt

# Some fake data:
data = np.random.randn(1000)

sorted_data = np.sort(data)  # Or data.sort(), if data can be modified

# Cumulative counts:
plt.step(np.concatenate([sorted_data, sorted_data[[-1]]]),
         np.arange(sorted_data.size+1))

plt.show()

However, I would like the Y-axis to show the cumulative fraction, i.e. a value between 0 and 1, instead of the raw count. How can I scale my Y-axis to do this?

Solution

See answer by Ernest below. If using Python 2:

plt.step(np.concatenate([sorted_data, sorted_data[[-1]]]), np.arange(sorted_data.size+1)/float(sorted_data.size))
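The reason for the float() call, as I understand it: in Python 2, dividing integers with / rounds down, so every fraction below 1 would collapse to 0. A small sketch of the difference, run under Python 3 where // plays the role of the old integer division (counts and n are made-up stand-ins, not names from the question):

```python
import numpy as np

counts = np.arange(5)   # stand-in for np.arange(sorted_data.size+1)
n = 4                   # stand-in for sorted_data.size

# Floor division reproduces what Python 2's `/` did with integers
floored = counts // n
# Converting the divisor to float forces true division
fractions = counts / float(n)

print(floored.tolist())    # [0, 0, 0, 0, 1]
print(fractions.tolist())  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

In Python 3 the float() call is harmless but no longer needed.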

No need to overcomplicate things: just divide by the number of data points you have.

import numpy as np
import matplotlib.pyplot as plt

# Some fake data:
data = np.random.randn(1000)

sorted_data = np.sort(data)  # Or data.sort(), if data can be modified

# Cumulative fraction (counts divided by the number of points):
plt.step(np.concatenate([sorted_data, sorted_data[[-1]]]),
         np.arange(sorted_data.size+1)/sorted_data.size)

plt.show()
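If you need this more than once, the same computation can be wrapped in a small helper. This is only a sketch: the name ecdf_steps is made up for illustration, and it assumes Python 3 and a non-empty 1-D input.

```python
import numpy as np

def ecdf_steps(data):
    """Return (x, y) for a step plot of the cumulative fraction.

    Hypothetical helper, not part of NumPy or Matplotlib.
    Assumes `data` is a non-empty 1-D sequence of numbers.
    """
    sorted_data = np.sort(np.asarray(data, dtype=float))
    n = sorted_data.size
    # Repeat the last value so x has n+1 entries, matching y
    x = np.concatenate([sorted_data, sorted_data[[-1]]])
    # Counts 0..n divided by n run from 0 to 1
    y = np.arange(n + 1) / n
    return x, y

x, y = ecdf_steps(np.random.randn(1000))
print(y[0], y[-1])  # 0.0 1.0
```

The returned arrays can be passed straight to plt.step(x, y).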


In general, you could use min-max scaling: subtract the minimum from your values, then divide by the difference between the maximum and minimum values.

y = np.arange(sorted_data.size+1)
# Using min-max scaling
y = (y - np.min(y)) / (np.max(y) - np.min(y))

Since the minimum of y in this case is 0, that's the same as dividing by the maximum of your y-values.

y = np.arange(sorted_data.size+1)  # start again from the raw cumulative counts
plt.step(np.concatenate([sorted_data, sorted_data[[-1]]]),
         y / np.max(y))
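As a quick sanity check, here is a sketch comparing the scalings numerically. The names min_max, by_max and by_size are just illustrative; the data is the same kind of fake data as above.

```python
import numpy as np

sorted_data = np.sort(np.random.randn(1000))
y = np.arange(sorted_data.size + 1)  # raw cumulative counts 0..1000

# Min-max scaling, dividing by the maximum, and dividing by the sample size
min_max = (y - np.min(y)) / (np.max(y) - np.min(y))
by_max = y / np.max(y)
by_size = y / sorted_data.size

# All three agree because min(y) is 0 and max(y) equals the number of points
print(np.allclose(min_max, by_max), np.allclose(by_max, by_size))  # True True
```

So for this particular y, all three approaches give the same curve; min-max scaling only makes a difference when the minimum is not 0.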


 