I have an (x, y) dataset, and I would like to calculate r_value**2 for every 10 consecutive elements (so between element 0 and 9, between 1 and 10, ..., between n-10 and n). Ideally the code should report the maximum r_value**2 and save all the r-values in a list. I've written a loop, but I don't know how to tell stats.linregress to look only between test_i and test_i+10, or how to save all the r_value**2 results in a list.
So far, I have this:
import matplotlib.pyplot as plt
from scipy import stats
import numpy as np
import csv
path = '/storage/.../01_python_in/'
test = np.loadtxt(path + 'sample_data.txt', skiprows=0)
test_min = 0
test_max = len(test)
for test_i in range(test_min, test_max-10):
    slope, intercept, r_value, p_value, std_err = stats.linregress(test[:, 0], test[:, 1])
    print('i:', test_i, 'r**2:', r_value**2)
To do this manually, slice the first dimension of your array from test_i to test_i + 10 (the window length, called window below), like this:
stats.linregress(test[test_i:test_i+window, 0], test[test_i:test_i+window, 1])
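Actually, you don't have to split apart the x and y parts for linregress. Given a single two-column array, it splits the columns itself (newer SciPy releases deprecate this one-argument form, so the explicit two-argument call above is the safer choice there):
stats.linregress(test[test_i:test_i+window])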
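To make the slicing concrete, here is a minimal sketch for a single window. The array below is made-up, perfectly linear data standing in for the contents of sample_data.txt, and the names chunk and result are only illustrative:

```python
import numpy as np
from scipy import stats

# Hypothetical stand-in for the loaded data:
# column 0 is x, column 1 is y (exactly linear here, y = 2x + 1).
x = np.arange(20, dtype=float)
test = np.column_stack([x, 2.0 * x + 1.0])

window = 10
test_i = 3  # start index of the window to inspect

# Rows test_i .. test_i + window - 1, then split into the x and y columns.
chunk = test[test_i:test_i + window]
result = stats.linregress(chunk[:, 0], chunk[:, 1])

print(chunk.shape)         # (10, 2)
print(result.rvalue ** 2)  # close to 1.0 for perfectly linear data
```

Slicing only the first dimension keeps both columns, so the window stays a small (10, 2) array that can be passed on column by column.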
You could also save the r_values by building a list in your loop. This, along with the above, is shown here:
window = 10
r_values = []
# + 1 so that the final window (the last 10 rows) is included
for test_i in range(len(test) - window + 1):
    chunk = test[test_i:test_i + window]
    slope, intercept, r_value, p_value, std_err = \
        stats.linregress(chunk[:, 0], chunk[:, 1])
    r_values.append(r_value)
    print('i:', test_i, 'r**2:', r_value**2)
It's actually simple enough for a list comprehension (with window = 10 as above):
r_values = [stats.linregress(test[i:i+window, 0], test[i:i+window, 1]).rvalue for i in range(len(test) - window + 1)]
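If you need this in more than one place, the comprehension can be wrapped in a small helper. This is only a sketch: the function name rolling_r_squared and the noisy test data are made up for illustration, and it counts len(data) - window + 1 windows so that the last rows are included.

```python
import numpy as np
from scipy import stats


def rolling_r_squared(data, window=10):
    """Return r**2 for every run of `window` consecutive rows of an (n, 2) array."""
    data = np.asarray(data, dtype=float)
    n_windows = len(data) - window + 1  # includes the final window
    return np.array([
        stats.linregress(data[i:i + window, 0], data[i:i + window, 1]).rvalue ** 2
        for i in range(n_windows)
    ])


# Made-up data: a linear trend plus some noise.
rng = np.random.default_rng(0)
x = np.arange(30, dtype=float)
y = 0.5 * x + rng.normal(scale=1.0, size=x.size)

r2 = rolling_r_squared(np.column_stack([x, y]), window=10)
print(len(r2))  # 21 windows for 30 rows
```

If the array has fewer rows than the window, the range is empty and the helper returns an empty array rather than raising.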
You can then get the squares with:
r_values = np.asarray(r_values)
r_values2 = r_values**2
And the index i of the window with the largest r**2, plus that maximum itself, with:
max_i = np.argmax(r_values2)
r2_max = r_values2[max_i]
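For long datasets, calling linregress once per window can get slow. As an alternative sketch (not the approach used above), the r-value of a simple linear regression is the Pearson correlation of x and y, so all windows can be computed at once with NumPy. This assumes NumPy 1.20 or later for sliding_window_view, uses made-up data, and would divide by zero if x or y were constant within a window:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

# Made-up data: a noisy sine curve, so some windows are more linear than others.
rng = np.random.default_rng(1)
x = np.arange(40, dtype=float)
y = np.sin(x / 6.0) + rng.normal(scale=0.05, size=x.size)

window = 10
xw = sliding_window_view(x, window)  # shape (n - window + 1, window)
yw = sliding_window_view(y, window)

# Pearson correlation per window: centre each row, then normalise.
xc = xw - xw.mean(axis=1, keepdims=True)
yc = yw - yw.mean(axis=1, keepdims=True)
r = (xc * yc).sum(axis=1) / np.sqrt((xc ** 2).sum(axis=1) * (yc ** 2).sum(axis=1))
r_values2 = r ** 2

max_i = int(np.argmax(r_values2))  # start index of the best window
print('best window: rows', max_i, 'to', max_i + window - 1)
print('max r**2:', r_values2[max_i])
```

Here max_i is the start row of the window, so the best-fitting stretch of data is rows max_i to max_i + window - 1.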